Quantized with GPT-QModel.

| Metric | MARLIN |
|--------------------------------|----------|
| arc_challenge :: acc,none | 0.5154 |
| arc_challenge :: acc_norm,none | 0.5350 |
| mmlu :: acc,none | 0.6325 |

| Tasks |Version|Filter|n-shot| Metric | |Value | |Stderr|
|----------------------------------------|------:|------|-----:|--------|---|-----:|---|-----:|
|arc_challenge | 1|none | 0|acc |↑ |0.5154|± |0.0146|
| | |none | 0|acc_norm|↑ |0.5350|± |0.0146|
|mmlu | 2|none | |acc |↑ |0.6325|± |0.0038|
|mmlu_humanities | 2|none | |acc |↑ |0.5581|± |0.0066|
|mmlu_formal_logic | 1|none | 0|acc |↑ |0.3889|± |0.0436|
|mmlu_high_school_european_history | 1|none | 0|acc |↑ |0.8303|± |0.0293|
|mmlu_high_school_us_history | 1|none | 0|acc |↑ |0.8431|± |0.0255|
|mmlu_high_school_world_history | 1|none | 0|acc |↑ |0.8608|± |0.0225|
|mmlu_international_law | 1|none | 0|acc |↑ |0.7521|± |0.0394|
|mmlu_jurisprudence | 1|none | 0|acc |↑ |0.7130|± |0.0437|
|mmlu_logical_fallacies | 1|none | 0|acc |↑ |0.7853|± |0.0323|
|mmlu_moral_disputes | 1|none | 0|acc |↑ |0.6647|± |0.0254|
|mmlu_moral_scenarios | 1|none | 0|acc |↑ |0.2615|± |0.0147|
|mmlu_philosophy | 1|none | 0|acc |↑ |0.7138|± |0.0257|
|mmlu_prehistory | 1|none | 0|acc |↑ |0.6975|± |0.0256|
|mmlu_professional_law | 1|none | 0|acc |↑ |0.4700|± |0.0127|
|mmlu_world_religions | 1|none | 0|acc |↑ |0.7895|± |0.0313|
|mmlu_other | 2|none | |acc |↑ |0.6968|± |0.0080|
|mmlu_business_ethics | 1|none | 0|acc |↑ |0.6900|± |0.0465|
|mmlu_clinical_knowledge | 1|none | 0|acc |↑ |0.7094|± |0.0279|
|mmlu_college_medicine | 1|none | 0|acc |↑ |0.6590|± |0.0361|
|mmlu_global_facts | 1|none | 0|acc |↑ |0.4100|± |0.0494|
|mmlu_human_aging | 1|none | 0|acc |↑ |0.6861|± |0.0311|
|mmlu_management | 1|none | 0|acc |↑ |0.8058|± |0.0392|
|mmlu_marketing | 1|none | 0|acc |↑ |0.8889|± |0.0206|
|mmlu_medical_genetics | 1|none | 0|acc |↑ |0.6300|± |0.0485|
|mmlu_miscellaneous | 1|none | 0|acc |↑ |0.7854|± |0.0147|
|mmlu_nutrition | 1|none | 0|acc |↑ |0.6993|± |0.0263|
|mmlu_professional_accounting | 1|none | 0|acc |↑ |0.5142|± |0.0298|
|mmlu_professional_medicine | 1|none | 0|acc |↑ |0.7059|± |0.0277|
|mmlu_virology | 1|none | 0|acc |↑ |0.4819|± |0.0389|
|mmlu_social_sciences | 2|none | |acc |↑ |0.7520|± |0.0076|
|mmlu_econometrics | 1|none | 0|acc |↑ |0.4737|± |0.0470|
|mmlu_high_school_geography | 1|none | 0|acc |↑ |0.8687|± |0.0241|
|mmlu_high_school_government_and_politics| 1|none | 0|acc |↑ |0.8756|± |0.0238|
|mmlu_high_school_macroeconomics | 1|none | 0|acc |↑ |0.7256|± |0.0226|
|mmlu_high_school_microeconomics | 1|none | 0|acc |↑ |0.8193|± |0.0250|
|mmlu_high_school_psychology | 1|none | 0|acc |↑ |0.8587|± |0.0149|
|mmlu_human_sexuality | 1|none | 0|acc |↑ |0.7176|± |0.0395|
|mmlu_professional_psychology | 1|none | 0|acc |↑ |0.6520|± |0.0193|
|mmlu_public_relations | 1|none | 0|acc |↑ |0.6364|± |0.0461|
|mmlu_security_studies | 1|none | 0|acc |↑ |0.7102|± |0.0290|
|mmlu_sociology | 1|none | 0|acc |↑ |0.7662|± |0.0299|
|mmlu_us_foreign_policy | 1|none | 0|acc |↑ |0.8200|± |0.0386|
|mmlu_stem | 2|none | |acc |↑ |0.5636|± |0.0085|
|mmlu_abstract_algebra | 1|none | 0|acc |↑ |0.3400|± |0.0476|
|mmlu_anatomy | 1|none | 0|acc |↑ |0.6148|± |0.0420|
|mmlu_astronomy | 1|none | 0|acc |↑ |0.6513|± |0.0388|
|mmlu_college_biology | 1|none | 0|acc |↑ |0.7778|± |0.0348|
|mmlu_college_chemistry | 1|none | 0|acc |↑ |0.4500|± |0.0500|
|mmlu_college_computer_science | 1|none | 0|acc |↑ |0.5500|± |0.0500|
|mmlu_college_mathematics | 1|none | 0|acc |↑ |0.3800|± |0.0488|
|mmlu_college_physics | 1|none | 0|acc |↑ |0.3824|± |0.0484|
|mmlu_computer_security | 1|none | 0|acc |↑ |0.7900|± |0.0409|
|mmlu_conceptual_physics | 1|none | 0|acc |↑ |0.6383|± |0.0314|
|mmlu_electrical_engineering | 1|none | 0|acc |↑ |0.5931|± |0.0409|
|mmlu_elementary_mathematics | 1|none | 0|acc |↑ |0.4550|± |0.0256|
|mmlu_high_school_biology | 1|none | 0|acc |↑ |0.8000|± |0.0228|
|mmlu_high_school_chemistry | 1|none | 0|acc |↑ |0.6158|± |0.0342|
|mmlu_high_school_computer_science | 1|none | 0|acc |↑ |0.6600|± |0.0476|
|mmlu_high_school_mathematics | 1|none | 0|acc |↑ |0.3519|± |0.0291|
|mmlu_high_school_physics | 1|none | 0|acc |↑ |0.4834|± |0.0408|
|mmlu_high_school_statistics | 1|none | 0|acc |↑ |0.5741|± |0.0337|
|mmlu_machine_learning | 1|none | 0|acc |↑ |0.4821|± |0.0474|

| Groups |Version|Filter|n-shot|Metric| |Value | |Stderr|
|--------------------|------:|------|------|------|---|-----:|---|-----:|
|mmlu | 2|none | |acc |↑ |0.6325|± |0.0038|
|mmlu_humanities | 2|none | |acc |↑ |0.5581|± |0.0066|
|mmlu_other | 2|none | |acc |↑ |0.6968|± |0.0080|
|mmlu_social_sciences| 2|none | |acc |↑ |0.7520|± |0.0076|
|mmlu_stem | 2|none | |acc |↑ |0.5636|± |0.0085|
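
The Stderr column can be turned into an approximate confidence interval when comparing these scores against another run. The snippet below is a minimal sketch, assuming the reported stderr is a standard error of the mean and that a normal approximation (value ± 1.96 × stderr for roughly 95% coverage) is acceptable. The helper name `confidence_interval` is hypothetical and not part of any evaluation tool; the numbers are copied from the Groups table above.

```python
from typing import Dict, Tuple


def confidence_interval(value: float, stderr: float, z: float = 1.96) -> Tuple[float, float]:
    """Return (low, high) for a normal-approximation interval: value ± z * stderr."""
    half_width = z * stderr
    return value - half_width, value + half_width


# (accuracy, stderr) pairs copied from the Groups table
groups: Dict[str, Tuple[float, float]] = {
    "mmlu": (0.6325, 0.0038),
    "mmlu_humanities": (0.5581, 0.0066),
    "mmlu_other": (0.6968, 0.0080),
    "mmlu_social_sciences": (0.7520, 0.0076),
    "mmlu_stem": (0.5636, 0.0085),
}

for name, (acc, se) in groups.items():
    low, high = confidence_interval(acc, se)
    print(f"{name}: {acc:.4f} (approx. 95% CI {low:.4f} to {high:.4f})")
```

Subtask rows with large standard errors (for example those near ±0.05) yield wide intervals under this approximation, so small differences on individual subtasks should be read with caution.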