Model Card for Model ID

Model Details

Model Description

This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card was generated automatically.

Developed by: [More Information Needed]
Funded by [optional]: [More Information Needed]
Shared by [optional]: [More Information Needed]
Model type: [More Information Needed]
Language(s) (NLP): [More Information Needed]
License: [More Information Needed]
Finetuned from model [optional]: [More Information Needed]

Model Sources [optional]

Repository: [More Information Needed]
Paper [optional]: [More Information Needed]
Demo [optional]: [More Information Needed]

Uses

Direct Use

[More Information Needed]

Downstream Use [optional]

[More Information Needed]

Out-of-Scope Use

[More Information Needed]

Bias, Risks, and Limitations

[More Information Needed]

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information is needed before further recommendations can be made.

How to Get Started with the Model

Use the code below to get started with the model.

[More Information Needed]

Training Details

Training Data

[More Information Needed]

Training Procedure

Preprocessing [optional]

[More Information Needed]

Training Hyperparameters

Training regime: [More Information Needed]

Speeds, Sizes, Times [optional]

[More Information Needed]

Evaluation

Testing Data, Factors & Metrics

Testing Data

[More Information Needed]

Factors

[More Information Needed]

Metrics

[More Information Needed]

Results

[More Information Needed]

Summary

Model Examination [optional]

[More Information Needed]

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

Hardware Type: [More Information Needed]
Hours used: [More Information Needed]
Cloud Provider: [More Information Needed]
Compute Region: [More Information Needed]
Carbon Emitted: [More Information Needed]
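
As a rough illustration of the kind of estimate the calculator produces, emissions can be approximated as the energy consumed multiplied by the carbon intensity of the local grid. The sketch below is a minimal approximation under that assumption, not the calculator's actual implementation. The function name `estimate_co2_kg` and all input numbers are hypothetical placeholders, not measurements for this model.

```python
def estimate_co2_kg(power_watts, hours, carbon_intensity_kg_per_kwh):
    """Approximate emissions in kg CO2eq as energy (kWh) x grid carbon intensity."""
    energy_kwh = (power_watts / 1000.0) * hours
    return energy_kwh * carbon_intensity_kg_per_kwh


# Illustrative placeholder inputs only (NOT values for this model):
# a 300 W accelerator running for 10 hours on a 0.4 kg CO2eq/kWh grid.
example = estimate_co2_kg(power_watts=300, hours=10, carbon_intensity_kg_per_kwh=0.4)
print(f"Estimated emissions: {example:.2f} kg CO2eq")
```

Once Hardware Type, Hours used, and Compute Region are filled in above, the corresponding power draw and regional carbon intensity can be substituted for the placeholders.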

Technical Specifications [optional]

Model Architecture and Objective

[More Information Needed]

Compute Infrastructure

[More Information Needed]

Hardware

[More Information Needed]

Software

[More Information Needed]

Citation [optional]

BibTeX:

[More Information Needed]

APA:

[More Information Needed]

Glossary [optional]

[More Information Needed]

More Information [optional]

[More Information Needed]

Model Card Authors [optional]

[More Information Needed]

Model Card Contact

[More Information Needed]

AGIEval

Task	Version	Metric	Value		Stderr
agieval_aqua_rat	0	acc	24.02	±	2.69
agieval_aqua_rat	0	acc_norm	24.02	±	2.69
agieval_logiqa_en	0	acc	23.20	±	1.66
agieval_logiqa_en	0	acc_norm	24.42	±	1.69
agieval_lsat_ar	0	acc	18.26	±	2.55
agieval_lsat_ar	0	acc_norm	18.70	±	2.58
agieval_lsat_lr	0	acc	22.35	±	1.85
agieval_lsat_lr	0	acc_norm	23.53	±	1.88
agieval_lsat_rc	0	acc	20.82	±	2.48
agieval_lsat_rc	0	acc_norm	20.07	±	2.45
agieval_sat_en	0	acc	32.52	±	3.27
agieval_sat_en	0	acc_norm	32.52	±	3.27
agieval_sat_en_without_passage	0	acc	25.73	±	3.05
agieval_sat_en_without_passage	0	acc_norm	24.27	±	2.99
agieval_sat_math	0	acc	25.00	±	2.93
agieval_sat_math	0	acc_norm	20.91	±	2.75
Average: 24.11

GPT4All

Task	Version	Metric	Value		Stderr
arc_challenge	0	acc	21.77	±	1.21
arc_challenge	0	acc_norm	24.15	±	1.25
arc_easy	0	acc	37.37	±	0.99
arc_easy	0	acc_norm	36.95	±	0.99
boolq	1	acc	65.60	±	0.83
hellaswag	0	acc	34.54	±	0.47
hellaswag	0	acc_norm	40.54	±	0.49
openbookqa	0	acc	15.00	±	1.59
openbookqa	0	acc_norm	27.40	±	2.00
piqa	0	acc	60.88	±	1.14
piqa	0	acc_norm	60.55	±	1.14
winogrande	0	acc	50.91	±	1.41
Average: 40.01
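
When reading these tables, the standard error column can be converted into an approximate 95% confidence interval. The sketch below assumes a normal approximation (value ± 1.96 × standard error), which is a common convention and not something the evaluation output itself states. The helper name `ci95` is hypothetical, and the example row is the boolq accuracy reported above.

```python
def ci95(value, stderr):
    """Approximate 95% confidence interval under a normal approximation."""
    half_width = 1.96 * stderr
    return value - half_width, value + half_width


# boolq accuracy from the table above: 65.60 with a standard error of 0.83.
low, high = ci95(65.60, 0.83)
print(f"boolq acc: 65.60, approx. 95% CI [{low:.2f}, {high:.2f}]")
```

Intervals this wide mean that small differences between scores on the same task should be interpreted with caution.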

BigBench

Task	Version	Metric	Value	Stderr
bigbench_causal_judgement	0	MCG	50	2.26
bigbench_date_understanding	0	MCG	49.14	2.18
bigbench_disambiguation_qa	0	MCG	49.31	2.74
bigbench_geometric_shapes	0	MCG	14.18	1.37
bigbench_logical_deduction_5objs	0	MCG	49.41	2.73
bigbench_logical_deduction_7objs	0	MCG	41.48	2.46
bigbench_logical_deduction_3objs	0	MCG	69.33	2.75
bigbench_movie_recommendation	0	MCG	51.71	2.25
bigbench_navigate	0	MCG	50	1.58
bigbench_reasoning_colored_obj	0	MCG	51.92	0.99
bigbench_ruin_names	0	MCG	48.14	2.01
bigbench_salient_trans_err_detec	0	MCG	39.92	1.2
bigbench_snarks	0	MCG	64.14	3.71
bigbench_sports_understanding	0	MCG	55.31	1.59
bigbench_temporal_sequences	0	MCG	46.92	1.4
bigbench_tsk_shuff_objs_5	0	MCG	25.04	1.01
bigbench_tsk_shuff_objs_7	0	MCG	15.04	0.72
bigbench_tsk_shuff_objs_3	0	MCG	55.33	2.75
Average: 44.75

TruthfulQA

Task	Version	Metric	Value	Stderr
truthfulqa_mc	1	mc1	30.11	1.61
truthfulqa_mc	1	mc2	47.69	1.61
Average: 38.90
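
The TruthfulQA average above is consistent with a plain arithmetic mean of the mc1 and mc2 values. A minimal sketch that reproduces it, assuming that is how the figure was derived (the helper name `mean_score` is hypothetical):

```python
def mean_score(values):
    """Plain arithmetic mean of a list of benchmark scores."""
    return sum(values) / len(values)


# mc1 and mc2 values taken from the TruthfulQA table above.
truthfulqa = {"mc1": 30.11, "mc2": 47.69}
average = mean_score(list(truthfulqa.values()))
print(f"Average: {average:.2f}")  # prints "Average: 38.90"
```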

Open LLM Benchmark

Task	Version	Metric	Value		Stderr
arc_challenge	0	acc	40.44	±	1.43
		acc_norm	43.81	±	1.34
hellaswag	0	acc	48.1	±	0.45
		acc_norm	62.73	±	0.32
gsm8k	0	acc	5.6	±	0.6
winogrande	0	acc	60.91	±	1.3
mmlu	0	acc	37.62	±	0.6

Average: 73.5%

TruthfulQA

Task	Version	Metric	Value		Stderr
truthfulqa_mc	1	mc1	29.00	±	1.58
		mc2	45.83	±	1.59