Update README.md
README.md CHANGED
@@ -14,11 +14,14 @@ language:
 
 # Model Card for Model ID
 
-
+* **Base Model**: https://huggingface.co/edumunozsala/phi3-mini-4k-qlora-python-code-20k
+* **Preference Dataset**: https://huggingface.co/datasets/joshuasundance/mypo-4k-rfc
+* **Training Code**: https://gist.github.com/joshuasundance-swca/a94672960733782865932a645587ccdc
+* **Training Metrics**: [trainer_state.json](trainer_state.json)
 
-
+This is an experimental model made by using `joshuasundance/mypo-4k-rfc` for DPO training of `edumunozsala/phi3-mini-4k-qlora-python-code-20k`.
 
-I chose `edumunozsala/phi3-mini-4k-qlora-python-code-20k` because I was able to train this model in one hour on my laptop.
+The goal is to learn about model training and potentially get the base model to reliably produce Python with type hints. I chose `edumunozsala/phi3-mini-4k-qlora-python-code-20k` because I was able to train this model in one hour on my laptop.
 
 
 ## Model Details

@@ -31,7 +34,7 @@ This is the model card of a 🤗 transformers model that has been pushed on the
 - **Model type:** phi 3 qlora DPO
 - **Language(s) (NLP):** English
 - **License:** MIT
-- **Finetuned from model [optional]:**
+- **Finetuned from model [optional]:** `edumunozsala/phi3-mini-4k-qlora-python-code-20k`
 
 ### Model Sources [optional]
 

@@ -81,30 +84,30 @@ Use the code below to get started with the model.
 
 ### Training Data
 
-* Original qlora:
-* DPO:
+* Original qlora: `iamtarun/python_code_instructions_18k_alpaca`
+* DPO: `joshuasundance/mypo-4k-rfc`
 
 ### Training Procedure
 
-See
+See the [training code](https://gist.github.com/joshuasundance-swca/a94672960733782865932a645587ccdc), which uses `peft`, `transformers`, and `trl`.
 
 #### Preprocessing [optional]
 
-See
+See the [training code](https://gist.github.com/joshuasundance-swca/a94672960733782865932a645587ccdc), which uses `peft`, `transformers`, and `trl`.
 
 #### Training Hyperparameters
 
-See
+See the [training code](https://gist.github.com/joshuasundance-swca/a94672960733782865932a645587ccdc), which uses `peft`, `transformers`, and `trl`.
 
 #### Speeds, Sizes, Times [optional]
 
-See [
+See [trainer_state.json](trainer_state.json) in this repo.
 
 [More Information Needed]
 
 ## Evaluation
 
-See [
+See [trainer_state.json](trainer_state.json) in this repo.
 
 ### Testing Data, Factors & Metrics
 
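The new card defers to the linked gist for the training procedure. For orientation, here is a minimal sketch of what DPO training along these lines can look like with `trl`'s `DPOTrainer` on top of `peft` and `transformers`. Everything below is an assumption rather than the actual script: the LoRA settings, hyperparameters, and output path are illustrative, the dataset is assumed to carry the `prompt`, `chosen`, and `rejected` columns that `DPOTrainer` expects, and the `beta` keyword follows the older `trl` API (newer releases move it into `DPOConfig`).

```python
# Minimal DPO sketch, NOT the exact training script (see the linked gist).
# All hyperparameters below are illustrative assumptions.
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

base_id = "edumunozsala/phi3-mini-4k-qlora-python-code-20k"

tokenizer = AutoTokenizer.from_pretrained(base_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, trust_remote_code=True
)

# Preference pairs; assumed to expose "prompt", "chosen", and "rejected".
dataset = load_dataset("joshuasundance/mypo-4k-rfc", split="train")

peft_config = LoraConfig(  # illustrative LoRA settings, not the ones used
    r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM"
)

args = TrainingArguments(  # illustrative hyperparameters
    output_dir="phi3-mini-dpo",  # hypothetical path
    per_device_train_batch_size=1,
    gradient_accumulation_steps=4,
    learning_rate=5e-5,
    num_train_epochs=1,
    logging_steps=10,
    bf16=True,
)

trainer = DPOTrainer(
    model,
    ref_model=None,  # with a peft_config, trl derives the reference model itself
    args=args,
    beta=0.1,  # DPO temperature (older trl API; newer versions use DPOConfig)
    train_dataset=dataset,
    tokenizer=tokenizer,
    peft_config=peft_config,
)
trainer.train()
trainer.save_model()
```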
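The card points at [trainer_state.json](trainer_state.json) for speeds, sizes, times, and evaluation. That file is the standard `transformers` `Trainer` state dump, so the logged metrics can be pulled out in a few lines; exactly which DPO metrics appear (for example `rewards/margins`) depends on the `trl` version.

```python
# Inspect the logged training metrics from trainer_state.json
# (standard transformers Trainer format: a "log_history" list of dicts).
import json

with open("trainer_state.json") as f:
    state = json.load(f)

for entry in state["log_history"]:
    if "loss" in entry:  # training log entries carry step and loss
        print(f"step {entry['step']:>5}  loss {entry['loss']:.4f}")
```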
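Since the surrounding card section says "Use the code below to get started with the model", a hypothetical inference sketch follows. The repo id is a placeholder and the plain-string prompt is an assumption; the real instruction format should be taken from the base model and the gist.

```python
# Hypothetical inference sketch; replace model_id with this repo's Hub id,
# and adapt the prompt to the model's actual instruction format.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-username/your-model-id"  # placeholder, not a real repo

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

prompt = "Write a Python function that reverses a string. Use type hints."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Decode only the newly generated tokens, skipping the echoed prompt.
print(
    tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
)
```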