joshuasundance
/

phi3-mini-4k-qlora-python-code-20k-mypo-4k-rfc

@@ -1,41 +1,45 @@
 ---
 library_name: transformers
-tags: []
 ---
 # Model Card for Model ID
-<!-- Provide a quick summary of what the model is/does. -->
 ## Model Details
 ### Model Description
-<!-- Provide a longer summary of what this model is. -->
 This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
-- **Developed by:** [More Information Needed]
-- **Funded by [optional]:** [More Information Needed]
-- **Shared by [optional]:** [More Information Needed]
-- **Model type:** [More Information Needed]
-- **Language(s) (NLP):** [More Information Needed]
-- **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
 ### Model Sources [optional]
-<!-- Provide the basic links for the model. -->
-- **Repository:** [More Information Needed]
-- **Paper [optional]:** [More Information Needed]
-- **Demo [optional]:** [More Information Needed]
 ## Uses
-<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
 ### Direct Use
@@ -77,38 +81,36 @@ Use the code below to get started with the model.
 ### Training Data
-<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-[More Information Needed]
 ### Training Procedure
-<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
 #### Preprocessing [optional]
-[More Information Needed]
 #### Training Hyperparameters
-- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
 #### Speeds, Sizes, Times [optional]
-<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
 [More Information Needed]
 ## Evaluation
-<!-- This section describes the evaluation protocols and provides the results. -->
 ### Testing Data, Factors & Metrics
 #### Testing Data
-<!-- This should link to a Dataset Card if possible. -->
 [More Information Needed]
@@ -192,8 +194,8 @@ Carbon emissions can be estimated using the [Machine Learning Impact calculator]
 ## Model Card Authors [optional]
-[More Information Needed]
 ## Model Card Contact
-[More Information Needed]

 ---
 library_name: transformers
+tags:
+- phi3
+- python
+- dpo
+- mypo
+license: mit
+datasets:
+- joshuasundance/mypo-4k-rfc
+language:
+- en
 ---
 # Model Card for Model ID
+This is an experimental model made by using [https://huggingface.co/datasets/joshuasundance/mypo-4k-rfc](joshuasundance/mypo-4k-rfc) for DPO training of [https://huggingface.co/edumunozsala/phi3-mini-4k-qlora-python-code-20k](edumunozsala/phi3-mini-4k-qlora-python-code-20k).
+The goal is to learn about model training and potentially get the base model to reliably produce Python with type hints.
+I chose `edumunozsala/phi3-mini-4k-qlora-python-code-20k` because I was able to train this model in one hour on my laptop.
 ## Model Details
 ### Model Description
 This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
+- **Developed by:** Joshua Sundance Bailey
+- **Model type:** phi 3 qlora DPO
+- **Language(s) (NLP):** English
+- **License:** MIT
+- **Finetuned from model [optional]:** [https://huggingface.co/edumunozsala/phi3-mini-4k-qlora-python-code-20k](edumunozsala/phi3-mini-4k-qlora-python-code-20k)
 ### Model Sources [optional]
+- **Training Code:** https://gist.github.com/joshuasundance-swca/a94672960733782865932a645587ccdc
 ## Uses
+For evaluation and testing only. Do not expect great results, and do not use this model for anything important. It has not been evaluated in any way after training.
 ### Direct Use
 ### Training Data
+* Original qlora: [https://huggingface.co/datasets/iamtarun/python_code_instructions_18k_alpaca](iamtarun/python_code_instructions_18k_alpaca)
+* DPO: [https://huggingface.co/datasets/joshuasundance/mypo-4k-rfc](joshuasundance/mypo-4k-rfc)
 ### Training Procedure
+See [https://gist.github.com/joshuasundance-swca/a94672960733782865932a645587ccdc](training code) using `peft`, `transformers`, and `trl`
 #### Preprocessing [optional]
+See [https://gist.github.com/joshuasundance-swca/a94672960733782865932a645587ccdc](training code) using `peft`, `transformers`, and `trl`
 #### Training Hyperparameters
+See [https://gist.github.com/joshuasundance-swca/a94672960733782865932a645587ccdc](training code) using `peft`, `transformers`, and `trl`
 #### Speeds, Sizes, Times [optional]
+See [https://huggingface.co/joshuasundance/phi3-mini-4k-qlora-python-code-20k-mypo-4k-rfc/blob/main/trainer_state.json](trainer_state.json) in this repo
 [More Information Needed]
 ## Evaluation
+See [https://huggingface.co/joshuasundance/phi3-mini-4k-qlora-python-code-20k-mypo-4k-rfc/blob/main/trainer_state.json](trainer_state.json) in this repo
 ### Testing Data, Factors & Metrics
 #### Testing Data
+20% of DPO dataset (see training code)
 [More Information Needed]
 ## Model Card Authors [optional]
+Joshua Sundance Bailey
 ## Model Card Contact
+Joshua Sundance Bailey