gair-prox/web-doc-refining-lm


koalazf99 committed on Oct 10, 2024

Commit 4b84804 (verified) · 1 Parent(s): 5d48286

Update README.md

Files changed (1)

README.md +38 -3


@@ -1,3 +1,38 @@
----
-license: apache-2.0
----

+---
+license: apache-2.0
+datasets:
+- gair-prox/RedPajama-pro
+language:
+- en
+base_model:
+- gair-prox/RedPJ-ProX-0.3B
+pipeline_tag: text-generation
+library_name: transformers
+tags:
+- llama
+- code
+---
+# Web-doc-refining-lm
+<p align="center">
+  <img src="prox-teaser.png">
+</p>
+[ArXiv](http://arxiv.org/abs/2409.17115) | [Code](https://github.com/GAIR-NLP/program-every-example)
+**Web-doc-refining-lm** is an adapted [0.3B-ProX](https://huggingface.co/gair-prox/RedPJ-ProX-0.3B) model, fine-tuned for document-level refinement via program generation.
+<p align="center">
+  <img src="func_design.png">
+</p>
+### Citation
+```
+@article{zhou2024programming,
+  title={Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale},
+  author={Zhou, Fan and Wang, Zengzhi and Liu, Qian and Li, Junlong and Liu, Pengfei},
+  journal={arXiv preprint arXiv:2409.17115},
+  year={2024}
+}
+```
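
The README added in this commit says the model refines documents "via program generation" but does not show what a generated program looks like or how it is applied. The following is a minimal sketch of the consuming side only, under loud assumptions: the call names `keep_doc()` and `drop_doc()` are not stated in this README and are assumed here for illustration, and `apply_doc_program` is a hypothetical helper, not part of the ProX codebase. It does not load or run the model.

```python
import re
from typing import Optional

# ASSUMPTION: the model's output contains one of these two calls.
# The README in this commit does not list the program vocabulary.
_CALL_RE = re.compile(r"\b(keep_doc|drop_doc)\(\s*\)")


def apply_doc_program(document: str, program: str) -> Optional[str]:
    """Apply a generated document-level program to a document.

    Returns the document unchanged if the program keeps it,
    or None if the program drops it.
    """
    match = _CALL_RE.search(program)
    if match is None:
        # Unparseable output: keep the document rather than lose data.
        return document
    return document if match.group(1) == "keep_doc" else None


if __name__ == "__main__":
    docs = ["A clear tutorial on sorting.", "click here click here click here"]
    programs = ["keep_doc()", "drop_doc()"]  # stand-ins for model output
    kept = [d for d, p in zip(docs, programs)
            if apply_doc_program(d, p) is not None]
    print(kept)
```

Falling back to keeping the document when the output cannot be parsed is a design choice of this sketch; a stricter pipeline might instead log or discard such cases.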