Add pipeline tag and transformers library (#2)
- Add pipeline tag and transformers library (494c2c425aad2c656f116760d17ecfe1d36f3ddf)
Co-authored-by: Niels Rogge <[email protected]>
    	
README.md CHANGED
@@ -1,20 +1,20 @@
 ---
-license: apache-2.0
-language:
-- en
 base_model:
 - sfairXC/FsfairX-LLaMA3-RM-v0.1
+language:
+- en
+license: apache-2.0
 tags:
 - reward model
 - fine-grained
+pipeline_tag: text-ranking
+library_name: transformers
 ---
 
 # MDCureRM
 
-
 [📄 Paper](https://arxiv.org/pdf/2410.23463) | [🤗 HF Collection](https://huggingface.co/collections/yale-nlp/mdcure-6724914875e87f41e5445395) | [⚙️ GitHub Repo](https://github.com/yale-nlp/MDCure)
 
-
 ## Introduction
 
 **MDCure** is an effective and scalable procedure for generating high-quality multi-document (MD) instruction tuning data to improve MD capabilities of LLMs. Using MDCure, we construct a suite of MD instruction datasets complementary to collections such as [FLAN](https://github.com/google-research/FLAN) and fine-tune a variety of already instruction-tuned LLMs from the FlanT5, Qwen2, and LLAMA3.1 model families, up to 70B parameters in size. We additionally introduce **MDCureRM**, an evaluator model specifically designed for the MD setting to filter and select high-quality MD instruction data in a cost-effective, RM-as-a-judge fashion. Extensive evaluations on a wide range of MD and long-context benchmarks spanning various tasks show MDCure consistently improves performance over pre-trained baselines and over corresponding base models by up to 75.5%.
@@ -113,10 +113,16 @@ reward_weights = torch.tensor([1/9, 1/9, 1/9, 2/9, 2/9, 2/9], device="cuda")
 source_text_1 = ...
 source_text_2 = ...
 source_text_3 = ...
-context = f"{source_text_1}
+context = f"{source_text_1}
+
+{source_text_2}
+
+{source_text_3}"
 instruction = "What happened in CHAMPAIGN regarding Lovie Smith and the 2019 defense improvements? Respond with 1-2 sentences."
 
-input_text = f"Instruction: {instruction}
+input_text = f"Instruction: {instruction}
+
+{context}"
 tokenized_input = tokenizer(
                         input_text, 
                         return_tensors='pt', 
@@ -141,7 +147,7 @@ Beyond MDCureRM, we open-source our best MDCure'd models at the following links:
 | **MDCure-Qwen2-1.5B-Instruct**    | [🤗 HF Repo](https://huggingface.co/yale-nlp/MDCure-Qwen2-1.5B-Instruct) | **Qwen2-1.5B-Instruct** fine-tuned with MDCure-72k  |
 | **MDCure-Qwen2-7B-Instruct**      | [🤗 HF Repo](https://huggingface.co/yale-nlp/MDCure-Qwen2-7B-Instruct) | **Qwen2-7B-Instruct** fine-tuned with MDCure-72k    |
 | **MDCure-LLAMA3.1-8B-Instruct**   | [🤗 HF Repo](https://huggingface.co/yale-nlp/MDCure-LLAMA3.1-8B-Instruct) | **LLAMA3.1-8B-Instruct** fine-tuned with MDCure-72k  |
-| **MDCure-LLAMA3.1-70B-Instruct**  | [🤗 HF Repo](https://huggingface.co/yale-nlp/MDCure-LLAMA3.1-70B-Instruct) | **LLAMA3.1-70B-Instruct** fine-tuned with MDCure-
+| **MDCure-LLAMA3.1-70B-Instruct**  | [🤗 HF Repo](https://huggingface.co/yale-nlp/MDCure-LLAMA3.1-70B-Instruct) | **LLAMA3.1-70B-Instruct** fine-tuned with MDCure-72k |
 
 ## Citation
 
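Note on the first hunk: `pipeline_tag: text-ranking` and `library_name: transformers` only touch the model card's YAML front matter, but they determine how the Hub indexes the repo and which library its widget and auto-generated snippets assume. One way to confirm the new metadata after the commit is merged is to query the Hub, as in the sketch below. It uses the standard `huggingface_hub.model_info` call; the repo id `yale-nlp/MDCureRM` is inferred from the card title and collection, and the expected values come from this diff rather than verified output.

```python
from huggingface_hub import model_info

# Fetch the card metadata for the repo this commit edits
# (repo id inferred from the card title; adjust if different).
info = model_info("yale-nlp/MDCureRM")

# After this commit, these fields should reflect the new front matter.
print(info.pipeline_tag)   # expected: "text-ranking"
print(info.library_name)   # expected: "transformers"
```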

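For context on the second hunk: with the multi-line f-strings now rendered correctly, the README's snippet concatenates the source documents into a single context, prepends the instruction, tokenizes the result, and scores it with MDCureRM, combining six fine-grained criteria via `reward_weights`. The following is a minimal sketch of that flow under stated assumptions: the `AutoModelForSequenceClassification` loading path and the six-logit output shape are guesses (the model card may provide a custom wrapper class), and the placeholder documents stand in for the `...` values in the diff.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Assumption: MDCureRM loads as a sequence classifier with six output heads.
# The official card may instead use a custom wrapper class.
model_name = "yale-nlp/MDCureRM"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name, torch_dtype=torch.bfloat16
).to("cuda").eval()

# Weights over the six fine-grained criteria, as in the README snippet.
reward_weights = torch.tensor([1/9, 1/9, 1/9, 2/9, 2/9, 2/9], device="cuda")

# Hypothetical documents standing in for the "..." placeholders in the diff.
source_text_1 = "Lovie Smith spoke in Champaign about changes to the defense ..."
source_text_2 = "Illinois returned several defensive starters for the 2019 season ..."
source_text_3 = "Observers noted the unit's improvement over the prior year ..."

# The corrected multi-line construction from the diff: documents separated by
# blank lines, then the instruction prepended to the combined context.
context = f"{source_text_1}\n\n{source_text_2}\n\n{source_text_3}"
instruction = ("What happened in CHAMPAIGN regarding Lovie Smith and the 2019 "
               "defense improvements? Respond with 1-2 sentences.")
input_text = f"Instruction: {instruction}\n\n{context}"

tokenized_input = tokenizer(input_text, return_tensors="pt", truncation=True).to("cuda")

with torch.no_grad():
    # Assumption: one logit per criterion, shape (1, 6).
    criterion_scores = model(**tokenized_input).logits.squeeze(0).float()

# Collapse the six per-criterion scores into a single overall reward.
overall_score = (criterion_scores * reward_weights).sum()
print("Per-criterion scores:", criterion_scores.tolist())
print(f"Weighted overall score: {overall_score.item():.4f}")
```

Since the weights sum to 1, the last three criteria count twice as much as the first three in the overall score; the criterion names and ordering are defined in the MDCure paper and are not shown in this diff.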
 
		