---
base_model: unsloth/Qwen2.5-7B-Instruct
language:
- de
- fr
- it
license: apache-2.0
tags:
- text-generation-inference
- unsloth
- qwen2
- trl
datasets:
- ipst/slds
metrics:
- bertscore
- bleu
- rouge
---

# Model Card for Qwen2.5-7B-Instruct-SLDS

## Model Summary

This model is **Qwen2.5-7B-Instruct fine-tuned on the Swiss Landmark Decisions Summarization (SLDS) dataset**.
SLDS is a multilingual dataset of **20,000 Swiss Federal Supreme Court decisions** (1954–2024), each paired with **headnotes in German, French, and Italian**, resulting in ~60,000 decision–headnote pairs.

The model is optimized for **legal abstractive summarization** and produces **concise, legally structured headnotes**.
It can be used for both **monolingual** and **cross-lingual summarization** tasks (see the usage sketch below).

This model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
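
The sketch below shows one way to generate a headnote with this model via the Transformers library. The instruction wording and generation settings are illustrative assumptions (the exact prompt format used during fine-tuning is not documented in this card); the example requests a cross-lingual German decision → French headnote.

```python
# Sketch: generate a headnote with ipst/Qwen2.5-7B-Instruct-SLDS.
# Assumption: the model uses the standard Qwen2.5 chat template; the
# instruction text below is illustrative, not the training prompt.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ipst/Qwen2.5-7B-Instruct-SLDS"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

decision_text = "..."  # full text of a (German) Swiss Federal Supreme Court decision

messages = [
    {
        "role": "user",
        "content": "Summarize the following court decision as a French headnote:\n\n"
        + decision_text,
    }
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=512, do_sample=False)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```

For a monolingual headnote, keep the instruction and the decision in the same language.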

---

## Intended Use

- **Primary Task**: Judicial summarization (decision → headnote generation).
- **Languages**: German (`de`), French (`fr`), Italian (`it`).
- **Scenarios**:
  - Monolingual summarization: e.g., German decision → German headnote.
  - Cross-lingual summarization: e.g., German decision → French headnote.
  - Legal research support: assisting in retrieval and navigation of court decisions.

**Not intended for**:

- Replacing human legal expertise.
- Serving as an authoritative legal source.
- Automated legal advice or decision-making.

---

## Training Data

- **Dataset**: [Swiss Landmark Decisions Summarization (SLDS)](https://huggingface.co/datasets/ipst/slds); a loading sketch follows this list.
- **Size**: ~20K decisions, ~60K decision–headnote pairs.
- **Splits**: Train (1954–2021), Validation (2022), Test (2023–2024).
- **Source**: [Swiss Federal Supreme Court](https://www.bger.ch).
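
The dataset can be loaded with the Hugging Face `datasets` library. This is a sketch: the split and column names are assumptions and should be checked against the dataset card.

```python
# Sketch: load the SLDS dataset; split and column names are assumptions.
from datasets import load_dataset

slds = load_dataset("ipst/slds")   # dataset repository linked above
print(slds)                        # inspect the available splits and columns
example = slds["train"][0]         # e.g. one decision–headnote pair
```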

---

## Training Procedure

- **Base Models**:
  - Qwen2.5 family (0.5B–14B)
  - Llama 3.2 (3B)
  - Phi-3.5-mini
- **Fine-tuning Objective**: Conditional generation (decision → headnote); a training sketch follows this list.
- **Evaluation Metrics**:
  - Lexical: ROUGE-1/2/L, BLEU, BERTScore.
  - Domain-specific: LLM-as-a-Judge framework (DeepSeek V3) assessing five rubrics: accuracy, completeness, clarity, legal citations, and considerations.
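
As a rough illustration of an Unsloth + TRL supervised fine-tuning setup for this task, a generic sketch is shown below. The hyperparameters, LoRA configuration, prompt formatting, and dataset column names are placeholders and assumptions, not the configuration actually used to train this model.

```python
# Illustrative SFT sketch (Unsloth + TRL). Hyperparameters, LoRA settings,
# prompt formatting, and column names are assumptions, not the actual
# training configuration; exact API details vary across library versions.
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

max_seq_length = 8192
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen2.5-7B-Instruct",
    max_seq_length=max_seq_length,
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

def to_text(example):
    # Assumed column names; replace with the real SLDS fields.
    messages = [
        {"role": "user", "content": "Write a headnote for this decision:\n\n" + example["decision"]},
        {"role": "assistant", "content": example["headnote"]},
    ]
    return {"text": tokenizer.apply_chat_template(messages, tokenize=False)}

train_ds = load_dataset("ipst/slds", split="train").map(to_text)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=train_ds,
    dataset_text_field="text",
    max_seq_length=max_seq_length,
    args=TrainingArguments(
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        learning_rate=2e-4,
        output_dir="outputs",
    ),
)
trainer.train()
```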

---

## Model Performance

On the SLDS test set (2023–2024):

| Model | Setting | BERTScore ↑ | BLEU ↑ | ROUGE-1 ↑ | ROUGE-2 ↑ | ROUGE-L ↑ | JUDGE ↑ |
|:--- |:--- |:--- |:--- |:--- |:--- |:--- |:--- |
| [Phi-3.5-mini](https://huggingface.co/ipst/Phi-3.5-mini-instruct-SLDS) | fine-tuned | 11.24 ± 3.82 | 34.84 ± 0.41 | 31.20 ± 2.08 | 14.11 ± 1.27 | 20.96 ± 1.35 | 15.25 ± 2.32 |
| [Llama 3.2 3B](https://huggingface.co/ipst/Llama-3.2-3B-Instruct-SLDS) | fine-tuned | 15.20 ± 4.40 | 21.89 ± 0.42 | 31.89 ± 2.34 | 14.87 ± 1.61 | 22.49 ± 1.60 | 18.47 ± 2.99 |
| [Qwen2.5 0.5B](https://huggingface.co/ipst/Qwen2.5-0.5B-Instruct-SLDS) | fine-tuned | -1.37 ± 3.85 | 32.20 ± 0.35 | 23.87 ± 1.68 | 9.46 ± 0.94 | 17.37 ± 1.09 | 5.80 ± 1.26 |
| [Qwen2.5 1.5B](https://huggingface.co/ipst/Qwen2.5-1.5B-Instruct-SLDS) | fine-tuned | 19.81 ± 2.72 | 36.79 ± 0.34 | 33.03 ± 1.73 | 14.14 ± 1.08 | 22.67 ± 1.13 | 15.92 ± 2.27 |
| [Qwen2.5 3B](https://huggingface.co/ipst/Qwen2.5-3B-Instruct-SLDS) | fine-tuned | 23.23 ± 2.80 | 38.42 ± 0.34 | 35.18 ± 1.79 | 15.66 ± 1.23 | 24.10 ± 1.17 | 20.31 ± 2.66 |
| [Qwen2.5 7B](https://huggingface.co/ipst/Qwen2.5-7B-Instruct-SLDS) | fine-tuned | 29.59 ± 1.97 | 41.40 ± 0.34 | 39.24 ± 1.59 | 18.26 ± 1.25 | 26.44 ± 1.15 | 28.37 ± 3.07 |
| [Qwen2.5 14B](https://huggingface.co/ipst/Qwen2.5-14B-Instruct-SLDS) | fine-tuned | **32.48 ± 1.98** | **41.80 ± 0.37** | 40.04 ± 1.74 | **19.99 ± 1.41** | **28.00 ± 1.28** | 31.38 ± 3.19 |
| GPT-4o | one-shot | 30.44 ± 1.74 | 31.89 ± 0.25 | **42.12 ± 1.79** | 18.92 ± 1.22 | 25.92 ± 1.05 | 39.70 ± 2.66 |
| Claude 3.5 Sonnet | one-shot | 5.53 ± 2.00 | 21.88 ± 0.25 | 41.86 ± 1.64 | 19.23 ± 1.19 | 27.67 ± 1.20 | 41.25 ± 2.90 |
| DeepSeek-R1 | one-shot | 20.28 ± 1.45 | 22.37 ± 0.18 | 38.30 ± 1.82 | 15.97 ± 0.85 | 21.03 ± 0.84 | **42.28 ± 2.21** |
| o3-mini | one-shot | 14.18 ± 1.31 | 20.55 ± 0.17 | 34.77 ± 1.43 | 11.92 ± 0.69 | 18.21 ± 0.67 | 34.82 ± 2.41 |

- **Lexical metrics**: Fine-tuned models generally outperform on overlap-based scores (a computation sketch follows below).
- **LLM-judge scores**: Larger proprietary and reasoning models outperform on legal precision.
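
For orientation, the lexical metrics above can be computed with the Hugging Face `evaluate` library roughly as follows. This is a sketch under assumptions: the exact scoring setup reported in the paper (e.g. the BERTScore model, baseline rescaling, and BLEU tokenization) may differ.

```python
# Sketch: ROUGE, BLEU, and BERTScore for generated headnotes with `evaluate`;
# the paper's exact metric configuration is an assumption here and may differ.
import evaluate

predictions = ["Generated headnote ..."]
references  = ["Gold headnote ..."]

rouge = evaluate.load("rouge").compute(predictions=predictions, references=references)
bleu = evaluate.load("sacrebleu").compute(
    predictions=predictions, references=[[r] for r in references]
)
bert = evaluate.load("bertscore").compute(
    predictions=predictions, references=references, lang="de"  # language of the references
)

print(rouge["rouge1"], rouge["rouge2"], rouge["rougeL"])
print(bleu["score"])
print(sum(bert["f1"]) / len(bert["f1"]))
```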

---

## Limitations

- **Language imbalance**: German decisions dominate, while Italian remains underrepresented.
- **Biases**: Headnotes reflect judicial style and conventions, not neutral summaries.
- **Evaluation mismatch**: ROUGE and BLEU may not fully capture legal accuracy.
- **Overfitting risk**: Models may overfit to formulaic headnote structures.
- **Cross-lingual difficulty**: Some models struggle with cross-lingual headnote generation.

---

## Ethical Considerations

- **Sensitive information**: All data is anonymized by the Swiss Federal Supreme Court before publication.
- **Legal risk**: Generated headnotes must not be used as official legal advice.
- **Fair use**: Ensure attribution when reusing outputs.

---

## How to Cite

If you use this model, please cite the dataset paper:

```bibtex
@article{rolshoven2025slds,
  title={Unlocking Legal Knowledge: A Multilingual Dataset for Judicial Summarization in Switzerland},
  author={Luca Rolshoven and Vishvaksenan Rasiah and Srinanda Brügger Bose and Sarah Hostettler and Lara Burkhalter and Matthias Stürmer and Joel Niklaus},
  year={2025},
  eprint={2410.13456},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2410.13456},
}
```