📝 Description
An MBart model for Russian summarization, fine-tuned for dialogue summarization.
This model was first fine-tuned by Ilya Gusev on the Gazeta dataset. We then fine-tuned that model on the SAMSum dataset translated into Russian with the Google Translate API.
🤗 Moreover, we have built a Telegram bot, @summarization_bot, that runs inference with this model. Add it to a chat and get summaries instead of dozens of spam messages! 🤗
❓ How to use with code
from transformers import AutoTokenizer, MBartForConditionalGeneration
# Download model and tokenizer
model_name = "Kirili4ik/mbart_ruDialogSum"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = MBartForConditionalGeneration.from_pretrained(model_name)
model.eval()
article_text = "..."
input_ids = tokenizer(
    [article_text],
    max_length=600,
    padding="max_length",
    truncation=True,
    return_tensors="pt",
)["input_ids"]
output_ids = model.generate(
    input_ids=input_ids,
    top_k=0,
    num_beams=3,
    no_repeat_ngram_size=3
)[0]
summary = tokenizer.decode(output_ids, skip_special_tokens=True)
print(summary)
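The snippet above takes a single string in `article_text`, while a chat usually arrives as separate messages. Below is a minimal sketch of one way to bridge that gap. It assumes the "Speaker: message" one-turn-per-line layout used by the original SAMSum corpus; whether the translated training data kept exactly this layout is not stated here. The helper names `format_dialogue` and `summarize_dialogue` are hypothetical, and the tokenizer and generation arguments simply mirror the call shown above.

```python
from typing import Iterable, Tuple


def format_dialogue(messages: Iterable[Tuple[str, str]]) -> str:
    """Join (speaker, text) pairs into one 'Speaker: text' line per message.

    Assumption: SAMSum-style layout, one turn per line.
    """
    lines = []
    for speaker, text in messages:
        # Collapse inner newlines and repeated spaces so each turn stays on one line.
        text = " ".join(text.split())
        if text:  # skip empty messages
            lines.append(f"{speaker.strip()}: {text}")
    return "\n".join(lines)


def summarize_dialogue(messages, tokenizer, model, max_length=600):
    """Summarize a chat with an already loaded tokenizer and model.

    The arguments below are copied from the usage snippet, not tuned.
    """
    article_text = format_dialogue(messages)
    input_ids = tokenizer(
        [article_text],
        max_length=max_length,
        padding="max_length",
        truncation=True,
        return_tensors="pt",
    )["input_ids"]
    output_ids = model.generate(
        input_ids=input_ids,
        top_k=0,
        num_beams=3,
        no_repeat_ngram_size=3,
    )[0]
    return tokenizer.decode(output_ids, skip_special_tokens=True)
```

With `tokenizer` and `model` loaded as shown earlier, a call could look like `summarize_dialogue([("Аня", "Привет! Как дела?"), ("Боря", "Нормально, а у тебя?")], tokenizer, model)`.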
Evaluation results
- Validation ROUGE-1 on SAMSum Corpus (translated to Russian), self-reported: 34.500
- Validation ROUGE-L on SAMSum Corpus (translated to Russian), self-reported: 33.000
- Test ROUGE-1 on SAMSum Corpus (translated to Russian), self-reported: 31.000
- Test ROUGE-L on SAMSum Corpus (translated to Russian), self-reported: 28.000
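For readers unfamiliar with the metric, here is a rough sketch of how a unigram-overlap score of this kind can be computed. It is an illustration only: the scoring tool, tokenization, and variant (F1, precision, or recall) behind the numbers above are not stated, so this code should not be expected to reproduce them. The function name `rouge_n_f1` is hypothetical, it splits on whitespace, and it returns a value in the 0 to 1 range, whereas the scores above appear to be on a 0 to 100 scale. The second metric reported above is based on the longest common subsequence and is not covered here.

```python
from collections import Counter


def rouge_n_f1(reference: str, candidate: str, n: int = 1) -> float:
    """F1 over overlapping n-grams between a reference and a candidate summary.

    Assumption: lowercase + whitespace tokenization, which is a simplification.
    """
    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    ref, cand = ngrams(reference), ngrams(candidate)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

For a real evaluation, an established scoring package with Russian-aware tokenization would be a safer choice than this sketch.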