---
license: cc-by-4.0
datasets:
- procesaur/znanje
- procesaur/Vikipedija
- procesaur/Vikizvornik
- procesaur/kisobran
- jerteh/SrpELTeC
language:
- sr
---
<img src="cover.png" class="cover">
<table style="width:100%;height:100%">
<!--tr>
<td colspan=2>
<h4><i class="highlight-container"><b class="highlight">PiloT5</b></i></h4>
</td>
</tr-->
<tr style="width:100%;height:100%">
<td width=50%>
<p>Аутоенкодер заснован на Т5 архитектури - 248 милиона параметара</p>
<p>Обучаван над корпусом српског језика - 4 милијарде речи</p>
<!--p>Једнака подршка уноса на ћирилици и латиници!</p-->
</td>
<td>
<p>T5-based autoencoder - 248 million parameters</p>
<p>Trained on Serbian corpora - 4 billion words</p>
<!--p>Equal support for Cyrillic and Latin input!</p-->
</td>
</tr>
</table>
```python
>>> from transformers import T5ForConditionalGeneration, T5TokenizerFast
>>> import torch
>>> # Load the model and its tokenizer from the Hugging Face Hub
>>> model = T5ForConditionalGeneration.from_pretrained("te-sla/pilot5")
>>> tokenizer = T5TokenizerFast.from_pretrained("te-sla/pilot5")
>>> text = "ova sekcija sadrži ideje za prioritetne pravce/teme razvoja jezičkih tehnologija (NLP) za srpski jezik. Alternativni pravci razvoja su ukratko pobrojani u odeljku H2."
>>> # Tokenize the text into PyTorch tensors
>>> inputs = tokenizer(text, return_tensors="pt")
>>> # Greedy decoding, with gradient tracking disabled for inference
>>> with torch.no_grad():
...     output = model.generate(input_ids=inputs["input_ids"], attention_mask=inputs["attention_mask"], do_sample=False, max_length=512)
...
>>> decoded_output = tokenizer.decode(output[0], skip_special_tokens=True)
>>> print(decoded_output)
```
```
ova sekcija sadrži ideje za prioritetne pravce/teme razvoja jezičkih tehnologija (NLP) za srpski jezik. Alternativni pravci razvoja su ukratko pobrojani u odeljku H2.
```
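
The example above feeds a sentence through the model and prints the reconstruction, which here matches the input exactly. For longer or noisier inputs it can help to quantify how close the reconstruction is. The following is a minimal sketch of one way to do that, not part of the model or its evaluation: the helper name `reconstruction_similarity` and the sample strings are hypothetical, and the choice of a character-level ratio from Python's standard `difflib` is an assumption.

```python
from difflib import SequenceMatcher


def reconstruction_similarity(original: str, reconstructed: str) -> float:
    """Return a character-level similarity ratio in [0.0, 1.0].

    1.0 means the two strings are identical after whitespace normalisation.
    """
    # Collapse runs of whitespace so spacing differences introduced by
    # tokenization are not counted as reconstruction errors.
    a = " ".join(original.split())
    b = " ".join(reconstructed.split())
    return SequenceMatcher(None, a, b).ratio()


# Hypothetical sample strings; in practice pass `text` and `decoded_output`
# from the usage example.
source = "Alternativni pravci razvoja su ukratko pobrojani u odeljku H2."
copy = "Alternativni pravci razvoja su ukratko pobrojani u odeljku H2."
print(reconstruction_similarity(source, copy))  # 1.0 for identical strings
```

Calling `reconstruction_similarity(text, decoded_output)` after the usage example would give a single score per input, which can be averaged over a set of sentences.
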
<table style="width:100%;height:100%">
<tr>
<td width=50%>
<h5><i><b>Евалуација на задатку сумаризације - српски језик</b></i></h5>
</td>
<td>
<h5><i><b>Evaluation on the summarization task - Serbian language</b></i></h5>
</td>
</tr>
<tr colspan=2 style="width:100%;height:100%">
<td colspan=2 >
<img src="res.png" class="cover" style="max-width:650px">
</td>
</tr>
</table>
<div class="inline-flex flex-col" style="line-height: 1.5;padding-right:50px">
<div style="text-align: center; margin-top: 3px; font-size: 16px; font-weight: 800">Author</div>
<a href="https://huggingface.co/procesaur">
<div class="flex">
<div
style="display:block; margin-left: auto; margin-right: auto; width: 92px; height:92px; border-radius: 50%;
background-size: cover; background-image: url(&#39;https://cdn-uploads.huggingface.co/production/uploads/1673534533167-63bc254fb8c61b8aa496a39b.jpeg?w=200&h=200&f=face&#39;)">
</div>
</div>
</a>
<div style="text-align: center; font-size: 16px; font-weight: 800">Mihailo Škorić</div>
<div>
<a href="https://huggingface.co/procesaur">
<div style="text-align: center; font-size: 14px;">@procesaur</div>
</a>
</div>
</div>
<div class="inline-flex flex-col" style="line-height: 1.5;padding-right:40px">
<div style="text-align: center; margin-top: 3px; font-size: 16px; font-weight: 800">Computation</div>
<a href="https://www.ai.gov.rs/">
<div class="flex">
<div
style="display:block; margin-left: auto; margin-right: auto; width: 92px; height:92px; border-radius: 50%;
background-size: contain; background-image: url(https://www.ai.gov.rs/img/logo_60x120-2.png);background-repeat: no-repeat;
background-position: center;">
</div>
</div>
</a>
<div style="text-align: center; font-size: 16px; font-weight: 800" title="nVidia DGX-zasnovan sistem">Nacionalna AI platforma</div>
<div>
<a href="https://www.ai.gov.rs/">
<div style="text-align: center; font-size: 14px;">ai.gov.rs</div>
</a>
</div>
</div>
<!--div>
## Cit.
```bibtex
@inproceedings{skorict5,
author = {Mihailo Škorić},
title = {Pilot Text to Text Transfer Transformer Model for Serbian Language},
booktitle = {ARTIFICIAL INTELLIGENCE CONFERENCE},
year = {2025},
address = {Belgrade},
publisher = {SASA, Belgrade},
url = {}
}
```
</div-->
<br/>
<br/>
<div id="zastava">
<div class="grb">
<img src="https://www.ai.gov.rs/img/logo_60x120-2.png" style="position:relative; left:30px; z-index:10; height:85px">
</div>
<table width=100% style="border:0px">
<tr style="background-color:#C6363C;width:100%;border:0px;height:30px"><td style="width:100vw"></td></tr>
<tr style="background-color:#0C4076;width:100%;border:0px;height:30px"><td></td></tr>
<tr style="background-color:#ffffff;width:100%;border:0px;height:30px"><td></td></tr>
</table>
</div>
<table style="width:100%;height:100%">
<tr style="width:100%;height:100%">
<td width=50%>
<p>Истраживање је спроведено уз подршку Фонда за науку Републике Србије, #7276, Text Embeddings – Serbian Language Applications – TESLA</p>
</td>
<td>
<p>This research was supported by the Science Fund of the Republic of Serbia, #7276, Text Embeddings - Serbian Language Applications - TESLA</p>
</td>
</tr>
</table>
<style>
.ffeat {
color:red
}
.cover {
width: 100%;
margin-bottom: 5pt
}
.highlight-container, .highlight {
position: relative;
text-decoration:none
}
.highlight-container {
display: inline-block;
}
.highlight{
color:white;
text-transform:uppercase;
font-size: 16pt;
}
.highlight-container{
padding:5px 10px
}
.highlight-container:before {
content: " ";
display: block;
height: 100%;
width: 100%;
margin-left: 0px;
margin-right: 0px;
position: absolute;
background: #e80909;
transform: rotate(2deg);
top: -1px;
left: -1px;
border-radius: 20% 25% 20% 24%;
padding: 10px 18px 18px 10px;
}
div.grb, #zastava>table {
position:absolute;
top:0px;
left: 0px;
margin:0px
}
div.grb>img, #zastava>table{
margin:0px
}
#zastava {
position: relative;
margin-bottom:120px
}
p {
font-size:14pt
}
</style>