metadata
base_model: xlm-roberta-base
datasets: CoBaLD/enhanced-cobald
language: en
library_name: transformers
license: gpl-3.0
metrics:
- accuracy
- f1
pipeline_tag: token-classification
tags:
- pytorch
model-index:
- name: CoBaLD/xlm-roberta-base-cobald-parser
  results:
  - task:
      type: token-classification
    dataset:
      name: enhanced-cobald
      type: CoBaLD/enhanced-cobald
      split: validation
    metrics:
    - type: f1
      value: 0.9257791579861928
      name: Null F1
    - type: f1
      value: 0.7644881331868549
      name: Lemma F1
    - type: f1
      value: 0.7691162750186747
      name: Morphology F1
    - type: accuracy
      value: 0.8559278875534733
      name: Ud Jaccard
    - type: accuracy
      value: 0.7969800122196037
      name: Eud Jaccard
    - type: f1
      value: 0.9984168766518762
      name: Miscs F1
    - type: f1
      value: 0.6020538395167092
      name: Deepslot F1
    - type: f1
      value: 0.5911474360181621
      name: Semclass F1
Model Card for xlm-roberta-base-cobald-parser
A transformer-based multi-head parser for CoBaLD annotation.
The model takes pre-tokenized text in CoNLL-U format and jointly labels each token with three tiers of tags:
- Grammatical tags (lemma, UPOS, XPOS, morphological features),
- Syntactic tags (basic and enhanced Universal Dependencies),
- Semantic tags (deep slot and semantic class).
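To make the three tiers concrete, here is a minimal sketch of how one annotated token line could be grouped by tier. It assumes the ten standard CoNLL-U columns followed by two extra columns for deep slot and semantic class. That column order, the helper name `split_tiers`, and the sample tag values are illustrative assumptions, not the parser's actual API or output.

```python
# Illustrative only: the column layout below is an assumption
# (10 standard CoNLL-U columns plus hypothetical DEEPSLOT and SEMCLASS
# columns); it is not taken from the parser's code.
COLUMNS = [
    "id", "form", "lemma", "upos", "xpos", "feats",
    "head", "deprel", "deps", "misc", "deepslot", "semclass",
]

# The three tag tiers named in the model card, mapped to column names.
TIERS = {
    "grammatical": ["lemma", "upos", "xpos", "feats"],
    "syntactic": ["head", "deprel", "deps"],
    "semantic": ["deepslot", "semclass"],
}


def split_tiers(line: str) -> dict:
    """Split one tab-separated token line into the three tag tiers."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} columns, got {len(values)}")
    token = dict(zip(COLUMNS, values))
    result = {"id": token["id"], "form": token["form"]}
    for tier, names in TIERS.items():
        result[tier] = {name: token[name] for name in names}
    return result


# Made-up sample token; the tag values are placeholders, not model output.
sample = "\t".join([
    "1", "Dogs", "dog", "NOUN", "NNS", "Number=Plur",
    "2", "nsubj", "2:nsubj", "_", "Agent", "ANIMAL",
])
tiers = split_tiers(sample)
print(tiers["semantic"])
```

Grouping columns this way keeps the per-tier predictions separate, which mirrors how the evaluation metrics above are reported per tag type.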
Model Sources
- Repository: https://github.com/CobaldAnnotation/CobaldParser
- Paper: https://dialogue-conf.org/wp-content/uploads/2025/04/BaiukIBaiukAPetrovaM.009.pdf
- Demo: [coming soon]
Citation
@inproceedings{baiuk2025cobald,
  title={CoBaLD Parser: Joint Morphosyntactic and Semantic Annotation},
  author={Baiuk, Ilia and Baiuk, Alexandra and Petrova, Maria},
  booktitle={Proceedings of the International Conference "Dialogue"},
  volume={I},
  year={2025}
}