Token Classification Model

This model was trained to classify tokens in geological texts. It identifies entities such as basins (BACIA), fields (CAMPO), physical structures (ESTRUTURA_FISICA), Earth fluids (FLUIDODATERRA), fossils (FOSSEIS), minerals (MINERAIS), and other domain-specific terms.


Metrics

Evaluation Results

The model was evaluated using precision, recall, and F1-score metrics on a validation dataset. Below are the results per entity:

  • BACIA: Precision: 0.92 | Recall: 0.95 | F1-Score: 0.93 | Support: 581
  • CAMPO: Precision: 0.94 | Recall: 0.84 | F1-Score: 0.89 | Support: 99
  • ESTRUTURA_FISICA: Precision: 0.87 | Recall: 0.84 | F1-Score: 0.85 | Support: 396
  • FLUIDODATERRA: Precision: 0.86 | Recall: 0.85 | F1-Score: 0.85 | Support: 339
  • FOSSEIS: Precision: 0.86 | Recall: 0.76 | F1-Score: 0.81 | Support: 336
  • MINERAIS: Precision: 0.90 | Recall: 0.87 | F1-Score: 0.88 | Support: 217
  • NAO_CONSOLID: Precision: 0.84 | Recall: 0.69 | F1-Score: 0.76 | Support: 131
  • PALEOAMBIENTE: Precision: 0.83 | Recall: 0.69 | F1-Score: 0.75 | Support: 486
  • POÇO: Precision: 0.96 | Recall: 0.94 | F1-Score: 0.95 | Support: 104
  • ROCHA: Precision: 0.91 | Recall: 0.94 | F1-Score: 0.92 | Support: 848
  • TEXTURA: Precision: 0.84 | Recall: 0.72 | F1-Score: 0.78 | Support: 29
  • UNIDADE_CRONO: Precision: 0.94 | Recall: 0.95 | F1-Score: 0.95 | Support: 1119
  • UNIDADE_LITO: Precision: 0.91 | Recall: 0.88 | F1-Score: 0.90 | Support: 468
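
As a quick sanity check on the per-entity figures, F1 is the harmonic mean of precision and recall. The sketch below recomputes it for a few rows copied from the list above. Because the reported precision and recall are rounded to two decimals, the recomputed value can differ from the reported F1 by about 0.01. The function name `f1_score` is illustrative and not part of the original evaluation code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the per-entity list above.
reported = {
    "BACIA": (0.92, 0.95, 0.93),
    "FOSSEIS": (0.86, 0.76, 0.81),
    "PALEOAMBIENTE": (0.83, 0.69, 0.75),
    "POÇO": (0.96, 0.94, 0.95),
}

for label, (p, r, f1) in reported.items():
    print(f"{label}: recomputed F1 = {f1_score(p, r):.3f}, reported = {f1:.2f}")
```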

Aggregated Metrics:

  • Micro Average: Precision: 0.90 | Recall: 0.88 | F1-Score: 0.89 | Support: 5153
  • Macro Average: Precision: 0.89 | Recall: 0.84 | F1-Score: 0.86 | Support: 5153
  • Weighted Average: Precision: 0.90 | Recall: 0.88 | F1-Score: 0.89 | Support: 5153
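
The macro and weighted averages can be approximated from the per-entity rows: the macro average is the unweighted mean over the 13 entity types, and the weighted average weights each type by its support. The sketch below does this for F1 using the rounded values reported above, so it is a consistency check rather than the original evaluation script. The micro average is omitted because it requires the raw true-positive, false-positive, and false-negative counts, which are not given here.

```python
# (F1, support) per entity type, copied from the results above.
per_entity = {
    "BACIA": (0.93, 581),
    "CAMPO": (0.89, 99),
    "ESTRUTURA_FISICA": (0.85, 396),
    "FLUIDODATERRA": (0.85, 339),
    "FOSSEIS": (0.81, 336),
    "MINERAIS": (0.88, 217),
    "NAO_CONSOLID": (0.76, 131),
    "PALEOAMBIENTE": (0.75, 486),
    "POÇO": (0.95, 104),
    "ROCHA": (0.92, 848),
    "TEXTURA": (0.78, 29),
    "UNIDADE_CRONO": (0.95, 1119),
    "UNIDADE_LITO": (0.90, 468),
}

total_support = sum(support for _, support in per_entity.values())

# Macro average: every entity type counts equally.
macro_f1 = sum(f1 for f1, _ in per_entity.values()) / len(per_entity)

# Weighted average: each entity type counts in proportion to its support.
weighted_f1 = sum(f1 * support for f1, support in per_entity.values()) / total_support

print(f"total support: {total_support}")
print(f"macro F1:      {macro_f1:.2f}")
print(f"weighted F1:   {weighted_f1:.2f}")
```

The gap between the two averages reflects class imbalance: high-support types such as UNIDADE_CRONO and ROCHA score well and pull the weighted average up, while lower-scoring types such as NAO_CONSOLID and TEXTURA weigh equally in the macro average.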

Intended Use

This model is designed for token classification (named entity recognition) in geological texts. It can be used in geology-related research, as a component of natural language processing (NLP) pipelines, and in applications that support resource extraction.
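
A minimal usage sketch, assuming the model is loaded through the Hugging Face `transformers` token-classification pipeline under the repository name `hmoreira/petrogeoner-v2-xlm-roberta-large`. The helper names `group_entities` and `tag_text` and the example sentence are illustrative. The sketch also assumes the pipeline's aggregated labels match the entity names listed in the evaluation results; check the model's configuration to confirm.

```python
from collections import defaultdict

MODEL_ID = "hmoreira/petrogeoner-v2-xlm-roberta-large"


def group_entities(predictions):
    """Group aggregated pipeline predictions into {label: [text spans]}."""
    grouped = defaultdict(list)
    for pred in predictions:
        grouped[pred["entity_group"]].append(pred["word"])
    return dict(grouped)


def tag_text(text, model_id=MODEL_ID):
    """Run the model on one text and return its entities grouped by label."""
    # Imported lazily so group_entities works without transformers installed.
    from transformers import pipeline

    ner = pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",  # merge sub-word tokens into whole spans
    )
    return group_entities(ner(text))


# Example call (downloads the model on first use; the sentence is hypothetical):
# tag_text("O poço atravessou arenitos do Cretáceo na Bacia de Campos.")
```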

Limitations

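The model is trained specifically on geological texts and may perform poorly in other domains; fine-tuning may be required for use in other fields. Within the geological domain, recall is lowest for NAO_CONSOLID and PALEOAMBIENTE (0.69 each), and TEXTURA has only 29 validation examples, so its scores are less reliable.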
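
For adaptation to another field, one possible starting point is to reload the checkpoint with a new label set and then fine-tune on annotated data from that field. The sketch below covers only the loading step. It assumes a BIO tagging scheme, which this model card does not confirm, and the helper names `build_label_maps` and `load_for_new_domain` are hypothetical.

```python
def build_label_maps(entity_types):
    """Build id/label mappings for a BIO scheme (assumed, not confirmed here)."""
    labels = ["O"]
    for entity in entity_types:
        labels += [f"B-{entity}", f"I-{entity}"]
    id2label = dict(enumerate(labels))
    label2id = {label: idx for idx, label in id2label.items()}
    return id2label, label2id


def load_for_new_domain(entity_types,
                        model_id="hmoreira/petrogeoner-v2-xlm-roberta-large"):
    """Load tokenizer and model with a fresh classification head for new labels."""
    # Imported lazily so build_label_maps works without transformers installed.
    from transformers import AutoModelForTokenClassification, AutoTokenizer

    id2label, label2id = build_label_maps(entity_types)
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForTokenClassification.from_pretrained(
        model_id,
        num_labels=len(id2label),
        id2label=id2label,
        label2id=label2id,
        # The new head has a different size than the saved one, so let
        # transformers re-initialize it instead of raising an error.
        ignore_mismatched_sizes=True,
    )
    return tokenizer, model
```

The returned model still needs training on labeled examples from the target domain before its predictions for the new labels are meaningful.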

Model Details

  • Model ID: hmoreira/petrogeoner-v2-xlm-roberta-large
  • Model size: 559M params
  • Tensor type: F32
  • Weights format: Safetensors