|
--- |
|
library_name: transformers |
|
license: apache-2.0 |
|
datasets: |
|
- RussianNLP/rucola |
|
language: |
|
- ru |
|
base_model: |
|
- deepvk/RuModernBERT-small |
|
pipeline_tag: text-classification |
|
metrics: |
|
- accuracy |
|
- matthews_correlation |
|
model-index: |
|
- name: d0rj/RuModernBERT-small-rucola |
|
results: |
|
- task: |
|
type: text-classification |
|
dataset: |
|
name: RussianNLP/rucola |
|
type: RussianNLP/rucola |
|
metrics: |
|
- name: Acc |
|
type: accuracy |
|
value: 0.70 |
|
- name: MCC |
|
type: matthews_correlation |
|
value: 0.25 |
|
source: |
|
name: RuCoLA benchmark |
|
url: https://rucola-benchmark.com/leaderboard? |
|
--- |
|
|
|
# d0rj/RuModernBERT-small-rucola

[deepvk/RuModernBERT-small](https://huggingface.co/deepvk/RuModernBERT-small) fine-tuned on [RussianNLP/rucola](https://huggingface.co/datasets/RussianNLP/rucola) for binary classification of the linguistic acceptability of Russian sentences.
|
|
|
## Usage |
|
|
|
Labels: "1" ("LABEL_1") means "acceptable", and "0" ("LABEL_0") means "unacceptable".
|
|
|
### Simple |
|
|
|
```python |
|
from transformers import pipeline


pipe = pipeline('text-classification', model='d0rj/RuModernBERT-small-rucola')
pipe(["Мне предоставилась возможность все видеть, сам оставаясь незамеченным.", "Весной в лесу очень хорошо"])
>>> [{'label': 'LABEL_0', 'score': 0.5270525217056274},
>>>  {'label': 'LABEL_1', 'score': 0.923351526260376}]
|
``` |
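
The pipeline returns the raw `LABEL_0` / `LABEL_1` ids. As a small convenience sketch (not part of the original card; the `id2name` mapping and the `classify` helper are illustrative), they can be renamed to the human-readable classes described above:

```python
from transformers import pipeline

pipe = pipeline('text-classification', model='d0rj/RuModernBERT-small-rucola')

# Rename the raw ids to the classes described above: 0 = unacceptable, 1 = acceptable
id2name = {'LABEL_0': 'unacceptable', 'LABEL_1': 'acceptable'}

def classify(texts: str | list[str]) -> list[dict]:
    # Keep the confidence score, replace the label id with its name
    return [{'label': id2name[p['label']], 'score': p['score']} for p in pipe(texts)]

classify(["Весной в лесу очень хорошо"])
>>> [{'label': 'acceptable', 'score': 0.923351526260376}]
```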
|
|
|
### Using weights |
|
|
|
```python |
|
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer


model = AutoModelForSequenceClassification.from_pretrained("d0rj/RuModernBERT-small-rucola")
tokenizer = AutoTokenizer.from_pretrained("d0rj/RuModernBERT-small-rucola")


@torch.inference_mode()
def predict(text: str | list[str], model=model, tokenizer=tokenizer) -> list[int]:
    # Tokenize with padding/truncation and move the tensors to the model's device
    inputs = tokenizer(text, return_tensors='pt', padding=True, truncation=True).to(model.device)
    outputs = model(**inputs)
    # Softmax over the two classes, then argmax: 0 = unacceptable, 1 = acceptable
    probs = torch.nn.functional.softmax(outputs.logits, dim=-1)
    return probs.cpu().argmax(dim=-1).numpy().tolist()


predict(["Мне предоставилась возможность все видеть, сам оставаясь незамеченным.", "Весной в лесу очень хорошо"])
>>> [0, 1]
|
``` |
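
If probabilities are more useful than hard labels, a small variant of the function above (a sketch, not from the original card; the name `predict_proba` is hypothetical) can return the softmax scores for both classes and run on GPU when one is available. It reuses the `model` and `tokenizer` loaded in the previous snippet:

```python
import torch

# Move the already-loaded model to GPU when one is available
model = model.to('cuda' if torch.cuda.is_available() else 'cpu')

@torch.inference_mode()
def predict_proba(text: str | list[str], model=model, tokenizer=tokenizer) -> list[list[float]]:
    inputs = tokenizer(text, return_tensors='pt', padding=True, truncation=True).to(model.device)
    # Each row holds [P(unacceptable), P(acceptable)] for one input
    return torch.nn.functional.softmax(model(**inputs).logits, dim=-1).cpu().numpy().tolist()

predict_proba("Весной в лесу очень хорошо")
```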
|
|
|
## Metrics |
|
|
|
| model | accuracy | MCC | parameters |
|
| ---- | -------- | --- | ------------------ | |
|
| [d0rj/RuModernBERT-small-rucola](https://huggingface.co/d0rj/RuModernBERT-small-rucola) | 0.7 | 0.25 | 34.5M | |
|
| [RussianNLP/ruRoBERTa-large-rucola](https://huggingface.co/RussianNLP/ruRoBERTa-large-rucola) | 0.82 | 0.56 | 355M | |
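
For a rough local sanity check of these figures, the public validation data can be scored with `scikit-learn`. The snippet below is a sketch, not the official leaderboard protocol; the split name `validation` and the `sentence` / `acceptable` columns are assumptions about the RussianNLP/rucola dataset layout rather than facts from this card:

```python
from datasets import load_dataset
from sklearn.metrics import accuracy_score, matthews_corrcoef
from transformers import pipeline

pipe = pipeline('text-classification', model='d0rj/RuModernBERT-small-rucola', batch_size=32)

# Split and column names are assumptions about the dataset layout
val = load_dataset('RussianNLP/rucola', split='validation')

# LABEL_0 / LABEL_1 -> 0 / 1
preds = [int(p['label'].split('_')[-1]) for p in pipe(val['sentence'], truncation=True)]

print('accuracy:', accuracy_score(val['acceptable'], preds))
print('MCC:', matthews_corrcoef(val['acceptable'], preds))
```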
|
|
|
## Training |
|
|
|
See the [raw Weights & Biases logs](https://wandb.ai/d0rj/rucola_small) or the [short report](https://wandb.ai/d0rj/rucola_small/reports/RuModernBERT-small-rucola--VmlldzoxMTcyNDgzNg) for training details.
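
The exact recipe is in the logs linked above. For orientation only, a minimal fine-tuning sketch with the `Trainer` API might look as follows; the hyperparameters, split and column names below are assumptions, not the values used for this checkpoint:

```python
import numpy as np
from datasets import load_dataset
from sklearn.metrics import accuracy_score, matthews_corrcoef
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained('deepvk/RuModernBERT-small')
model = AutoModelForSequenceClassification.from_pretrained('deepvk/RuModernBERT-small', num_labels=2)

# Split and column names are assumptions about the dataset layout
ds = load_dataset('RussianNLP/rucola')

def tok(batch):
    return tokenizer(batch['sentence'], truncation=True)

train = ds['train'].map(tok, batched=True).rename_column('acceptable', 'labels')
dev = ds['validation'].map(tok, batched=True).rename_column('acceptable', 'labels')

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {'accuracy': accuracy_score(labels, preds),
            'matthews_correlation': matthews_corrcoef(labels, preds)}

args = TrainingArguments(
    output_dir='rumodernbert-small-rucola',
    learning_rate=3e-5,               # assumed, not the reported value
    num_train_epochs=3,               # assumed
    per_device_train_batch_size=32,   # assumed
    eval_strategy='epoch',
    report_to='wandb',
)

Trainer(
    model=model,
    args=args,
    train_dataset=train,
    eval_dataset=dev,
    processing_class=tokenizer,       # enables the default padding collator
    compute_metrics=compute_metrics,
).train()
```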