
---
language: vi
tags:
- spam-detection
- vietnamese
- phobert
license: apache-2.0
datasets:
- visolex/ViSpamReviews
metrics:
- accuracy
- f1
model-index:
- name: phobert-spam-classification
  results:
  - task:
      type: text-classification
      name: Spam Detection (Multi-Class)
    dataset:
      name: ViSpamReviews
      type: custom
    metrics:
    - name: Accuracy
      type: accuracy
      value: <INSERT_ACCURACY>
    - name: F1 Score
      type: f1
      value: <INSERT_F1_SCORE>
base_model:
- vinai/phobert-base
pipeline_tag: text-classification
---

# PhoBERT-Spam-MultiClass

Fine-tuned from [`vinai/phobert-base`](https://huggingface.co/vinai/phobert-base) on the ViSpamReviews dataset for multi-class spam detection in Vietnamese reviews.

* Task: 4-way text classification (`NO-SPAM`, `SPAM-1`, `SPAM-2`, `SPAM-3`)
* Dataset: [ViSpamReviews](https://huggingface.co/datasets/visolex/ViSpamReviews)
* Hyperparameters
  * Batch size: 32
  * Learning rate: 3e-5
  * Epochs: 100
  * Max sequence length: 256
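
The hyperparameters above could be written down as a training configuration. This is a minimal sketch only: the card does not say how training was run, so the use of `transformers.TrainingArguments` keyword names, the `output_dir` path, and the single-device batch size are all assumptions.

```python
# Hyperparameters as listed in this model card.
MAX_SEQ_LEN = 256  # passed to the tokenizer as max_length, not to the trainer

# Keyword names follow transformers.TrainingArguments; output_dir is a
# hypothetical path, and a single-device run is assumed for the batch size.
training_kwargs = {
    "output_dir": "phobert-spam-checkpoints",
    "per_device_train_batch_size": 32,
    "learning_rate": 3e-5,
    "num_train_epochs": 100,
}

# With transformers installed, these could be passed as:
#   from transformers import TrainingArguments
#   args = TrainingArguments(**training_kwargs)
print(training_kwargs)
```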

## Usage

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "visolex/phobert-spam-classification"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# English: "Only PR for the brand, not a real review."
text = "Chỉ PR thương hiệu chứ không review thật."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)

with torch.no_grad():  # gradients are not needed for inference
    pred = model(**inputs).logits.argmax(dim=-1).item()

# Map the predicted class index to its label name
label_map = {0: "NO-SPAM", 1: "SPAM-1", 2: "SPAM-2", 3: "SPAM-3"}
print(label_map[pred])
```
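
The usage example returns only the top label. If per-class probabilities are wanted, a softmax over the logits gives them. The sketch below is plain Python so it runs without downloading the model; `logits_to_probs` is a hypothetical helper name, and the example logits are made-up values, not real model output.

```python
import math

# Label mapping taken from the usage example in this card.
LABEL_MAP = {0: "NO-SPAM", 1: "SPAM-1", 2: "SPAM-2", 3: "SPAM-3"}

def logits_to_probs(logits):
    """Softmax over one row of logits; returns {label: probability}."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return {LABEL_MAP[i]: e / total for i, e in enumerate(exps)}

# Illustrative logits only (not real model output).
example_logits = [0.2, 2.5, -1.0, 0.1]
probs = logits_to_probs(example_logits)
best = max(probs, key=probs.get)
print(best, round(probs[best], 3))
```

With the real model, the input row would come from `model(**inputs).logits[0].tolist()`.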