---
license: mit
language: tr
tags:
- text-classification
- turkish
- acceptability
- nlp
- galatasaray-university
pipeline_tag: text-classification
---
# helizac/distilbert-pair-acceptability
This model is a fine-tuned version of `dbmdz/distilbert-base-turkish-cased` for classifying the acceptability of a Turkish text output given a Turkish text input.
It was developed as part of the study "Evaluation of the Acceptability of Model Outputs" (Galatasaray University, May 2025).
## Model Description
The model takes a pair of Turkish texts (an "input" and an "output") and predicts whether the "output" is an acceptable response to the "input". Acceptability in this context considers factors like relevance, coherence, and basic linguistic quality (e.g., no severe typos, no nonsensical repetitions, no injected toxicity based on training data). It does not perform deep fact-checking or complex ethical reasoning beyond what was learnable from the synthetic "unacceptable" data.
This model is based on `dbmdz/distilbert-base-turkish-cased` and achieved **79% accuracy** on a manually curated Turkish test set.
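To make the pairing scheme concrete, here is a minimal sketch of how the tokenizer packs an input/output pair into a single sequence for the model (the example texts are illustrative, and the exact subword split depends on the vocabulary):

```python
from transformers import AutoTokenizer

# The tokenizer packs the pair as: [CLS] input [SEP] output [SEP]
tok = AutoTokenizer.from_pretrained("dbmdz/distilbert-base-turkish-cased")
enc = tok("Nasılsın?", "İyiyim, teşekkürler.")
print(tok.convert_ids_to_tokens(enc["input_ids"]))
```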
## Intended Uses & Limitations
**Intended Use:**
* As a lightweight filter to quickly assess if a model-generated Turkish output is plausible in response to a given input.
* To help in curating datasets for training larger language models by identifying potentially problematic input-output pairs.
**Limitations:**
* The model was trained with a `max_length` of 64 tokens for the combined input and output. Longer texts will be truncated.
* The "unacceptable" training data was synthetically generated (e.g., typos, toxic word injection, repetitions, mismatched outputs). While effective, it may not cover all nuances of real-world unacceptability (e.g., subtle factual errors, complex irrelevance not captured by simple mismatching).
* It predicts a binary label ("kabul edilebilir" / "kabul edilemez", i.e. acceptable / unacceptable) and does not provide detailed reasons for unacceptability (see the snippet after this list for how to inspect the label mapping).
* Performance is specific to Turkish.
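Before relying on these labels, it can help to confirm the label mapping stored in the model config and to check whether a pair fits the 64-token budget. A minimal sketch (note: depending on how the model was saved, the hosted config may expose generic names such as `LABEL_0`/`LABEL_1`):

```python
from transformers import AutoConfig, AutoTokenizer

config = AutoConfig.from_pretrained("helizac/distilbert-pair-acceptability")
print(config.id2label)  # mapping used in this card: 0 -> "kabul edilemez", 1 -> "kabul edilebilir"

tok = AutoTokenizer.from_pretrained("helizac/distilbert-pair-acceptability")
n_tokens = len(tok("input text", "output text")["input_ids"])
print(n_tokens <= 64)  # pairs longer than 64 tokens are truncated
```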
## How to Use
```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

TOKEN_KEY = "YOUR_HF_TOKEN_HERE"  # Replace with your Hugging Face token or set to None
MODEL_NAME = "helizac/distilbert-pair-acceptability"
MAX_LENGTH = 64

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

_tokenizer_cache = {}

def get_tokenizer(tokenizer_name: str, token: str = None):
    # Cache tokenizers so repeated calls do not reload from disk or the Hub.
    if tokenizer_name not in _tokenizer_cache:
        _tokenizer_cache[tokenizer_name] = AutoTokenizer.from_pretrained(tokenizer_name, token=token)
    return _tokenizer_cache[tokenizer_name]

def load_model_and_tokenizer(model_name: str, token: str = None):
    tokenizer = get_tokenizer(model_name, token=token)
    model = AutoModelForSequenceClassification.from_pretrained(model_name, token=token)
    model.to(device)
    model.eval()
    return model, tokenizer

def count_tokens(text: str, tokenizer_name: str, token: str = None, add_special_tokens: bool = False) -> int:
    tokenizer = get_tokenizer(tokenizer_name, token=token)
    encoded_input = tokenizer(text, add_special_tokens=add_special_tokens)
    return len(encoded_input["input_ids"])

def predict_pair_acceptability(input_text: str, output_text: str, model, tokenizer, device, max_length: int):
    model.eval()
    input_tok_count = count_tokens(input_text, tokenizer.name_or_path, token=TOKEN_KEY if TOKEN_KEY else None)
    output_tok_count = count_tokens(output_text, tokenizer.name_or_path, token=TOKEN_KEY if TOKEN_KEY else None)
    # Three positions are reserved for the special tokens [CLS], [SEP], [SEP].
    if input_tok_count + output_tok_count > max_length - 3:
        print(f"Warning: input ({input_tok_count}) + output ({output_tok_count}) tokens exceed the "
              f"effective max_length ({max_length - 3}). Truncation will occur, primarily on the output.")
    try:
        encoding = tokenizer(
            text=input_text,
            text_pair=output_text,
            add_special_tokens=True,
            return_tensors="pt",
            max_length=max_length,
            padding="max_length",
            truncation=True,
        )
        input_ids = encoding["input_ids"].to(device)
        attention_mask = encoding["attention_mask"].to(device)
        # DistilBERT does not use token_type_ids, so this is normally None.
        token_type_ids = encoding.get("token_type_ids")
        with torch.no_grad():
            if token_type_ids is not None and model.config.model_type not in ["roberta"]:
                outputs = model(input_ids=input_ids, attention_mask=attention_mask,
                                token_type_ids=token_type_ids.to(device))
            else:
                outputs = model(input_ids=input_ids, attention_mask=attention_mask)
        logits = outputs.logits
        probs = torch.softmax(logits, dim=-1)
        prediction_index = torch.argmax(probs, dim=1).item()
        confidence = probs[0, prediction_index].item()
        label_map = {0: "kabul edilemez", 1: "kabul edilebilir"}  # 0: unacceptable, 1: acceptable
        return label_map[prediction_index], confidence
    except Exception as e:
        print(f"Error during prediction for input '{input_text[:50]}...' / output '{output_text[:50]}...': {e}")
        return f"Error: {e}", 0.0

model, tokenizer = load_model_and_tokenizer(MODEL_NAME, token=TOKEN_KEY)

# Example 1: Acceptable
input_text_1 = "Dün satın aldığım kıyafeti beğendin mi?"
output_text_1 = "Evet, çok güzel!"
prediction_1, confidence_1 = predict_pair_acceptability(input_text_1, output_text_1, model, tokenizer, device, MAX_LENGTH)
print(f"Input: {input_text_1}\nOutput: {output_text_1}\nPrediction: {prediction_1} (Confidence: {confidence_1:.4f})\n")

# Example 2: Unacceptable (irrelevant)
input_text_2 = "Dün satın aldığım kıyafeti beğendin mi?"
output_text_2 = "Elmalar çok güzel!"
prediction_2, confidence_2 = predict_pair_acceptability(input_text_2, output_text_2, model, tokenizer, device, MAX_LENGTH)
print(f"Input: {input_text_2}\nOutput: {output_text_2}\nPrediction: {prediction_2} (Confidence: {confidence_2:.4f})\n")

# Example 3: Unacceptable (grammatically poor)
input_text_3 = "Hayalindeki meslek ne büyük."
output_text_3 = "Olmak ben istemek büyük."
prediction_3, confidence_3 = predict_pair_acceptability(input_text_3, output_text_3, model, tokenizer, device, MAX_LENGTH)
print(f"Input: {input_text_3}\nOutput: {output_text_3}\nPrediction: {prediction_3} (Confidence: {confidence_3:.4f})\n")
```
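If you only need quick predictions, the `pipeline` API offers a shorter path. A sketch assuming a recent `transformers` version, where text pairs are passed as a dict with `text` and `text_pair` keys:

```python
from transformers import pipeline

clf = pipeline("text-classification", model="helizac/distilbert-pair-acceptability")
# Pairs are passed with explicit "text" and "text_pair" keys.
print(clf({"text": "Dün satın aldığım kıyafeti beğendin mi?", "text_pair": "Evet, çok güzel!"}))
```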
## Training Data
The model was fine-tuned on a dataset of approximately 460,000 Turkish input-output text pairs.
"Acceptable" pairs (\~132,000) were sourced from various public Turkish NLP datasets.
"Unacceptable" pairs (\~328,000) were synthetically generated by applying rule-based corruptions (typos, toxic word injection, repetition, mismatched outputs) to the acceptable outputs.
All pairs were truncated/padded to a maximum sequence length of 64 tokens for the combined input and output.
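The corruption scripts themselves are not published with this card; the following is a hypothetical sketch of what such rule-based corruptions could look like (toxic-word injection is omitted here, and the actual generation code may differ):

```python
import random

def corrupt_output(output_text: str, all_outputs: list) -> str:
    """Hypothetical corruption rules for generating 'unacceptable' outputs."""
    rule = random.choice(["typo", "repeat", "mismatch"])
    if rule == "typo" and len(output_text) > 2:
        # Swap two adjacent characters to simulate a typo.
        i = random.randrange(len(output_text) - 1)
        chars = list(output_text)
        chars[i], chars[i + 1] = chars[i + 1], chars[i]
        return "".join(chars)
    if rule == "repeat":
        # Duplicate a random word to simulate nonsensical repetition.
        words = output_text.split()
        if words:
            i = random.randrange(len(words))
            words.insert(i, words[i])
        return " ".join(words)
    # "mismatch": replace the output with one drawn from a different pair.
    return random.choice(all_outputs)
```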
## Evaluation Results
On a manually curated, independent Turkish test set (89 pairs evaluated, given the 64-token limit), this model (helizac/distilbert-pair-acceptability) achieved an accuracy of **79%**.
The stress test for this model showed:
* Average time per example: 0.0040 seconds
* Examples per second (throughput): 250 examples/sec
* (Tested on T4 GPU)
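These figures depend on hardware. A rough way to reproduce a throughput estimate, reusing `predict_pair_acceptability` and the objects loaded in the "How to Use" snippet (run once beforehand to warm up the GPU for stable numbers):

```python
import time

pairs = [("Dün satın aldığım kıyafeti beğendin mi?", "Evet, çok güzel!")] * 100
start = time.perf_counter()
for inp, out in pairs:
    predict_pair_acceptability(inp, out, model, tokenizer, device, MAX_LENGTH)
elapsed = time.perf_counter() - start
print(f"Avg time/example: {elapsed / len(pairs):.4f} s "
      f"({len(pairs) / elapsed:.0f} examples/sec)")
```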
## Citation
This model was developed as part of the following:
Erdi, F. (2025). *Model Çıktılarının Kabul Edilebilirliğinin Değerlendirilmesi* (Evaluation of the Acceptability of Model Outputs). T.C. Galatasaray Üniversitesi, Mühendislik ve Teknoloji Fakültesi.
Advisor: Asst. Prof. (Dr. Öğr. Üyesi) İsmail Burak Parlak.