BioASQ Yes/No Question Classifier
Model Details
- Model architecture: BERT
- Pretrained base: microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext
- Fine-tuned on: BioASQ Phase B Yes/No question dataset
- Task type: Binary classification (Yes/No)
- Input format: Concatenated question and supporting context passages
- Output: Probability distribution over two classes ("Yes", "No")
- Tokenizer: WordPiece, inherited from the PubMedBERT base model
Dataset
- Name: BioASQ Task B Phase B Yes/No dataset
- Domain: Biomedical question answering
- Data format: Each sample consists of a yes/no question paired with one or more relevant context snippets extracted from biomedical abstracts
- Split: Standard train/dev split from BioASQ
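The model takes a single context string, so a sample with several snippets has to be merged into one string before tokenization. The following is a minimal sketch of one way to do that. The helper name `join_snippets` and the choice to join snippets with a single space are assumptions for illustration, not the preprocessing that was used to train this model.

```python
from typing import Iterable


def join_snippets(snippets: Iterable[str]) -> str:
    """Merge context snippets into one string (hypothetical helper).

    Whitespace inside each snippet is normalized, empty snippets are
    dropped, and exact duplicates are kept only once, in first-seen order.
    """
    seen = set()
    cleaned = []
    for snippet in snippets:
        text = " ".join(snippet.split())  # collapse runs of whitespace
        if text and text not in seen:
            seen.add(text)
            cleaned.append(text)
    return " ".join(cleaned)
```

The resulting string can be passed as the `context` argument in the usage example. Long merged contexts are still subject to the tokenizer's truncation.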
Performance
| Metric | Value |
|---|---|
| Accuracy | 91.44% |
| F1 Score | 89.36% |
Evaluation performed on the BioASQ dev set.
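For reference, both metrics can be computed from gold and predicted labels as sketched below. This is an illustrative pure-Python version, not the evaluation script behind the reported numbers. It assumes F1 is measured on the "Yes" class. The card does not state whether the reported F1 is binary or macro-averaged, so results from this sketch may not be directly comparable.

```python
from typing import Sequence, Tuple


def accuracy_and_f1(gold: Sequence[str], pred: Sequence[str],
                    positive: str = "Yes") -> Tuple[float, float]:
    """Return (accuracy, F1 of the positive class) for two label lists."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and equally long")
    correct = sum(g == p for g, p in zip(gold, pred))
    tp = sum(g == positive and p == positive for g, p in zip(gold, pred))
    fp = sum(g != positive and p == positive for g, p in zip(gold, pred))
    fn = sum(g == positive and p != positive for g, p in zip(gold, pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return correct / len(gold), f1


# Toy labels, made up purely to exercise the function.
gold = ["Yes", "Yes", "No", "No"]
pred = ["Yes", "No", "No", "Yes"]
acc, f1 = accuracy_and_f1(gold, pred)
```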
Usage Example
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the fine-tuned tokenizer and classifier from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("tmt3103/BioASQ-yesno-PudMedBERT")
model = AutoModelForSequenceClassification.from_pretrained("tmt3103/BioASQ-yesno-PudMedBERT")

def predict_yesno(question: str, context: str) -> str:
    # Encode the question and context as a sentence pair, truncating if too long.
    inputs = tokenizer(question, context, truncation=True, padding=True, return_tensors="pt")
    with torch.no_grad():  # gradients are not needed for inference
        outputs = model(**inputs)
    probs = torch.nn.functional.softmax(outputs.logits, dim=-1)
    # Index 1 is read as "Yes" and index 0 as "No".
    return "Yes" if probs[0][1] > probs[0][0] else "No"

question = "Does aspirin reduce inflammation?"
context = "Aspirin is widely used as an anti-inflammatory medication in clinical practice."
print(f"Question: {question}\nPredicted answer: {predict_yesno(question, context)}")
Future Work & Maintenance
- Retrain regularly with updated BioASQ datasets to maintain relevance.
- Implement uncertainty estimation for safer decision support.
- Expand to multi-class or multi-label biomedical QA tasks.
- Optimize for deployment efficiency and latency reduction.
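As one simple starting point for the uncertainty estimation item above, the classifier could abstain when its top softmax probability is low. The sketch below is only an illustration under stated assumptions: the function name `answer_or_abstain`, the `"Uncertain"` label, and the 0.75 threshold are all hypothetical, and raw softmax confidence is a rough proxy rather than a calibrated uncertainty estimate. It assumes the logit order [No, Yes] used in the usage example.

```python
import math
from typing import Sequence, Tuple


def answer_or_abstain(logits: Sequence[float],
                      threshold: float = 0.75) -> Tuple[str, float]:
    """Map two logits ordered [No, Yes] to a label, or abstain if unsure."""
    if len(logits) != 2:
        raise ValueError("expected exactly two logits: [No, Yes]")
    # Numerically stable softmax: subtract the max before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    p_no, p_yes = (e / total for e in exps)
    confidence = max(p_no, p_yes)
    if confidence < threshold:
        return "Uncertain", confidence  # hypothetical abstention label
    return ("Yes" if p_yes > p_no else "No"), confidence
```

With the model from the usage example, the input could be obtained as `outputs.logits[0].tolist()`. A suitable threshold would need to be chosen on held-out data.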
Contact & Support
For questions, issues, or collaboration inquiries, please contact:
- Author / Maintainer: Minh Tien
- Email: [email protected]
- GitHub: TMTien31