AntiFP2: Fine-tuned ESM2 Antifungal Protein Classifier

This repository contains a fine-tuned ESM2 model that classifies proteins as antifungal or non-antifungal from their amino acid sequences.

Model Description

  • Base Model: ESM2-t36-3B-UR50D (Fine-tuned)
  • Fine-tuning Task: Binary antifungal protein classification.
  • Architecture: ESM2 backbone with a linear classification head.
  • Input: Protein amino acid sequences.
  • Output: Binary labels (0 = non-antifungal, 1 = antifungal).

Repository Contents

  • pytorch_model.bin: Trained model weights.
  • alphabet.bin: ESM2 alphabet (tokenizer).
  • config.json: Model configuration.
  • README.md: This file.

Usage

Installation

Install the required Python packages. The esm module imported below is distributed on PyPI as fair-esm:

pip install torch fair-esm biopython huggingface_hub

Loading the Model from Hugging Face

import torch
import torch.nn as nn
import esm
from huggingface_hub import hf_hub_download
import json

# Define the classifier architecture (must match training)
class ProteinClassifier(nn.Module):
    def __init__(self, esm_model, embedding_dim, num_classes):
        super().__init__()
        self.esm_model = esm_model
        self.fc = nn.Linear(embedding_dim, num_classes)

    def forward(self, tokens):
        # No gradients are computed through the ESM2 backbone
        with torch.no_grad():
            results = self.esm_model(tokens, repr_layers=[36])
        # Mean-pool the layer-36 token representations over the sequence axis
        embeddings = results["representations"][36].mean(1)
        return self.fc(embeddings)

# Download model files from Hugging Face Hub
repo_id = "raghavagps-group/antifp2"  # this model's repository on the Hub
model_weights_path = hf_hub_download(repo_id=repo_id, filename="pytorch_model.bin")
alphabet_path = hf_hub_download(repo_id=repo_id, filename="alphabet.bin")
config_path = hf_hub_download(repo_id=repo_id, filename="config.json")

# Load ESM2 backbone model
esm_model, alphabet = esm.pretrained.esm2_t36_3B_UR50D()

# Load configuration
with open(config_path, 'r') as f:
    config = json.load(f)

# Initialize classifier
classifier = ProteinClassifier(esm_model, embedding_dim=config['embedding_dim'], num_classes=config['num_classes'])

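# Load weights (mapped to CPU first so this also works without a GPU)
classifier.load_state_dict(torch.load(model_weights_path, map_location="cpu"))
classifier.eval()

# Load alphabet tokenizer. The file holds a pickled object, so recent PyTorch
# versions need weights_only=False; only do this for files you trust.
alphabet = torch.load(alphabet_path, weights_only=False)
batch_converter = alphabet.get_batch_converter()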

# Move model to GPU if available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
classifier = classifier.to(device)

Input Format

Input sequences must be provided as amino acid strings using standard single-letter codes.
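
A small pre-check along these lines can catch malformed input before tokenization. This is a sketch and not part of the repository: the helper name clean_sequence is hypothetical, and rejecting ambiguity codes such as X or B is an assumption, since the text above only says that standard single-letter codes are expected.

```python
# The 20 standard amino acids in single-letter code.
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")


def clean_sequence(seq):
    """Strip whitespace, uppercase, and reject non-standard residue letters."""
    seq = "".join(seq.split()).upper()
    if not seq:
        raise ValueError("empty sequence")
    bad = sorted(set(seq) - STANDARD_AA)
    if bad:
        raise ValueError(f"non-standard residues: {''.join(bad)}")
    return seq
```

For example, clean_sequence("mkv lt") returns "MKVLT", while a sequence containing X raises ValueError.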

Output

The model outputs two logits per sequence, one for each class, which can be converted to probabilities with softmax. A sequence is labeled antifungal (1) if its class-1 probability exceeds a chosen threshold (e.g., 0.5).
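
The loading code and the output description can be tied together in a short inference helper. This is a minimal sketch rather than the authors' inference script: the function name predict and its return format are hypothetical, and it assumes batch_converter behaves like the standard ESM batch converter, taking a list of (name, sequence) tuples and returning (names, strings, tokens).

```python
import torch


def predict(classifier, batch_converter, sequences, device="cpu", threshold=0.5):
    """Return (name, P(antifungal), label) for each (name, sequence) pair."""
    # Tokenize; tokens is a padded integer tensor of shape (batch, length).
    names, _, tokens = batch_converter(sequences)
    tokens = tokens.to(device)
    with torch.no_grad():
        logits = classifier(tokens)  # shape: (batch, 2)
    # Softmax over the class axis, then keep the class-1 (antifungal) column.
    probs = torch.softmax(logits, dim=-1)[:, 1]
    return [(n, p.item(), int(p.item() > threshold)) for n, p in zip(names, probs)]
```

With the objects created in the loading code, a call might look like predict(classifier, batch_converter, [("seq1", "MKTAYIAKQR")], device=device), where the sequence is an arbitrary placeholder. Because the classifier's forward pass averages over every token position, padding included, a sequence's score can depend on what it is batched with; passing one sequence per call avoids that.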

