Pyrosage pKa_acidic AttentiveFP Model
Model Description
This is an AttentiveFP (attention-based fingerprint) graph neural network trained to predict the acid dissociation constant (pKa) of acidic groups. The pKa is the pH at which an acidic functional group is half deprotonated, so it determines a molecule's ionization state at a given pH and, through that, properties such as bioavailability. The model takes a SMILES string as input, converts it to a molecular graph, and predicts the pKa directly from the structure.
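To make the link between pKa and ionization state concrete, the Henderson-Hasselbalch relation gives the fraction of a monoprotic acidic group that is deprotonated at a given pH. The sketch below is illustrative only: the function name `fraction_deprotonated` and the example pKa of 3.5 are assumptions, not outputs of this model.

```python
def fraction_deprotonated(pka: float, ph: float) -> float:
    """Fraction of a monoprotic acid in its deprotonated (A-) form at a given pH.

    Derived from Henderson-Hasselbalch: pH = pKa + log10([A-]/[HA]).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# Hypothetical pKa of 3.5, evaluated at a few pH values.
for ph in (1.5, 3.5, 7.4):
    print(f"pH {ph}: {fraction_deprotonated(3.5, ph):.4f} deprotonated")
```

At pH equal to the pKa the group is exactly half deprotonated; two pH units above it, the group is about 99% deprotonated.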
Model Details
- Model Type: AttentiveFP (Graph Neural Network)
- Task: Regression
- Input: SMILES strings (molecular representations)
- Output: Continuous numerical value (the predicted pKa)
- Framework: PyTorch Geometric
- Architecture: AttentiveFP with enhanced atom and bond features
Hyperparameters
{
  "name": "larger_model",
  "hidden_channels": 128,
  "num_layers": 3,
  "num_timesteps": 3,
  "dropout": 0.1,
  "learning_rate": 0.0005,
  "weight_decay": 0.0001,
  "batch_size": 32,
  "epochs": 50,
  "patience": 10
}
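The card does not describe the training loop, so how `epochs` and `patience` interact is not stated. A common reading is early stopping: train for at most `epochs` epochs and stop once the validation loss has not improved for `patience` consecutive epochs. The sketch below illustrates that reading in plain Python; the `EarlyStopping` class and the loss values are hypothetical, not taken from the Pyrosage training code.

```python
class EarlyStopping:
    """Signal a stop after `patience` consecutive epochs without improvement."""

    def __init__(self, patience: int = 10):
        self.patience = patience
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss: float) -> bool:
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Made-up validation losses: improvement stalls after the third epoch.
losses = [1.00, 0.80, 0.70, 0.71, 0.72, 0.70, 0.73]
stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, loss in enumerate(losses, start=1):
    if stopper.step(loss):
        stopped_at = epoch
        break
print(f"stopped at epoch {stopped_at}, best loss {stopper.best_loss}")
```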
Usage
Installation
pip install torch torch-geometric rdkit
Loading the Model
import torch
from torch_geometric.nn import AttentiveFP
from rdkit import Chem
from torch_geometric.data import Data

# Load the checkpoint (weights plus the hyperparameters used for training)
model_dict = torch.load('pytorch_model.pt', map_location='cpu')
state_dict = model_dict['model_state_dict']
hyperparams = model_dict['hyperparameters']

# Create the model with the same architecture as the checkpoint
model = AttentiveFP(
    in_channels=10,  # Enhanced atom features
    hidden_channels=hyperparams["hidden_channels"],
    out_channels=1,
    edge_dim=6,  # Enhanced bond features
    num_layers=hyperparams["num_layers"],
    num_timesteps=hyperparams["num_timesteps"],
    dropout=hyperparams["dropout"],
)
model.load_state_dict(state_dict)
model.eval()  # Disable dropout for inference
Making Predictions
def smiles_to_data(smiles):
    """Convert SMILES string to PyG Data object"""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None

    # Enhanced atom features (10 dimensions)
    atom_features = []
    for atom in mol.GetAtoms():
        features = [
            atom.GetAtomicNum(),
            atom.GetTotalDegree(),
            atom.GetFormalCharge(),
            atom.GetTotalNumHs(),
            atom.GetNumRadicalElectrons(),
            int(atom.GetIsAromatic()),
            int(atom.IsInRing()),
            # Hybridization as one-hot (3 dimensions)
            int(atom.GetHybridization() == Chem.rdchem.HybridizationType.SP),
            int(atom.GetHybridization() == Chem.rdchem.HybridizationType.SP2),
            int(atom.GetHybridization() == Chem.rdchem.HybridizationType.SP3)
        ]
        atom_features.append(features)
    x = torch.tensor(atom_features, dtype=torch.float)

    # Enhanced bond features (6 dimensions)
    edges_list = []
    edge_features = []
    for bond in mol.GetBonds():
        i = bond.GetBeginAtomIdx()
        j = bond.GetEndAtomIdx()
        # Add both directions so the graph is undirected
        edges_list.extend([[i, j], [j, i]])
        features = [
            # Bond type as one-hot (4 dimensions)
            int(bond.GetBondType() == Chem.rdchem.BondType.SINGLE),
            int(bond.GetBondType() == Chem.rdchem.BondType.DOUBLE),
            int(bond.GetBondType() == Chem.rdchem.BondType.TRIPLE),
            int(bond.GetBondType() == Chem.rdchem.BondType.AROMATIC),
            # Additional features (2 dimensions)
            int(bond.GetIsConjugated()),
            int(bond.IsInRing())
        ]
        edge_features.extend([features, features])

    # Molecules without bonds (e.g. single atoms) are not supported
    if not edges_list:
        return None

    edge_index = torch.tensor(edges_list, dtype=torch.long).t()
    edge_attr = torch.tensor(edge_features, dtype=torch.float)
    return Data(x=x, edge_index=edge_index, edge_attr=edge_attr)


def predict(model, smiles):
    """Make prediction for a SMILES string"""
    data = smiles_to_data(smiles)
    if data is None:
        return None
    # All nodes belong to a single graph (batch index 0)
    batch = torch.zeros(data.num_nodes, dtype=torch.long)
    with torch.no_grad():
        output = model(data.x, data.edge_index, data.edge_attr, batch)
    return output.item()


# Example usage
smiles = "CC(=O)OC1=CC=CC=C1C(=O)O"  # Aspirin
prediction = predict(model, smiles)
print(f"Prediction for {smiles}: {prediction}")
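When scoring many molecules, it helps to separate failed inputs from successful predictions, since `predict` returns `None` for SMILES that RDKit cannot parse and for molecules with no bonds. The helper below is a minimal sketch; `predict_many` is a hypothetical name, and it takes the prediction function as an argument so that it can wrap the `predict` function defined above.

```python
from typing import Callable, Dict, List, Optional, Tuple


def predict_many(
    predict_fn: Callable[[str], Optional[float]],
    smiles_list: List[str],
) -> Tuple[Dict[str, float], List[str]]:
    """Apply `predict_fn` to each SMILES; return (predictions, failed inputs)."""
    predictions: Dict[str, float] = {}
    failed: List[str] = []
    for smiles in smiles_list:
        value = predict_fn(smiles)
        if value is None:
            # Unparseable SMILES or a molecule the featurizer rejects
            failed.append(smiles)
        else:
            predictions[smiles] = value
    return predictions, failed
```

With the model loaded, a call might look like `predict_many(lambda s: predict(model, s), ["CC(=O)O", "not-a-smiles"])`.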
Training Data
The model was trained on the pKa_acidic dataset from the Pyrosage project, which focuses on molecular toxicity and environmental property prediction.
Model Performance
See training logs for detailed performance metrics.
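For evaluating the model on a held-out set of your own, regression quality is commonly summarized by MAE, RMSE, and R². The sketch below computes them in plain Python; the function name `regression_metrics` and the example values are hypothetical and are not results for this model.

```python
import math
from typing import Dict, Sequence


def regression_metrics(y_true: Sequence[float], y_pred: Sequence[float]) -> Dict[str, float]:
    """Return MAE, RMSE, and R^2 for paired true/predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # R^2 is undefined when all true values are identical
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"mae": mae, "rmse": math.sqrt(ss_res / n), "r2": r2}


# Made-up pKa values, for illustration only.
metrics = regression_metrics([3.0, 4.0, 5.0], [3.5, 4.0, 4.5])
print(metrics)
```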
Limitations
- The model is trained on specific chemical datasets and may not generalize to all molecular types
- Performance may vary for molecules significantly different from the training distribution
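- Input must be a valid SMILES string; `smiles_to_data` returns `None` for strings RDKit cannot parse and for molecules with no bonds (such as single atoms)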
Citation
If you use this model, please cite the Pyrosage project:
@misc{pyrosagepka_acidic,
  title={Pyrosage pKa_acidic AttentiveFP Model},
  author={UPCI NTUA},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/upci-ntua/pyrosage-pka_acidic-attentivefp}
}
License
MIT License - see LICENSE file for details.