Team Name

Tahoe-MARS

Members

Anjali Rao
Swati Kaushik
Rishi Verma
Niksa Praljak
Ghulam Murtaza

Project

Title

Predicting the effect of drug perturbations on gene expression with a multimodal foundation model

Overview

We developed a deep learning model that predicts gene expression patterns and cellular phenotypic responses (Vision scores) from cancer cell states, small-molecule drug compounds, and dosage levels. The model integrates multimodal representations to forecast how different cell types respond to therapeutic agents across a range of concentrations, enabling efficient in silico drug screening.

Motivation

A significant bottleneck in understanding cell states and advancing drug discovery is the prohibitive cost of experimental drug testing. To address this challenge, we introduce Tahoe-MARS, a deep learning model that integrates multimodal foundation models for cell-state and small-molecule representations to predict gene expression patterns and cellular phenotypic responses, enabling cost-effective in silico candidate screening.

Methods

Our MARS (Multimodal Adaptable Response System) model integrates cell-state and drug representations using a modular architecture. We employ UCE (Universal Cell Embeddings) to encode cancer cell states into latent representations that capture cellular phenotypes. For molecular representation, we use ChemBERT to transform the chemical structures of drug compounds into embeddings intended to preserve pharmacological properties. Dosage is one-hot encoded so that the model can capture concentration-dependent effects. The cell and drug representations pass through an embedding module that aligns their feature spaces, followed by a common backbone network that learns cross-modal interactions. A final adapter component integrates all three inputs (cell state, drug properties, and dosage) to predict Vision scores, which quantify gene program expression patterns in response to drug treatments. This architecture enables end-to-end learning of the relationships between cellular contexts, chemical interventions, and their resulting transcriptional effects.
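The data flow described above can be sketched as a small NumPy forward pass. This is only an illustration under stated assumptions, not the project's implementation: every dimension, the number of dose levels, the single-layer modules, and the point at which the dose vector is concatenated are hypothetical, and random arrays stand in for the pretrained cell and drug embeddings.

```python
import numpy as np

# Hypothetical sizes chosen for illustration only; the project's real
# dimensions, dose levels, and layer counts are not stated in this write-up.
CELL_DIM = 1280    # assumed width of the cell-state embedding
DRUG_DIM = 768     # assumed width of the drug embedding
N_DOSES = 4        # assumed number of discrete dose levels
HIDDEN_DIM = 256   # assumed width of the shared feature space
N_PROGRAMS = 50    # assumed number of gene programs (one Vision score each)

rng = np.random.default_rng(0)


def init_linear(in_dim, out_dim):
    """Random (weight, bias) pair standing in for a trained linear layer."""
    weight = rng.normal(0.0, in_dim ** -0.5, size=(in_dim, out_dim))
    return weight, np.zeros(out_dim)


def linear(x, layer):
    """Apply a (weight, bias) pair to a batch of row vectors."""
    weight, bias = layer
    return x @ weight + bias


def relu(x):
    return np.maximum(x, 0.0)


def one_hot(indices, n_classes):
    """Turn integer dose-level indices into one-hot rows."""
    return np.eye(n_classes)[indices]


params = {
    "cell_proj": init_linear(CELL_DIM, HIDDEN_DIM),
    "drug_proj": init_linear(DRUG_DIM, HIDDEN_DIM),
    "backbone": init_linear(2 * HIDDEN_DIM, HIDDEN_DIM),
    "adapter": init_linear(HIDDEN_DIM + N_DOSES, N_PROGRAMS),
}


def forward(cell_emb, drug_emb, dose_idx):
    """Map (cell embedding, drug embedding, dose index) to gene-program scores."""
    # Embedding module: project both modalities into one shared space.
    cell_h = relu(linear(cell_emb, params["cell_proj"]))
    drug_h = relu(linear(drug_emb, params["drug_proj"]))
    # Common backbone: model cross-modal interactions on the joined features.
    joint = np.concatenate([cell_h, drug_h], axis=1)
    joint = relu(linear(joint, params["backbone"]))
    # Adapter: append the one-hot dose and regress one score per gene program.
    dose = one_hot(dose_idx, N_DOSES)
    return linear(np.concatenate([joint, dose], axis=1), params["adapter"])


if __name__ == "__main__":
    # Random stand-ins for three (cell, drug, dose) examples.
    cell = rng.normal(size=(3, CELL_DIM))
    drug = rng.normal(size=(3, DRUG_DIM))
    dose_levels = np.array([0, 2, 3])
    print(forward(cell, drug, dose_levels).shape)
```

Keeping the projections, backbone, and adapter as separate pieces mirrors the modular design in the text: in a sketch like this, either encoder could be swapped for a different foundation model by changing only its projection layer.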

Results

Our MARS model generalizes well on a challenging test set containing entirely held-out cell states and drug candidates never seen during training. Predicted and ground-truth Vision scores agree closely, with an R² of approximately 0.8, indicating strong predictive performance across diverse cellular contexts and compounds. The model also maintained accuracy across gene sets, with an average R² exceeding 0.5 for individual gene program predictions. These results suggest that our approach captures the relationships between cancer cell states, drug compounds, and their transcriptional effects.
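For readers unfamiliar with the metric, the sketch below shows one way the two kinds of R² mentioned above could be computed: a pooled value over all predictions and an average of per-gene-program values. The write-up does not say exactly how its numbers were aggregated, so the pooling scheme, the `evaluate` helper, and the dictionary layout are assumptions made for illustration.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are identical")
    return 1.0 - ss_res / ss_tot


def evaluate(true_scores, pred_scores):
    """Compute pooled and per-program R^2.

    Both arguments map a gene-program name to a list of scores, one per
    held-out test sample (a hypothetical layout). Returns a tuple of
    (pooled R^2, mean per-program R^2, dict of per-program R^2).
    """
    per_program = {
        name: r_squared(true_scores[name], pred_scores[name])
        for name in true_scores
    }
    # Pooled R^2: treat every (sample, program) pair as one observation.
    all_true = [v for name in true_scores for v in true_scores[name]]
    all_pred = [v for name in true_scores for v in pred_scores[name]]
    pooled = r_squared(all_true, all_pred)
    mean_per_program = sum(per_program.values()) / len(per_program)
    return pooled, mean_per_program, per_program
```

The two summaries can differ noticeably: the pooled value also rewards getting the differences between gene programs right, while the per-program average only measures how well variation within each program is explained.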

Discussion

Our model predicts gene expression responses, as measured by Vision scores, even for entirely held-out drug candidates, supporting our multimodal approach to in silico drug response prediction. This framework establishes a promising direction for integrating foundation models that combine cell-state and small-molecule representations to simulate cellular responses to therapeutic interventions. Future work will focus on hyperparameter tuning, incorporating state-of-the-art foundation models, implementing parameter-efficient fine-tuning strategies, and integrating additional contextual metadata such as organ specificity and genetic mutation profiles. Most excitingly, our approach opens avenues for proposing novel drug candidates not yet experimentally tested, using active learning and Bayesian optimization to identify compounds with targeted expression profiles, and potentially generating entirely new molecular entities optimized for specific cellular responses.

niksapraljak1/tahoe-mars