HCAT-FusionNet: Multimodal Preprocessing and Fusion for Survival and Recurrence Prediction

HCAT-FusionNet uses variational autoencoders and cross-modal attention for holistic healthcare outcome prediction.

This repository contains the preprocessing pipelines and training framework for the HANCOCK (Head and Neck Cancer Cohort) dataset, which is used in the Hancothon25 Challenge at MICCAI 2025. The challenge focuses on predicting 5-year survival and 2-year recurrence from multimodal patient data (clinical, pathological, semantic text, spatial histopathology, and temporal blood tests).

Our solution introduces preprocessing, imputation, and fusion strategies that extract robust 512-dimensional embeddings from the heterogeneous modalities, followed by joint multimodal training.


Features

  • Clinical Data Preprocessing: Advanced imputation ensemble + VAE-based handling of missing data.
  • Pathological Data Preprocessing: Probabilistic imputation with graph smoothing and 512-d embeddings.
  • Semantic Text Processing: ClinicalBERT / TF-IDF + SVD pipelines for histories, reports, and surgery descriptions.
  • Spatial Histopathology Aggregation: Transformer-based aggregation of patch-level features with spatial awareness.
  • Temporal Blood Data: Physiology-aware normalization, KNN refinement, and LSTM encoder for sequential signals.
  • Fusion Training: Multi-modal VAE with attention-based cross-modal imputation, joint latent space learning, and uncertainty quantification.
  • Evaluation: Binary classification of survival and recurrence, reporting accuracy and F1-score.
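The KNN refinement mentioned for the temporal blood data can be illustrated with a small, dependency-free sketch. This is an assumption about the general technique, not the repository's implementation: the function name `knn_impute`, the use of `None` for missing values, and the distance definition are all hypothetical choices made for the example.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the mean of that feature over the k nearest rows.

    Hypothetical helper: distance is the root-mean-square difference over
    the features that both rows have observed.
    """
    imputed = [list(r) for r in rows]  # copy so the input is left untouched
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(rows):
                # A neighbour must be a different row that observed feature j.
                if m == i or other[j] is None:
                    continue
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue
                dist = math.sqrt(
                    sum((a - b) ** 2 for a, b in shared) / len(shared)
                )
                candidates.append((dist, other[j]))
            if candidates:
                candidates.sort(key=lambda c: c[0])
                nearest = [v for _, v in candidates[:k]]
                imputed[i][j] = sum(nearest) / len(nearest)
    return imputed


# Toy blood-test table: three patients are similar, one is an outlier.
blood = [
    [1.0, 2.0, None],
    [1.1, 2.1, 3.0],
    [0.9, 1.9, 5.0],
    [10.0, 10.0, 100.0],
]
filled = knn_impute(blood, k=2)
```

Here the missing value in the first row is filled from the two closest rows, so the outlier does not contribute. A real pipeline would typically normalize features before computing distances.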

Documentation

Detailed explanations of each pipeline are available here:


Results

From the enhanced training pipeline (train2.py), the system achieved:

  • 5-year Survival F1-score: 0.80
  • 2-year Recurrence F1-score: 0.95
  • Average F1-score: 0.875

(See enhanced_hcat_training_summary.json for full training logs and config.)
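For reference, the F1-score for a binary task is the harmonic mean of precision and recall, and the average reported above is the plain mean of the two task scores. The sketch below shows that arithmetic; the function name `binary_f1` and the toy labels are illustrative assumptions and are not taken from train2.py.

```python
def binary_f1(y_true, y_pred):
    """F1 for the positive class (label 1): 2*TP / (2*TP + FP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    # By convention, return 0.0 when there are no positives at all.
    return 2 * tp / denom if denom else 0.0


def average_f1(scores):
    """Unweighted mean of per-task F1-scores."""
    return sum(scores) / len(scores)


# Toy labels, made up for the example only.
toy_f1 = binary_f1([1, 1, 1, 0, 0], [1, 1, 0, 1, 0])

# The two task scores reported above.
reported_average = average_f1([0.80, 0.95])
```

With the reported scores of 0.80 and 0.95, the unweighted mean is 0.875, which matches the average listed above.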


Methods Summary

  • Imputation: Multi-modal VAE, KNN, PCA, and graph smoothing.
  • Embeddings: Standardized 512-d representations across modalities.
  • Fusion: Attention-based cross-modal integration with uncertainty weighting.
  • Classification: Binary prediction of survival and recurrence with robust evaluation.
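To make the idea of uncertainty weighting under missing modalities concrete, the sketch below fuses per-modality embeddings with inverse-variance weights and skips any modality that is absent. This is a deliberately simplified stand-in, not the attention-based fusion used in this repository: the function name `fuse_embeddings`, the scalar per-modality variances, and the 2-dimensional toy embeddings (instead of 512-d) are assumptions made for illustration.

```python
def fuse_embeddings(embeddings, variances):
    """Inverse-variance weighted average of the available modality embeddings.

    embeddings: dict mapping modality name -> list of floats, or None if missing
    variances:  dict mapping modality name -> positive uncertainty estimate
    """
    available = {name: emb for name, emb in embeddings.items() if emb is not None}
    if not available:
        raise ValueError("at least one modality must be present")

    # Lower variance (more certain) means a larger weight.
    raw = {name: 1.0 / variances[name] for name in available}
    total = sum(raw.values())

    dim = len(next(iter(available.values())))
    fused = [0.0] * dim
    for name, emb in available.items():
        weight = raw[name] / total  # weights over present modalities sum to 1
        for i, value in enumerate(emb):
            fused[i] += weight * value
    return fused


# Toy example: the text modality is missing for this patient.
patient = {"clinical": [1.0, 0.0], "blood": [0.0, 1.0], "text": None}
uncertainty = {"clinical": 1.0, "blood": 3.0, "text": 0.5}
fused = fuse_embeddings(patient, uncertainty)
```

In this toy case the clinical and blood embeddings receive weights 0.75 and 0.25, and the missing text modality is ignored rather than imputed. An attention-based model would instead learn the weights from the data.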

Challenge Context

This work addresses the HANCOCK multimodal dataset provided for Hancothon25 (MICCAI 2025). The dataset includes 763 patients with the following modalities:

  • Clinical structured data
  • Pathology structured data
  • Histopathology WSIs & TMAs
  • Tabular blood test data
  • Free-text clinical/surgery reports

Our framework is designed for precision oncology, enabling predictive modeling for treatment planning and follow-up.


Performance & Insights

  • The system demonstrates strong generalization across modalities.
  • Temporal and pathological modalities improved recurrence prediction.
  • Clinical and semantic features boosted survival classification.
  • Fusion strategies with uncertainty modeling ensured robustness under missing modalities.

Citation

If you use this repository, please cite:

@inproceedings{hcat_fusionnet_2025,
  title={HCAT-FusionNet: Multimodal Preprocessing and Fusion for Survival and Recurrence Prediction},
  author={Ragunath R and Sanjay S and Harish G},
  booktitle={MICCAI Hancothon25 Challenge},
  year={2025}
}

Contact

For questions or collaborations:
