EOSDIS Graph Neural Network Model Card
Model Overview
- Model Name: EOSDIS-GNN
- Version: 1.0.3
- Type: Heterogeneous Graph Neural Network
- Framework: PyTorch + PyTorch Geometric
- Base Language Model: nasa-impact/nasa-smd-ibm-st-v2
Core Components
- Base Text Encoder: NASA-SMD-IBM Language Model (768-dimensional embeddings)
- Graph Neural Network: Heterogeneous GNN with multiple layers
- Node Types: Dataset, Publication, Instrument, Platform, ScienceKeyword
- Edge Types: Multiple relationship types between nodes
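To make the components above concrete, here is a minimal, dependency-free sketch of a heterogeneous graph and one round of message passing over it. The node IDs, the relation names (`observes`, `cites`), and the 2-dimensional features are hypothetical. The actual model works on 768-dimensional embeddings with PyTorch Geometric layers, and this card does not specify its aggregation scheme, so mean aggregation is an assumption.

```python
from collections import defaultdict

# Toy heterogeneous graph: every node has a type and a small feature vector.
# All IDs, relation names, and values are made up for illustration.
nodes = {
    "ds1": {"type": "Dataset", "x": [1.0, 0.0]},
    "in1": {"type": "Instrument", "x": [0.0, 2.0]},
    "pub1": {"type": "Publication", "x": [4.0, 4.0]},
}

# Edges are grouped by (source type, relation, target type).
edges = {
    ("Instrument", "observes", "Dataset"): [("in1", "ds1")],
    ("Publication", "cites", "Dataset"): [("pub1", "ds1")],
}


def message_passing_step(nodes, edges):
    """Return new features per node: the node's own features plus, for each
    edge type, the mean of the features of its incoming neighbors."""
    dim = len(next(iter(nodes.values()))["x"])
    # incoming[target][edge_type] -> list of neighbor feature vectors
    incoming = defaultdict(lambda: defaultdict(list))
    for edge_type, pairs in edges.items():
        for src, dst in pairs:
            incoming[dst][edge_type].append(nodes[src]["x"])

    updated = {}
    for node_id, node in nodes.items():
        out = list(node["x"])  # self contribution
        for messages in incoming[node_id].values():
            for d in range(dim):
                out[d] += sum(m[d] for m in messages) / len(messages)
        updated[node_id] = out
    return updated


new_features = message_passing_step(nodes, edges)
print(new_features["ds1"])
```

A real layer would also apply learned, per-edge-type weight matrices and a nonlinearity; those are omitted here to keep the data flow visible.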
Technical Specifications
- Input Dimensions: 768 (NASA-SMD-IBM embeddings)
- Hidden Dimensions: Configurable (default: 256)
- Output Dimensions: 768 (aligned with NASA-SMD-IBM space)
- Number of Layers: Configurable (default: 3)
- Activation Function: ReLU
- Dropout: Applied between layers
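The specifications above imply a sequence of per-layer input and output sizes. The sketch below spells that out, assuming the first layer maps input to hidden and the last layer maps hidden to output. `GNNConfig`, `layer_dims`, and the dropout rate of 0.1 are hypothetical and not taken from the model's code.

```python
from dataclasses import dataclass


@dataclass
class GNNConfig:
    # Defaults mirror the values listed in the specifications above.
    input_dim: int = 768
    hidden_dim: int = 256
    output_dim: int = 768
    num_layers: int = 3
    dropout: float = 0.1  # assumed value; the card does not state the rate


def layer_dims(cfg):
    """Return (in_dim, out_dim) for each layer, assuming the first layer maps
    input -> hidden and the last layer maps hidden -> output."""
    if cfg.num_layers < 1:
        raise ValueError("num_layers must be at least 1")
    if cfg.num_layers == 1:
        return [(cfg.input_dim, cfg.output_dim)]
    dims = [(cfg.input_dim, cfg.hidden_dim)]
    dims += [(cfg.hidden_dim, cfg.hidden_dim)] * (cfg.num_layers - 2)
    dims.append((cfg.hidden_dim, cfg.output_dim))
    return dims


plan = layer_dims(GNNConfig())
print(plan)
```

With the defaults, the plan is 768 to 256, 256 to 256, and 256 to 768, which keeps the output in the same space as the text encoder's embeddings.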
Training Details
Training Data
- Source: NASA EOSDIS Knowledge Graph
- Node Types and Counts:
  - Datasets: Earth science datasets from NASA DAACs
  - Publications: Related scientific papers
  - Instruments: Earth observation instruments
  - Platforms: Satellite and other observation platforms
  - Science Keywords: NASA Earth Science taxonomy
Training Process
- Optimization: Adam optimizer
- Loss Function: Contrastive loss for semantic alignment
- Training Strategy:
  - Initial node embedding generation
  - Message passing through graph structure
  - Contrastive learning with NASA-SMD-IBM embeddings
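The card does not state which contrastive loss is used. As one common choice, the sketch below computes an InfoNCE-style loss that pulls each GNN output toward the text embedding of the same node and pushes it away from the text embeddings of the other nodes in the batch. The function names, the temperature of 0.1, and the 2-dimensional toy vectors are assumptions for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(gnn_embs, text_embs, temperature=0.1):
    """Mean InfoNCE loss: gnn_embs[i] should match text_embs[i] and
    mismatch every other text_embs[j] in the batch."""
    total = 0.0
    for i, g in enumerate(gnn_embs):
        logits = [cosine(g, t) / temperature for t in text_embs]
        # log-sum-exp with max subtraction for numerical stability
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denominator - logits[i]
    return total / len(gnn_embs)


text = [[1.0, 0.0], [0.0, 1.0]]
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], text)  # GNN outputs match their text
swapped = info_nce([[0.0, 1.0], [1.0, 0.0]], text)  # GNN outputs match the wrong text
print(aligned < swapped)
```

The loss is close to zero when each GNN embedding points at its own text embedding and grows when it points at another node's, which is the alignment behavior the training strategy describes.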
Intended Use
Designed for: research, data discovery, and semantic search in Earth science
Not intended for: safety‑critical systems or unrelated domains without fine‑tuning
Strengths
Semantic Understanding:
- Strong performance in finding semantically related content
- Effective cross-modal relationships between text and graph structure
Domain Specificity:
- Specialized for Earth science terminology
- Understands relationships between instruments, platforms, and datasets
Multi-modal Integration:
- Combines text-based and graph-based features
- Preserves domain-specific relationships
Limitations
Data Coverage:
- Performance depends on training data coverage
- May have gaps in newer or less documented areas
Computational Requirements:
- Requires significant memory for full graph processing
- Graph operations can be computationally intensive
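A rough back-of-the-envelope estimate shows why full-graph processing is memory hungry. The sketch below counts only the dense float32 feature matrix and a 64-bit integer edge list. The graph size is hypothetical, since this card does not report node or edge counts, and real training needs additional memory for activations, gradients, and optimizer state.

```python
def embedding_memory_bytes(num_nodes, dim=768, bytes_per_value=4):
    """Memory for one dense float32 embedding matrix of shape (num_nodes, dim)."""
    return num_nodes * dim * bytes_per_value


def edge_index_memory_bytes(num_edges, bytes_per_index=8):
    """Memory for an edge list storing (source, target) as 64-bit integers."""
    return num_edges * 2 * bytes_per_index


# Hypothetical graph size, chosen only to show the arithmetic.
num_nodes, num_edges = 1_000_000, 10_000_000
features_gb = embedding_memory_bytes(num_nodes) / 1e9
edges_gb = edge_index_memory_bytes(num_edges) / 1e9
print(f"features: {features_gb:.2f} GB, edge index: {edges_gb:.2f} GB")
```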
Domain Constraints:
- Optimized for Earth science domain
- May not generalize well to other domains
Usage Guide
Installation Requirements
pip install torch torch-geometric transformers huggingface-hub
Basic Usage
from transformers import AutoTokenizer, AutoModel
import torch
# gnn_model is not installed by the pip command above; it must be importable locally
from gnn_model import EOSDIS_GNN

# Load models
tokenizer = AutoTokenizer.from_pretrained("nasa-impact/nasa-smd-ibm-st-v2")
text_model = AutoModel.from_pretrained("nasa-impact/nasa-smd-ibm-st-v2")
gnn_model = EOSDIS_GNN.from_pretrained("your-username/eosdis-gnn")

# Process query
def get_embedding(text):
    """Return the first-token embedding (768 dimensions) for the input text."""
    inputs = tokenizer(text, return_tensors="pt", max_length=512,
                       truncation=True, padding=True)
    with torch.no_grad():  # inference only, so skip gradient tracking
        outputs = text_model(**inputs)
    return outputs.last_hidden_state[:, 0, :]
Semantic Search Example
from semantic_search import SemanticSearch

# Initialize searcher
searcher = SemanticSearch()

# Perform search
results = searcher.search(
    query="atmospheric carbon dioxide measurements",
    top_k=5,
    node_type="Dataset",  # Optional: filter by node type
)
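The internals of `SemanticSearch` are not shown in this card. As a rough illustration of what a top-k search with a node type filter might do, the sketch below ranks indexed nodes by cosine similarity to a query embedding. The `search` function, the index layout, and the toy 2-dimensional embeddings are hypothetical stand-ins, not the repository's implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def search(query_emb, index, top_k=5, node_type=None):
    """Rank indexed nodes by cosine similarity to query_emb.

    index is a list of dicts with keys 'id', 'type', and 'emb'.
    Returns up to top_k (node_id, score) pairs, best first.
    """
    candidates = [n for n in index if node_type is None or n["type"] == node_type]
    scored = [(cosine(query_emb, n["emb"]), n["id"]) for n in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(node_id, score) for score, node_id in scored[:top_k]]


# Made-up index entries; real embeddings would be 768-dimensional.
index = [
    {"id": "dataset_a", "type": "Dataset", "emb": [0.9, 0.1]},
    {"id": "dataset_b", "type": "Dataset", "emb": [0.1, 0.9]},
    {"id": "paper_c", "type": "Publication", "emb": [1.0, 0.0]},
]
hits = search([1.0, 0.0], index, top_k=2, node_type="Dataset")
print([node_id for node_id, _ in hits])
```

Filtering before scoring keeps the ranking restricted to the requested node type, so `paper_c` is excluded here even though it is the closest match overall.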
Evaluation Metrics
Performance
| Metric | Value | Notes |
|---|---|---|
| Top‑5 Accuracy | 87.4% | Fraction of queries for which at least one of the top‑5 retrieved nodes is relevant. |
| Mean Reciprocal Rank (MRR) | 0.73 | Average of 1/rank of the first relevant result; higher is better. |
| Link Prediction ROC‑AUC | 0.91 | Ability to distinguish existing edges from non-existent ones. |
| Node Classification F1 (macro) | 0.84 | Unweighted mean of per-class F1 across node types. |
| Triple Classification Accuracy | 88.6% | Accuracy in classifying valid vs. invalid triples. |
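For readers who want to reproduce the retrieval metrics in the table, the sketch below shows the standard definitions of top-k accuracy and MRR. The rankings and relevance sets are made-up toy data, not the evaluation set behind the reported numbers.

```python
def top_k_accuracy(ranked_lists, relevant_sets, k=5):
    """Fraction of queries with at least one relevant item in the top k."""
    hits = sum(
        1 for ranked, relevant in zip(ranked_lists, relevant_sets)
        if any(item in relevant for item in ranked[:k])
    )
    return hits / len(ranked_lists)


def mean_reciprocal_rank(ranked_lists, relevant_sets):
    """Average of 1 / rank of the first relevant item (0 if none is retrieved)."""
    total = 0.0
    for ranked, relevant in zip(ranked_lists, relevant_sets):
        for rank, item in enumerate(ranked, start=1):
            if item in relevant:
                total += 1.0 / rank
                break
    return total / len(ranked_lists)


# Made-up rankings for three queries.
ranked = [["a", "b", "c"], ["d", "e", "f"], ["g", "h", "i"]]
relevant = [{"a"}, {"e"}, {"z"}]
print(top_k_accuracy(ranked, relevant, k=2))   # 2 of 3 queries have a hit in the top 2
print(mean_reciprocal_rank(ranked, relevant))  # (1 + 1/2 + 0) / 3
```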
Evaluation Notes:
- Dataset: held‑out portion of NASA EOSDIS Knowledge Graph
- Search task: queries derived from publication abstracts
- Link prediction: 80/10/10 train/val/test splits
- Numbers from offline evaluation; may vary on different graph snapshots
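The 80/10/10 split mentioned above can be sketched as a seeded shuffle followed by slicing. The card does not say how the split was actually drawn (for example, whether it was stratified by edge type), so `split_edges`, the seed, and the toy edge list are assumptions.

```python
import random


def split_edges(edges, train=0.8, val=0.1, seed=0):
    """Shuffle edges with a fixed seed and split them into train/val/test lists."""
    shuffled = list(edges)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train)
    n_val = int(len(shuffled) * val)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],  # remainder goes to the test set
    )


# Hypothetical (source, target) pairs standing in for knowledge graph edges.
edges = [(f"dataset_{i}", f"keyword_{i % 7}") for i in range(100)]
train_edges, val_edges, test_edges = split_edges(edges)
print(len(train_edges), len(val_edges), len(test_edges))
```

Computing ROC-AUC for link prediction additionally requires sampled negative (non-existent) edges in each split, which this sketch does not generate.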
Version Control
- Model versions tracked on Hugging Face Hub
- Regular updates for improved performance
Citation
@misc{armin_mehrabian_2025,
author = { Armin Mehrabian },
title = { nasa-eosdis-heterogeneous-gnn (Revision 7e71e62) },
year = 2025,
url = { https://huggingface.co/arminmehrabian/nasa-eosdis-heterogeneous-gnn },
doi = { 10.57967/hf/6071 },
publisher = { Hugging Face }
}
Contact Information
- Maintainer: Armin Mehrabian
- Email: [email protected]
- Organization: NASA