---
license: mit
language: en
library_name: pytorch
tags:
- pytorch
- tabular-classification
- pokemon
- finance
- scikit-learn
- shap
---

# Pokémon TCG Price Predictor

This repository contains a PyTorch model that uses Pokémon card features to identify cards likely to see a significant price increase.


This model is the backend for the **[PokePrice Gradio Demo](https://huggingface.co/spaces/OffWorldTensor/PokePrice)**. 

## Model Description

The model is a simple multi-layer perceptron (MLP) implemented in PyTorch. It takes features of a Pokémon card as input (such as rarity, type, and historical price data) and outputs a single logit. Applying a sigmoid function to this logit gives the probability that the card's price rises by 30% within 6 months.

- **Model type:** Tabular Binary Classification
- **Architecture:** `PricePredictor` (MLP)
- **Framework:** PyTorch
- **Training Data:** A custom dataset derived from the PokemonTCG/pokemon-tcg-data repository, augmented with pricing history.
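
The logit-to-probability step described above can be sketched with the standard library alone. This is a minimal illustration, not code from this repository: the helper names `sigmoid` and `classify` are hypothetical, and the 0.5 decision threshold mirrors the one used in the usage example.

```python
import math


def sigmoid(logit: float) -> float:
    """Numerically stable logistic function mapping a logit to (0, 1)."""
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    # For negative logits, exponentiate the logit itself to avoid overflow.
    z = math.exp(logit)
    return z / (1.0 + z)


def classify(logit: float, threshold: float = 0.5) -> tuple[float, bool]:
    """Return the probability and whether it exceeds the decision threshold."""
    probability = sigmoid(logit)
    return probability, probability > threshold


# Hypothetical logits, chosen only to illustrate both outcomes.
for example_logit in (-2.0, 0.0, 2.0):
    prob, will_rise = classify(example_logit)
    print(f"logit={example_logit:+.1f} -> probability={prob:.4f}, rise={will_rise}")
```

With a threshold of 0.5, the decision is equivalent to checking whether the logit is positive, so the sigmoid is only needed when the probability itself is of interest.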

## How to Use

To use this model, you will need `torch`, `safetensors`, `scikit-learn`, `joblib`, `pandas`, and `huggingface_hub`. You can download the model artifacts directly from the Hub.

First, ensure you have `network.py` (which defines the model class) in your working directory.

```python
import torch
import joblib
import json
import pandas as pd
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

# Make sure you have network.py in the same directory
from network import PricePredictor

REPO_ID = "your-username/pokemon-price-predictor"  # placeholder: replace with this model's repo ID
MODEL_FILENAME = "model.safetensors"
CONFIG_FILENAME = "config.json"
SCALER_FILENAME = "scaler.pkl"

print("Downloading model files from the Hub...")
model_path = hf_hub_download(repo_id=REPO_ID, filename=MODEL_FILENAME)
config_path = hf_hub_download(repo_id=REPO_ID, filename=CONFIG_FILENAME)
scaler_path = hf_hub_download(repo_id=REPO_ID, filename=SCALER_FILENAME)
print("Downloads complete.")

with open(config_path, "r") as f:
    config = json.load(f)

feature_columns = config["feature_columns"]
input_size = config["input_size"]

model = PricePredictor(input_size=input_size)
model.load_state_dict(load_file(model_path))
model.eval()

scaler = joblib.load(scaler_path)

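# Example card: only a few of the expected features are supplied here
data_to_predict = {
    'rawPrice': [10.0], 'gradedPriceTen': [100.0], 'gradedPriceNine': [50.0],
}

input_df = pd.DataFrame(data_to_predict)
# Fill any feature the model expects but the input lacks with 0.0
missing_cols = set(feature_columns) - set(input_df.columns)
for c in missing_cols:
    input_df[c] = 0.0
# Reorder the columns to match the order recorded in config.json
input_df = input_df[feature_columns]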


input_scaled = scaler.transform(input_df.values)
input_tensor = torch.tensor(input_scaled, dtype=torch.float32)

with torch.no_grad():
    logits = model(input_tensor)
    probability = torch.sigmoid(logits).item()

print("\nPrediction for the input card:")
print(f"  - Probability of 30% price rise in 6 months: {probability:.4f}")

if probability > 0.5:
    print("  - Prediction: Price WILL LIKELY rise.")
else:
    print("  - Prediction: Price WILL LIKELY NOT rise.")
```


## Model Explainability

To understand the model's decisions, SHAP (SHapley Additive exPlanations) values were computed.

### Global Feature Importance

This plot shows the average impact of each feature on the magnitude of the model's output. Features at the top are the most influential.

![SHAP global feature importance plot](https://cdn-uploads.huggingface.co/production/uploads/68b20687b24f311b7de2242d/WMXEn5Ond1zo4B6hvsqLN.png)
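
A global importance ranking of this kind is commonly computed as the mean absolute SHAP value per feature. The sketch below assumes that convention; the exact procedure used to produce the plot above is not documented here. The helper `mean_abs_shap` is hypothetical, and the SHAP values are toy numbers made up for illustration, not results from this model.

```python
def mean_abs_shap(shap_values, feature_names):
    """Rank features by mean absolute SHAP value (rows = samples, columns = features)."""
    n_rows = len(shap_values)
    totals = [0.0] * len(feature_names)
    for row in shap_values:
        for j, value in enumerate(row):
            totals[j] += abs(value)
    importance = {name: total / n_rows for name, total in zip(feature_names, totals)}
    # Most influential feature first, matching the top-to-bottom order of a bar plot.
    return sorted(importance.items(), key=lambda item: item[1], reverse=True)


feature_names = ["rawPrice", "gradedPriceTen", "gradedPriceNine"]

# Toy per-sample SHAP values (illustrative only).
toy_shap_values = [
    [0.5, -0.25, 0.0],
    [-1.0, 0.75, 0.25],
]

ranking = mean_abs_shap(toy_shap_values, feature_names)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Taking the absolute value before averaging matters: a feature that pushes some predictions up and others down would otherwise appear unimportant because its contributions cancel out.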

## Limitations and Bias

- The model is trained on historical data and may not predict future trends accurately, especially in a volatile market.
- The definition of "price rise" is fixed at 30% over 6 months. The model is not trained for other thresholds or timeframes.
- The dataset may have inherent biases related to card popularity, set releases, or data collection artifacts.
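
To make the fixed target definition concrete, the following sketch shows how a "30% rise over 6 months" label could be derived from two prices. It is an assumption about the labeling rule, not the code used to build the training set: the function name `price_rose` is hypothetical, and whether the original threshold is inclusive (`>=`) or strict (`>`) is not stated.

```python
def price_rose(start_price: float, end_price: float, threshold: float = 0.30) -> bool:
    """Return True if the relative price change meets the threshold.

    Assumes end_price is observed 6 months after start_price and that the
    comparison is inclusive; both are assumptions for illustration.
    """
    if start_price <= 0:
        raise ValueError("start_price must be positive to compute a relative change")
    relative_change = (end_price - start_price) / start_price
    return relative_change >= threshold


# Hypothetical prices in the same currency unit.
print(price_rose(10.0, 14.0))  # 40% increase
print(price_rose(10.0, 12.5))  # 25% increase
```

Changing `threshold` here only changes the label; the published model was trained against the 30% definition, so its probabilities should not be read as predictions for other thresholds or timeframes.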

## Author

Callum Anderson