---
license: apache-2.0
---
Model Card: TERPredictor V1
📌 Model Name
TERPredictor V1 – A linear regression model for predicting the Total Expense Ratio (TER) of mutual fund Regular Plans.

📖 Overview
TERPredictor V1 is a regression model trained to estimate the 'Regular Plan - Total TER (%)' of mutual funds from 10 numeric financial features. It uses plain multiple linear regression and scores near-perfectly on the held-out test set (R² of 0.999999). Accuracy this high usually means the target is almost fully determined by the inputs, so the model is best suited to exploratory analysis and interpreting feature relationships rather than to generalizing to unseen data.

📊 Intended Uses
Expense Ratio Estimation: Estimate TER for new or hypothetical mutual fund structures that resemble the training data (see Limitations).

Outlier Detection: Identify funds with unusually high or low TERs.

Feature Impact Analysis: Understand which components most influence TER.
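
The outlier-detection use above can be sketched as a residual check: compare each fund's reported TER with the model's prediction and flag unusually large gaps. This is a minimal sketch, not the card author's procedure. The helper name `flag_ter_outliers`, the z-score threshold of 2.0, and the sample values are all illustrative assumptions.

```python
from statistics import mean, pstdev


def flag_ter_outliers(actual, predicted, z_threshold=2.0):
    """Return indices of funds whose residual is far from the mean residual.

    Hypothetical helper: the threshold is an assumption, not from this card.
    """
    residuals = [a - p for a, p in zip(actual, predicted)]
    mu = mean(residuals)
    sigma = pstdev(residuals)
    if sigma == 0:
        return []  # all residuals identical, nothing stands out
    return [i for i, r in enumerate(residuals) if abs(r - mu) / sigma > z_threshold]


# Made-up TER values (in %) for six funds; the last one is deliberately off.
actual_ter = [1.05, 2.10, 0.95, 1.80, 1.50, 2.60]
predicted_ter = [1.04, 2.11, 0.96, 1.79, 1.50, 1.90]
print(flag_ter_outliers(actual_ter, predicted_ter))
```

Because the reported test error is so small, even tiny residuals would stand out under a relative threshold like this, so flagged funds are worth a manual look before being treated as genuine anomalies.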

🧠 Model Architecture
| Attribute | Value |
|---|---|
| Model Type | Linear Regression |
| Framework | scikit-learn |
| Input Features | 10 float64 columns |
| Target Variable | Regular Plan - Total TER (%) |
| Identifier Dropped | Scheme Name (object) |

📚 Training Details
Dataset Size: 1,622 samples

Train/Test Split: 1,297 / 325 samples (roughly 80% / 20%)

Missing Values: None

Preprocessing:

Dropped identifier column (Scheme Name)

No normalization applied, since the predictions of an ordinary least squares fit do not depend on feature scale
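
The training setup described above might look roughly like the following. This is a sketch under stated assumptions, not the original training code: the dataset is not published with this card, so the data is synthetic, the column names `feature_1` to `feature_10` are placeholders, the target is built as a plain sum of the features purely for illustration, and `random_state=42` is arbitrary. The one detail taken from the card is the split: with 1,622 rows, `test_size=0.2` in scikit-learn gives exactly 1,297 training and 325 test rows.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

TARGET = "Regular Plan - Total TER (%)"

# Synthetic stand-in for the real dataset (assumption: real data not available).
rng = np.random.default_rng(0)
n_samples = 1622
features = pd.DataFrame(
    rng.uniform(0.0, 0.5, size=(n_samples, 10)),
    columns=[f"feature_{i}" for i in range(1, 11)],  # placeholder names
)
df = features.copy()
df.insert(0, "Scheme Name", [f"Scheme {i}" for i in range(n_samples)])
df[TARGET] = features.sum(axis=1)  # assumed relationship, illustration only

# Preprocessing as listed: drop the identifier, keep the 10 float columns.
X = df.drop(columns=["Scheme Name", TARGET])
y = df[TARGET]

# test_size=0.2 on 1,622 rows yields 1,297 training and 325 test rows.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LinearRegression()
model.fit(X_train, y_train)
print(X_train.shape, X_test.shape)
print(model.score(X_test, y_test))  # R² on the held-out rows
```

When the target is an exact linear function of the inputs, as in this synthetic setup, the held-out R² comes out essentially at 1.0, which is the pattern the reported metrics show.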

📈 Evaluation Metrics
| Metric | Value |
|---|---|
| Mean Squared Error (MSE) | 0.000001 |
| R-squared (R²) | 0.999999 |

⚠️ Note: These metrics suggest potential data leakage or a deterministic relationship between features and target. Use with caution.
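
For readers who want to check how the two metrics are defined, here is a small dependency-free sketch. The function names and the sample values are made up for illustration and are not the card's actual test data.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up TER values (in %); every prediction is off by 0.001.
y_true = [0.50, 1.00, 1.50, 2.00, 2.50]
y_pred = [0.501, 0.999, 1.501, 1.999, 2.501]
mse = mean_squared_error(y_true, y_pred)
r2 = r_squared(y_true, y_pred)
print(f"MSE={mse:.6f}  R2={r2:.6f}")
```

An MSE of 0.000001 corresponds to a root-mean-square error of 0.001 percentage points, which is smaller than the precision at which expense ratios are typically quoted.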

🚀 How to Use
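```python
# Assumes a `terpredictor` package exposing TERModel is installed.
from terpredictor import TERModel

# Replace the placeholder with the actual repository ID.
model = TERModel.load_pretrained("your-huggingface-username/terpredictor-v1")

# One value per input feature (10 float columns in total).
input_data = {
    "feature_1": 0.12,
    "feature_2": 0.03,
    # ... remaining features
}
predicted_ter = model.predict(input_data)  # predicted Regular Plan TER, in %
```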
⚠️ Limitations
Potential Data Leakage: Extremely high R² may indicate the target is directly derived from input features.

Limited Generalization: Not recommended for predicting TER on unseen or structurally different funds.

No Feature Engineering: Model assumes raw features are sufficient.
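
One way to probe the leakage concern above is to test whether the target can be reproduced by a fixed formula over the inputs, for example a plain sum. The sketch below assumes, purely for illustration, that the features are additive TER components. The card does not name its features, so `component_a`, `component_b`, `component_c`, the helper `max_sum_gap`, and all values are hypothetical.

```python
def max_sum_gap(rows, feature_names, target_name):
    """Largest absolute gap between the target and the sum of the listed features."""
    return max(
        abs(row[target_name] - sum(row[name] for name in feature_names))
        for row in rows
    )


# Hypothetical component columns and made-up values (in %).
FEATURES = ["component_a", "component_b", "component_c"]
TARGET = "Regular Plan - Total TER (%)"
rows = [
    {"component_a": 1.20, "component_b": 0.30, "component_c": 0.05, TARGET: 1.55},
    {"component_a": 0.80, "component_b": 0.25, "component_c": 0.10, TARGET: 1.15},
    {"component_a": 1.60, "component_b": 0.40, "component_c": 0.12, TARGET: 2.12},
]

gap = max_sum_gap(rows, FEATURES, TARGET)
# A gap at rounding-error scale suggests the target is an arithmetic
# identity of the inputs rather than something the model had to learn.
print(f"largest gap: {gap:.2e}")
```

If a check like this comes back near zero on the real data, the regression is only recovering a known formula, and the reported R² says little about predictive skill.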

📄 License
Apache License 2.0

👤 Author
Created by [Your Name or Organization]

📚 Recommendations for Open-Sourcing
Include full training code and preprocessing steps

Provide detailed explanation of evaluation metrics

Add cautionary notes about performance anomalies

Consider publishing a cleaned or anonymized version of the dataset