---
license: apache-2.0
---
Model Card: TERPredictor V1
📌 Model Name
TERPredictor V1 – A linear regression model for predicting the Total Expense Ratio (TER) of mutual fund Regular Plans.
📖 Overview
TERPredictor V1 is a linear regression model that estimates the 'Regular Plan - Total TER (%)' of a mutual fund from 10 numeric features. It fits the held-out test set almost perfectly (R² = 0.999999), which more likely reflects a near-deterministic relationship between the features and the target than genuine predictive skill. The model is therefore best suited to exploratory analysis and to interpreting feature relationships, not to generalizing to unseen data.
📊 Intended Uses
Expense Ratio Estimation: Estimate TER for hypothetical mutual fund structures as an exploratory, what-if exercise; see Limitations before relying on predictions for new funds.
Outlier Detection: Identify funds with unusually high or low TERs.
Feature Impact Analysis: Understand which components most influence TER.
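As a rough illustration of the outlier-detection use, one option is to compare each fund's reported TER with the model's prediction and flag unusually large residuals. The sketch below assumes the reported and predicted values are already available as plain lists. The function name `flag_ter_outliers`, the sample numbers, and the threshold of 2 standard deviations are illustrative assumptions, not part of the released model.

```python
from statistics import mean, pstdev


def flag_ter_outliers(reported, predicted, z_threshold=2.0):
    """Return indices of funds whose residual (reported - predicted) is unusually large.

    Hypothetical helper: a residual is flagged when it lies more than
    `z_threshold` population standard deviations from the mean residual.
    """
    residuals = [r - p for r, p in zip(reported, predicted)]
    mu = mean(residuals)
    sigma = pstdev(residuals)
    if sigma == 0:
        # All residuals are identical, so nothing stands out.
        return []
    return [
        i for i, res in enumerate(residuals)
        if abs(res - mu) / sigma > z_threshold
    ]


# Made-up TER values (in %), for illustration only.
reported = [1.05, 2.10, 0.95, 1.80, 2.90, 1.20]
predicted = [1.04, 2.11, 0.96, 1.79, 2.10, 1.21]

outliers = flag_ter_outliers(reported, predicted)
```

Here only the fifth fund (index 4) is flagged, because its reported TER is far above the prediction. With a model that fits this closely, any flagged fund probably deserves a check of the underlying data before it is treated as a genuine anomaly.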
🧠 Model Architecture
| Attribute | Value |
| --- | --- |
| Model Type | Linear Regression |
| Framework | scikit-learn |
| Input Features | 10 float64 columns |
| Target Variable | Regular Plan - Total TER (%) |
| Identifier Dropped | Scheme Name (object) |
📚 Training Details
Dataset Size: 1,622 samples
Train/Test Split: 1,297 / 325 samples (roughly 80/20)
Missing Values: None
Preprocessing:
Dropped identifier column (Scheme Name)
No normalization applied: predictions from an ordinary least squares fit are unaffected by feature scaling
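The training setup described above can be sketched with scikit-learn as follows. This is a minimal illustration, not the author's training script: the data is synthetic, the feature names are placeholders, and `test_size=0.2` and `random_state=42` are assumptions (the card only reports the resulting split sizes). The identifier column (Scheme Name) is assumed to have been dropped already, leaving a purely numeric feature matrix.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for the real dataset: 1,622 funds, 10 float64 features.
n_samples, n_features = 1622, 10
feature_names = [f"feature_{i}" for i in range(1, n_features + 1)]  # placeholders
X = rng.uniform(0.0, 0.5, size=(n_samples, n_features))

# Assumed target for illustration only: a noisy linear mix of the features.
true_coef = rng.uniform(0.0, 1.0, size=n_features)
y = X @ true_coef + rng.normal(0.0, 0.01, size=n_samples)

# With 1,622 rows, test_size=0.2 yields 1,297 training and 325 test samples.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Plain ordinary least squares; no scaling step is applied.
model = LinearRegression()
model.fit(X_train, y_train)

# R² on the held-out split, plus per-feature coefficients for inspection.
test_r2 = model.score(X_test, y_test)
coef_by_feature = dict(zip(feature_names, model.coef_))
```

Inspecting `coef_by_feature` is one simple way to approach feature impact analysis, since each coefficient is the change in predicted TER per unit change in that feature.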
📈 Evaluation Metrics
| Metric | Value |
| --- | --- |
| Mean Squared Error (MSE) | 0.000001 |
| R-squared (R²) | 0.999999 |
⚠️ Note: These metrics suggest potential data leakage or a deterministic relationship between features and target. Use with caution.
🚀 How to Use
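```python
from terpredictor import TERModel

# Load the pretrained model; replace the placeholder with the actual repo ID.
model = TERModel.load_pretrained("your-huggingface-username/terpredictor-v1")

# Map each of the 10 input feature names to a float value.
# "feature_1" and "feature_2" are placeholder names.
input_data = {
    "feature_1": 0.12,
    "feature_2": 0.03,
    # ... remaining features
}

# Predicted 'Regular Plan - Total TER (%)'
predicted_ter = model.predict(input_data)
```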
⚠️ Limitations
Potential Data Leakage: Extremely high RΒ² may indicate the target is directly derived from input features.
Limited Generalization: Not recommended for predicting TER on unseen or structurally different funds.
No Feature Engineering: Model assumes raw features are sufficient.
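One way to probe the leakage concern is to test whether the target is simply the sum of some input columns. The sketch below is only an assumption about what such a check could look like: the card does not list the actual features, so the three component columns, the sample values, and the helper name `looks_like_component_sum` are all hypothetical.

```python
def looks_like_component_sum(feature_rows, targets, tol=1e-6):
    """Return True if every target equals the sum of its feature row within `tol`.

    Hypothetical check: a True result means the target is a deterministic
    function of the inputs, so a near-perfect R² would be expected.
    """
    max_gap = max(abs(sum(row) - t) for row, t in zip(feature_rows, targets))
    return max_gap <= tol


# Made-up rows of three hypothetical TER components (in %).
component_rows = [
    [1.50, 0.05, 0.28],
    [0.90, 0.00, 0.16],
    [2.00, 0.05, 0.37],
]

derived_targets = [1.83, 1.06, 2.42]      # each equals the sum of its row
independent_targets = [1.70, 1.20, 2.10]  # not a simple sum of the row

is_derived = looks_like_component_sum(component_rows, derived_targets)
is_independent = looks_like_component_sum(component_rows, independent_targets)
```

If a check like this passes on the real data, the regression is reproducing an accounting identity rather than learning a predictive relationship.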
📄 License
Apache License 2.0
👀 Author
Created by [Your Name or Organization]
📚 Recommendations for Open-Sourcing
Include full training code and preprocessing steps
Provide detailed explanation of evaluation metrics
Add cautionary notes about performance anomalies
Consider publishing a cleaned or anonymized version of the dataset