# Model Card: TERPredictor V1

## Model Name
TERPredictor V1: a linear regression model for predicting the Total Expense Ratio (TER) of mutual fund Regular Plans.
## Overview
TERPredictor V1 is a regression model trained to estimate the 'Regular Plan - Total TER (%)' of mutual funds based on various financial features. It uses a simple linear regression approach and achieves near-perfect performance on the test set. Due to the unusually high accuracy, this model is best suited for exploratory analysis and feature relationship interpretation, rather than generalization to unseen data.
## Intended Uses
- **Expense Ratio Estimation:** Estimate TER for new or hypothetical mutual fund structures.
- **Outlier Detection:** Identify funds with unusually high or low TERs.
- **Feature Impact Analysis:** Understand which components most influence TER (see the coefficient sketch after this list).
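For the feature-impact use case, the coefficients of a fitted linear model give per-feature effect sizes directly. A minimal sketch on synthetic stand-in data, since this card does not list the real column names (the names and the generating relationship below are illustrative):

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic stand-ins for fund expense components (hypothetical names).
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.random((200, 3)), columns=["base_ter", "gst", "addl_expense"])
y = X["base_ter"] + 0.18 * X["gst"] + X["addl_expense"]  # illustrative TER relationship

model = LinearRegression().fit(X, y)

# Coefficients sorted by magnitude show which components move TER the most.
impact = pd.Series(model.coef_, index=X.columns).sort_values(key=abs, ascending=False)
print(impact)
```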
## Model Architecture

| Attribute | Value |
|---|---|
| Model Type | Linear Regression |
| Framework | scikit-learn |
| Input Features | 10 `float64` columns |
| Target Variable | Regular Plan - Total TER (%) |
| Identifier Dropped | Scheme Name (`object`) |

## Training Details
- Dataset Size: 1,622 samples
- Train/Test Split: 1,297 / 325 (80/20)
- Missing Values: None
- Preprocessing:
  - Dropped the identifier column (`Scheme Name`)
  - No normalization applied; unregularized linear regression does not require feature scaling (see the training sketch below)
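A minimal sketch of the training procedure described above. The CSV path is an assumption and `random_state` is arbitrary; an 80/20 split of 1,622 samples reproduces the 1,297 / 325 counts:

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Hypothetical filename; the dataset is not distributed with this card.
df = pd.read_csv("mutual_fund_ter.csv")

# Drop the identifier, keep the 10 float64 features, predict the TER column.
X = df.drop(columns=["Scheme Name", "Regular Plan - Total TER (%)"])
y = df["Regular Plan - Total TER (%)"]

# 80/20 split of 1,622 samples -> 1,297 train / 325 test.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LinearRegression().fit(X_train, y_train)
```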
## Evaluation Metrics

| Metric | Value |
|---|---|
| Mean Squared Error (MSE) | 0.000001 |
| R-squared (R²) | 0.999999 |

⚠️ Note: These metrics suggest potential data leakage or a deterministic relationship between the features and the target. Use with caution.
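These metrics correspond to scikit-learn's standard `mean_squared_error` and `r2_score`; a sketch continuing the training example above (reusing `model`, `X_test`, and `y_test` from it):

```python
from sklearn.metrics import mean_squared_error, r2_score

# `model`, `X_test`, `y_test` come from the training sketch above.
y_pred = model.predict(X_test)
print(f"MSE: {mean_squared_error(y_test, y_pred):.6f}")
print(f"R2:  {r2_score(y_test, y_pred):.6f}")
```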
## How to Use

```python
from terpredictor import TERModel

model = TERModel.load_pretrained("your-huggingface-username/terpredictor-v1")
input_data = {
    "feature_1": 0.12,
    "feature_2": 0.03,
    # ... remaining features
}
predicted_ter = model.predict(input_data)
```

## Limitations
- **Potential Data Leakage:** The extremely high R² may indicate that the target is directly derived from the input features (see the check after this list).
- **Limited Generalization:** Not recommended for predicting TER on unseen or structurally different funds.
- **No Feature Engineering:** The model assumes the raw features are sufficient.
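One cheap probe of the leakage hypothesis: fit on a deliberately tiny subset and score on the full test set. If R² stays near 1, the target is effectively a deterministic (likely linear) function of the inputs, e.g. a TER column that is the sum of its component fee columns. A sketch reusing the variables from the training example:

```python
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Fit on only 50 training rows; near-perfect held-out R2 means the
# feature-to-target mapping is deterministic, not merely well-estimated.
tiny_model = LinearRegression().fit(X_train[:50], y_train[:50])
print(r2_score(y_test, tiny_model.predict(X_test)))
```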
## License
MIT License

## Author
Created by [Your Name or Organization]
## Recommendations for Open-Sourcing
- Include the full training code and preprocessing steps
- Provide a detailed explanation of the evaluation metrics
- Add cautionary notes about the performance anomalies
- Consider publishing a cleaned or anonymized version of the dataset