---
license: apache-2.0
---

# Model Card: TERPredictor V1

## 📌 Model Name

TERPredictor V1 – a linear regression model for predicting the Total Expense Ratio (TER) of mutual fund Regular Plans.

## 📖 Overview

TERPredictor V1 is a regression model trained to estimate the 'Regular Plan - Total TER (%)' of mutual funds from ten financial features. It uses ordinary least-squares linear regression and achieves near-perfect performance on the test set. Because the accuracy is unusually high, the model is best suited to exploratory analysis and interpreting feature relationships, rather than generalization to unseen data.

## 📊 Intended Uses

- **Expense Ratio Estimation:** estimate the TER of new or hypothetical mutual fund structures.
- **Outlier Detection:** flag funds whose reported TER deviates sharply from the model's prediction.
- **Feature Impact Analysis:** understand which components most influence the total TER.

## 🧠 Model Architecture

| Attribute          | Value                        |
|--------------------|------------------------------|
| Model Type         | Linear Regression            |
| Framework          | scikit-learn                 |
| Input Features     | 10 `float64` columns         |
| Target Variable    | Regular Plan - Total TER (%) |
| Identifier Dropped | Scheme Name (`object`)       |

## 📚 Training Details

- **Dataset Size:** 1,622 samples
- **Train/Test Split:** 1,297 / 325 (an 80/20 split)
- **Missing Values:** none
- **Preprocessing:**
  - Dropped the identifier column (`Scheme Name`)
  - No normalization was applied; unregularized least squares is insensitive to feature scaling, so it is not required here

## 📈 Evaluation Metrics

| Metric                   | Value    |
|--------------------------|----------|
| Mean Squared Error (MSE) | 0.000001 |
| R-squared (R²)           | 0.999999 |

> ⚠️ **Note:** Metrics this close to perfect suggest potential data leakage or a deterministic relationship between features and target (for example, the total TER being a simple sum of its component fees). Use with caution.

## 🚀 How to Use

```python
from terpredictor import TERModel  # package shipped alongside this repository

model = TERModel.load_pretrained("your-huggingface-username/terpredictor-v1")

input_data = {
    "feature_1": 0.12,
    "feature_2": 0.03,
    # ... the remaining eight feature columns
}

predicted_ter = model.predict(input_data)
```

## ⚠️ Limitations

- **Potential Data Leakage:** the extremely high R² may indicate that the target is directly derived from the input features.
- **Limited Generalization:** not recommended for predicting the TER of unseen or structurally different funds.
- **No Feature Engineering:** the model assumes the raw features are sufficient.

## 📄 License

Apache 2.0 (matching the `license` field in the metadata above).

## 👤 Author

Created by [Your Name or Organization]

## 📚 Recommendations for Open-Sourcing

- Include the full training code and preprocessing steps (a minimal sketch follows in the appendix below)
- Provide a detailed explanation of the evaluation metrics
- Add cautionary notes about the performance anomalies
- Consider publishing a cleaned or anonymized version of the dataset
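
## 🧪 Appendix: Assumed Training Pipeline

To support the first recommendation above, here is a minimal sketch of the training pipeline as described in this card. It is not the author's original script: the file name `ter_data.csv`, the `random_state`, and the `joblib` serialization are assumptions; the split, target column, and preprocessing follow the Training Details section.

```python
# Minimal training sketch for TERPredictor V1.
# Assumption: the dataset lives in "ter_data.csv" (hypothetical file name) with
# the ten float64 feature columns, the "Scheme Name" identifier, and the target.
import joblib
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

TARGET = "Regular Plan - Total TER (%)"

df = pd.read_csv("ter_data.csv")

# Preprocessing as described in the card: drop the identifier, keep raw floats.
X = df.drop(columns=["Scheme Name", TARGET])
y = df[TARGET]

# An 80/20 split on 1,622 rows reproduces the reported 1,297 / 325 partition
# (scikit-learn rounds the test share up: ceil(1622 * 0.2) = 325).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LinearRegression()
model.fit(X_train, y_train)

# The two evaluation metrics reported in the card.
preds = model.predict(X_test)
print("MSE:", mean_squared_error(y_test, preds))
print("R² :", r2_score(y_test, preds))

# Inspecting coefficients supports the "Feature Impact Analysis" use case;
# coefficients of roughly 1.0 on all fee components would also confirm the
# data-leakage suspicion raised in the Evaluation Metrics note.
for name, coef in zip(X.columns, model.coef_):
    print(f"{name}: {coef:.4f}")

joblib.dump(model, "model.joblib")
```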
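
## 📦 Appendix: Loading Without the `terpredictor` Package

Since the `terpredictor` package in the usage example is specific to this repository, a more portable sketch loads the serialized scikit-learn estimator directly from the Hub. The artifact name `model.joblib` is hypothetical and must match whatever file the repository actually ships.

```python
# Portable loading sketch: fetch the joblib artifact from the Hub and load it.
# "model.joblib" is an assumed file name matching the training sketch above.
import joblib
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="your-huggingface-username/terpredictor-v1",
    filename="model.joblib",
)
model = joblib.load(path)

# `model.predict` expects a 2-D input containing all ten feature columns,
# in the same order as at training time, e.g.:
#   sample = pd.DataFrame([{...all ten features...}])
#   predicted_ter = model.predict(sample)
```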