Na-Rajan / MutualFundTERModel

	---
	license: apache-2.0
	---
	Model Card: TERPredictor V1
	📌 Model Name
	TERPredictor V1 – A linear regression model for predicting the Total Expense Ratio (TER) of mutual fund Regular Plans.

	📖 Overview
TERPredictor V1 is a linear regression model, built with scikit-learn, that estimates the 'Regular Plan - Total TER (%)' of a mutual fund from 10 numeric financial features. On the held-out test set it reaches an R² of 0.999999 and an MSE of 0.000001. Accuracy this high usually means the target can be computed almost exactly from the inputs, so the model is best suited to exploratory analysis and to interpreting feature relationships rather than to generalizing to unseen data.

	📊 Intended Uses
Expense Ratio Estimation: Estimate TER for new or hypothetical mutual fund structures that resemble the training data (see Limitations).

	Outlier Detection: Identify funds with unusually high or low TERs.

	Feature Impact Analysis: Understand which components most influence TER.
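
One way to read the outlier-detection use case is to compare each fund's reported TER with the model's prediction and flag large gaps. The sketch below assumes that reading; the function name `flag_outliers`, the `tolerance` of 0.05 percentage points, and all fund names and values are hypothetical and not taken from the training data.

```python
def flag_outliers(names, actual_ter, predicted_ter, tolerance=0.05):
    """Map each flagged fund to "high" or "low" based on its residual.

    A fund is flagged when |actual - predicted| exceeds `tolerance`
    (expressed in percentage points of TER).
    """
    flagged = {}
    for name, actual, predicted in zip(names, actual_ter, predicted_ter):
        residual = actual - predicted
        if abs(residual) > tolerance:
            # Positive residual: the fund charges more than the model expects.
            flagged[name] = "high" if residual > 0 else "low"
    return flagged


# Made-up values for illustration; these are not real funds or real TERs.
names = ["Fund A", "Fund B", "Fund C", "Fund D", "Fund E"]
actual = [1.80, 2.10, 0.60, 1.50, 2.90]
predicted = [1.79, 2.11, 0.96, 1.49, 2.10]

print(flag_outliers(names, actual, predicted))
```

With these made-up numbers, only the two funds whose reported TER differs from the prediction by more than the tolerance are flagged. A sensible tolerance depends on the model's typical error, so it would need tuning against the real residuals.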

🧠 Model Architecture

| Attribute | Value |
|---|---|
| Model Type | Linear Regression |
| Framework | scikit-learn |
| Input Features | 10 float64 columns |
| Target Variable | Regular Plan - Total TER (%) |
| Identifier Dropped | Scheme Name (object) |

	📚 Training Details
	Dataset Size: 1,622 samples

Train/Test Split: 1,297 / 325 samples (approximately 80/20)

	Missing Values: None

	Preprocessing:

	Dropped identifier column (Scheme Name)

No normalization applied (feature scaling does not change the predictions of an ordinary least-squares fit)
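
The training steps above can be sketched roughly as follows. This is not the author's training script: the real dataset is not distributed with this card, so the code builds a synthetic stand-in with the same shape (1,622 rows, 10 float features, a `Scheme Name` identifier). The feature names `feature_1` to `feature_10`, the `random_state`, and the way the synthetic target is generated are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

TARGET = "Regular Plan - Total TER (%)"

# Synthetic stand-in for the real dataset; feature names are placeholders.
rng = np.random.default_rng(0)
n_rows = 1622
features = [f"feature_{i}" for i in range(1, 11)]
df = pd.DataFrame(rng.uniform(0.0, 0.5, size=(n_rows, 10)), columns=features)
df["Scheme Name"] = [f"Scheme {i}" for i in range(n_rows)]
# Assumption for illustration only: the target is a noisy sum of the features.
df[TARGET] = df[features].sum(axis=1) + rng.normal(0.0, 0.001, size=n_rows)

# Drop the identifier column and separate the features from the target.
X = df.drop(columns=["Scheme Name", TARGET])
y = df[TARGET]

# An 80/20 split of 1,622 rows gives 1,297 training rows and 325 test rows.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LinearRegression().fit(X_train, y_train)

# Coefficients show how strongly each feature moves the predicted TER.
coefficients = pd.Series(model.coef_, index=X.columns)
coefficients = coefficients.sort_values(key=lambda s: s.abs(), ascending=False)

print(len(X_train), len(X_test))
print(coefficients.round(3))
```

Because the synthetic target is constructed from the features, the fit here is near-perfect by design; it illustrates the workflow and does not reproduce the reported metrics. The sorted coefficients are one simple way to approach the feature impact analysis listed under Intended Uses.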

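📈 Evaluation Metrics

| Metric | Value (test set) |
|---|---|
| Mean Squared Error (MSE) | 0.000001 |
| R-squared (R²) | 0.999999 |

⚠️ Note: These metrics suggest potential data leakage or a deterministic relationship between features and target. Treat them as a warning sign rather than as evidence of predictive skill, and use the model with caution.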
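
For readers unfamiliar with the two metrics, the sketch below computes them from their standard definitions: MSE is the average squared prediction error, and R² is one minus the ratio of the residual sum of squares to the total sum of squares around the mean. The function names and sample values are illustrative and unrelated to the model's actual predictions.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between actual and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """1 - SS_res / SS_tot, where SS_tot is measured around the mean of y_true."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Illustrative values only.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]

mse_value = mean_squared_error(y_true, y_pred)
r2_value = r_squared(y_true, y_pred)
print(round(mse_value, 6), round(r2_value, 6))
```

An R² of 0.999999 means the residual sum of squares is about one millionth of the total variance in the target, which is why the card treats the result as suspicious.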

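🚀 How to Use
```python
from terpredictor import TERModel

# Load the pretrained model; replace the placeholder with the real repository ID.
model = TERModel.load_pretrained("your-huggingface-username/terpredictor-v1")

# One value per input feature; the model expects 10 float features in total.
input_data = {
    "feature_1": 0.12,
    "feature_2": 0.03,
    # ... remaining features
}

# Predicted 'Regular Plan - Total TER (%)' for the given inputs.
predicted_ter = model.predict(input_data)
```
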
	⚠️ Limitations
	Potential Data Leakage: Extremely high R² may indicate the target is directly derived from input features.

	Limited Generalization: Not recommended for predicting TER on unseen or structurally different funds.

	No Feature Engineering: Model assumes raw features are sufficient.
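
A quick way to test the leakage concern is to check whether the target can be reproduced by a simple formula over the inputs, for example a plain sum of component columns. The sketch below assumes that particular form of leakage; the column names (`component_1`, `total_ter`, and so on) and all values are hypothetical, and the real dataset may encode the relationship differently.

```python
def max_sum_gap(rows, component_columns, target_column):
    """Largest absolute gap between the target and the plain sum of the components."""
    return max(
        abs(row[target_column] - sum(row[col] for col in component_columns))
        for row in rows
    )


# Hypothetical rows; column names and values are invented for illustration.
rows = [
    {"component_1": 1.50, "component_2": 0.05, "component_3": 0.27, "total_ter": 1.82},
    {"component_1": 0.90, "component_2": 0.00, "component_3": 0.16, "total_ter": 1.06},
    {"component_1": 2.00, "component_2": 0.30, "component_3": 0.36, "total_ter": 2.66},
]
components = ["component_1", "component_2", "component_3"]

gap = max_sum_gap(rows, components, "total_ter")
# A gap at rounding-error level means the target is just the sum of its inputs.
looks_derived = gap < 1e-6
print(looks_derived)
```

If such a check succeeds on the real data, the regression is recovering an accounting identity rather than learning a predictive relationship, which would explain the reported metrics.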

	📄 License
Apache License 2.0

	👤 Author
	Created by [Your Name or Organization]

	📚 Recommendations for Open-Sourcing
	Include full training code and preprocessing steps

	Provide detailed explanation of evaluation metrics

	Add cautionary notes about performance anomalies

	Consider publishing a cleaned or anonymized version of the dataset