---
license: mit
language:
- en
pipeline_tag: tabular-classification
tags:
- sklearn
- classification
- iris
- tabular
datasets:
- brjapon/iris
metrics:
- accuracy
library_name: scikit-learn
new_version: "v1.0"
model-index:
- name: Iris Decision Tree
  results:
  - task:
      type: tabular-classification
      name: Classification
    metrics:
    - type: accuracy
      value: 0.97
      name: Test Accuracy
---

# Iris Classification Models

This repository starts with a **Decision Tree** model trained on the classic **Iris dataset**. The model classifies iris flowers into three species (*setosa*, *versicolor*, or *virginica*) based on four numeric features: sepal length, sepal width, petal length, and petal width.

Because of its small size and simplicity, this model is intended primarily for **demonstration and educational** purposes.

## Model Description

- **Framework**: [Scikit-Learn](https://scikit-learn.org/stable/)
- **Algorithm**: Decision Tree (`DecisionTreeClassifier` class)
- **Hyperparameters**:
  - Defaults for Decision Tree in Scikit-Learn

### Intended Uses

- **Education/Proof-of-Concept**: Demonstrates loading a scikit-learn model from the Hugging Face Hub.
- **Beginner ML Tutorials**: Introduction to classification tasks, usage of Hugging Face model hosting, and deploying simple demos in Spaces.

### Limitations

- **Dataset Size**: The Iris dataset is small (150 samples). Performance metrics may not extrapolate to real-world scenarios.
- **Domain Constraints**: The dataset only covers three iris species and may not generalize to other types of flowers.
- **Not Production-Ready**: This model is not suited for critical applications (e.g., healthcare, autonomous vehicles).
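
### Default Hyperparameters (Sketch)

The card states only that the tree uses scikit-learn's defaults. As a quick way to see what that means in practice, the sketch below lists the defaults of `DecisionTreeClassifier` in whatever scikit-learn version is installed locally. The version used to train this model is not stated, so the printed values are an approximation of the training configuration, not a record of it.

```python
from sklearn.tree import DecisionTreeClassifier

# Instantiate with no arguments so every hyperparameter keeps its default.
# Assumption: the locally installed scikit-learn has the same defaults as
# the (unstated) version used to train the published model.
default_params = DecisionTreeClassifier().get_params()

for name, value in sorted(default_params.items()):
    print(f"{name} = {value!r}")
```

With defaults such as `max_depth=None`, the tree grows until its leaves are pure, which is workable for a 150-sample dataset but worth revisiting on larger or noisier data.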

## How to Use

To use the model, download the `.joblib` file from the Hub and load it in Python:

```python
import joblib
from huggingface_hub import hf_hub_download

# Accompanying dataset is hosted in Hugging Face under 'Jesus02/iris-clase'
# Download the serialized model from the Hub (cached locally after the first call)
model_path = hf_hub_download(repo_id="brjapon/iris", filename="iris_dt.joblib", repo_type="model")
model = joblib.load(model_path)

# Example prediction (illustrative values)
# Assumed feature order: sepal length, sepal width, petal length, petal width
sample_input = [[5.1, 3.5, 1.4, 0.2]]
prediction = model.predict(sample_input)
print(prediction)  # e.g., [0] which might correspond to 'setosa'
```

## Training Procedure

- **Training Data**: 80% of the 150-sample Iris dataset (120 samples).
- **Test Data**: The remaining 20% (30 samples).
- **Steps**:
  1. Loaded the dataset (obtained from the HF repository `brjapon/iris`)
  2. Split it into training and test sets with `train_test_split`
  3. Trained the Decision Tree model with default settings
  4. Evaluated accuracy on the test set

## Performance

Using a random 80/20 split, the model typically achieves **~97%** accuracy on the test subset. Because the test subset holds only 30 samples, each misclassified flower shifts accuracy by about 3.3 percentage points, so results vary with the `random_state` used for the train/test split.

## Limitations & Bias

- The Iris dataset is not representative of modern, large-scale classification tasks.
- Results should not be generalized beyond the included species and scenario.
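
## Reproducing the Training (Sketch)

The steps listed under Training Procedure can be approximated with the minimal sketch below. It is not the author's original script. It assumes that scikit-learn's bundled copy of Iris (`load_iris`) is an acceptable stand-in for the Hub dataset, and the helper name `train_and_evaluate` and the seeds are hypothetical choices made for illustration. The tree is given a `random_state` only to make each run repeatable; all other settings stay at their defaults.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def train_and_evaluate(seed):
    """Train a default decision tree on an 80/20 split; return (model, test accuracy)."""
    # Assumption: scikit-learn's bundled Iris data stands in for the Hub dataset.
    X, y = load_iris(return_X_y=True)

    # 80/20 split: 120 training samples and 30 test samples.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )

    # Default hyperparameters; random_state is set only for repeatability.
    model = DecisionTreeClassifier(random_state=seed)
    model.fit(X_train, y_train)

    accuracy = accuracy_score(y_test, model.predict(X_test))
    return model, accuracy


# Repeat with a few seeds to see how much the score depends on the split.
for seed in (0, 1, 2):
    _, accuracy = train_and_evaluate(seed)
    print(f"seed={seed}: test accuracy = {accuracy:.3f}")
```

Running the loop with several seeds gives a rough sense of the spread around the reported accuracy, which is more informative than a single split on a dataset this small.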