Model Card for Infinitode/PSPM-OPEN-ARC

Repository: https://github.com/Infinitode/OPEN-ARC/

Model Description

OPEN-ARC-PSP is a straightforward XGBClassifier model developed as part of Infinitode's OPEN-ARC initiative. It is designed to identify plants experiencing high stress caused by external factors.

Architecture:

XGBClassifier: n_estimators=100, learning_rate=0.1, max_depth=6, subsample=0.8, colsample_bytree=0.8, random_state=42.
Framework: XGBoost
Training Setup: Trained with XGBoost's default training parameters, apart from the hyperparameters listed above.
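
As a minimal sketch, the configuration listed above could be expressed as follows. The names XGB_PARAMS and build_classifier are hypothetical helpers introduced for illustration and are not taken from the OPEN-ARC repository; only the hyperparameter values come from this card.

```python
# Hyperparameters exactly as listed in the Architecture section of this card.
XGB_PARAMS = {
    "n_estimators": 100,
    "learning_rate": 0.1,
    "max_depth": 6,
    "subsample": 0.8,
    "colsample_bytree": 0.8,
    "random_state": 42,
}


def build_classifier(params=None):
    """Return an untrained XGBClassifier (hypothetical helper, not from the repo)."""
    # Imported lazily so the configuration can be inspected without XGBoost installed.
    from xgboost import XGBClassifier

    # Copy the settings so callers cannot mutate the shared defaults.
    settings = dict(XGB_PARAMS if params is None else params)
    return XGBClassifier(**settings)
```

Calling build_classifier().fit(X_train, y_train) would then train the model, where X_train and y_train stand for whatever training features and labels are available.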

Uses

Identifying crops experiencing significant stress.
Improving crop production by mitigating major stressors affecting plants.
Performing experimental studies on plant behavior and yield outcomes influenced by stress levels.

Limitations

May produce implausible or inappropriate predictions for inputs containing extreme outlier values.
Predicted stress levels may be inaccurate; use caution when relying on these outputs.

Training Data

Dataset: Plant-Health-Data dataset from Kaggle.
Source URL: https://www.kaggle.com/datasets/ziya07/plant-health-data
Content: Soil characteristics, moisture levels, and various agricultural metrics, combined with the anticipated stress level of the plant.
Size: 1200 entries of plant stress levels.
Preprocessing: Dropped non-predictive columns such as Timestamp and Plant_ID. The stress-level labels were manually mapped to three distinct integer values.
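
The preprocessing described above might look roughly like the sketch below. Only the Timestamp and Plant_ID column names come from this card. The target column name, the label strings, the integer codes, and the Soil_Moisture feature are assumptions made for illustration and should be checked against the actual dataset.

```python
import pandas as pd

# ASSUMPTION: the target column name and the label-to-integer mapping are
# illustrative; the card only states that three numeric values were used.
TARGET_COLUMN = "Plant_Health_Status"
health_mapping = {"Healthy": 0, "Moderate Stress": 1, "High Stress": 2}


def preprocess(df):
    """Split a raw frame into features and integer-encoded stress labels."""
    # Drop the identifier and time columns named in the card, plus the target.
    features = df.drop(columns=["Timestamp", "Plant_ID", TARGET_COLUMN])
    labels = df[TARGET_COLUMN].map(health_mapping)
    # Fail loudly on labels the mapping does not cover instead of passing NaN on.
    if labels.isna().any():
        raise ValueError("Found stress labels missing from health_mapping")
    return features, labels.astype(int)


# Tiny made-up frame, only to show the shape of the input and output.
example = pd.DataFrame(
    {
        "Timestamp": ["2024-10-03 10:54", "2024-10-03 16:54"],
        "Plant_ID": [1, 2],
        "Soil_Moisture": [27.5, 14.8],  # hypothetical feature column
        TARGET_COLUMN: ["High Stress", "Healthy"],
    }
)
X, y = preprocess(example)
```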

Training Procedure

Metrics: accuracy, precision, recall, F1
Train/Test Split: 80% train, 20% test.
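
The card does not say how the split was produced. Assuming scikit-learn's train_test_split was used, an 80/20 split of 1200 rows could be sketched like this. The stand-in data and the random_state value are assumptions, not settings confirmed by the source.

```python
from sklearn.model_selection import train_test_split

# Stand-in data: 1200 row indices with made-up three-class labels.
rows = list(range(1200))
labels = [i % 3 for i in rows]

# test_size=0.2 reproduces the 80/20 split; random_state is an assumed seed.
train_rows, test_rows, train_labels, test_labels = train_test_split(
    rows, labels, test_size=0.2, random_state=42
)

print(len(train_rows), len(test_rows))
```

With 1200 entries this leaves 960 rows for training and 240 for testing.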

Evaluation Results

Metric	Value
Testing Accuracy	99.1%
Testing Weighted Average Precision	99%
Testing Weighted Average Recall	99%
Testing Weighted Average F1	99%
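
To make "weighted average" concrete, the sketch below computes accuracy plus support-weighted precision, recall, and F1 in plain Python. It follows the same definition as scikit-learn's average="weighted" option; whether the reported figures were produced with scikit-learn is an assumption. The function name weighted_scores and the sample labels are hypothetical.

```python
from collections import Counter


def weighted_scores(y_true, y_pred):
    """Return (accuracy, precision, recall, f1) using support-weighted averaging."""
    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    precision = recall = f1 = 0.0
    for label, n_true in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        n_pred = sum(1 for p in y_pred if p == label)
        p_c = tp / n_pred if n_pred else 0.0
        r_c = tp / n_true
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        # Each class contributes in proportion to its share of the true labels.
        weight = n_true / total
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / total
    return accuracy, precision, recall, f1


# Made-up labels for three classes, with one misclassified sample.
acc, prec, rec, f1 = weighted_scores([0, 0, 1, 1, 2, 2], [0, 0, 1, 2, 2, 2])
```

Under this definition, weighted recall always equals accuracy, which is consistent with the closely matching values in the table.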

How to Use

import random

def test_random_samples(model, X_test, y_test, health_mapping, n_samples=5):
    """
    Selects random samples from the test set, makes predictions, and compares with actual values.

    Parameters:
    - model: Trained XGBoost classifier.
    - X_test: Feature set for testing (pandas DataFrame).
    - y_test: True labels for testing (pandas Series).
    - health_mapping: Dict mapping stress-level names to the numeric labels used in training.
    - n_samples: Number of random samples to test.

    Returns:
    None
    """
    # Reset the indices so positions 0..len-1 line up across features and labels
    X_test_df = X_test.reset_index(drop=True)
    y_test_df = y_test.reset_index(drop=True)

    # Reverse the health mapping once (numeric label -> description)
    reverse_health_mapping = {v: k for k, v in health_mapping.items()}

    # Pick random indices, never more than the test set contains
    n_samples = min(n_samples, len(X_test_df))
    random_indices = random.sample(range(len(X_test_df)), n_samples)

    print("Testing on Random Samples:")
    for idx in random_indices:
        sample = X_test_df.iloc[idx]
        true_label = int(y_test_df.iloc[idx])

        # Predict on a one-row DataFrame so column names and dtypes are preserved
        predicted_label = int(model.predict(X_test_df.iloc[[idx]])[0])

        # Map true and predicted labels back to their descriptions
        true_label_description = reverse_health_mapping[true_label]
        predicted_label_description = reverse_health_mapping[predicted_label]

        # Output results
        print(f"Sample Index: {idx}")
        print(f"Features: {sample.values}")
        print(f"True Label: {true_label}, Predicted Label: {predicted_label}")
        print(f"True Label (Description): {true_label_description}, Predicted Label (Description): {predicted_label_description}")
        print("-" * 40)

# Example usage
# xgb is the trained XGBClassifier; health_mapping is the label dictionary
# created during preprocessing (stress-level name -> numeric label).
test_random_samples(xgb, X_test, y_test, health_mapping)

Contact

For questions or issues, open a GitHub issue or reach out at https://infinitode.netlify.app/forms/contact.