Model Card for Infinitode/TAPM-OPEN-ARC

Repository: https://github.com/Infinitode/OPEN-ARC/

Model Description

OPEN-ARC-TAP is a straightforward XGBClassifier model developed as part of Infinitode's OPEN-ARC initiative. It estimates the likelihood of a traffic accident from external factors such as weather conditions, road type, and time of day.

Architecture:

XGBClassifier: random_state=42, use_label_encoder=False, eval_metric='logloss', colsample_bytree=0.8, learning_rate=0.01, max_depth=5, n_estimators=100, scale_pos_weight=1, subsample=0.8.
Framework: XGBoost
Training Setup: Trained with no additional training parameters beyond the hyperparameters listed above.
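
The configuration above could be reproduced with a sketch like the one below. The names `XGB_PARAMS` and `build_model` are illustrative and do not come from the repository. `use_label_encoder=False` is left out because recent XGBoost releases treat that argument as deprecated; add it back if your installed version expects it.

```python
# Hyperparameters copied from the Architecture section of this card.
XGB_PARAMS = {
    "random_state": 42,
    "eval_metric": "logloss",
    "colsample_bytree": 0.8,
    "learning_rate": 0.01,
    "max_depth": 5,
    "n_estimators": 100,
    "scale_pos_weight": 1,
    "subsample": 0.8,
}


def build_model(params=None):
    """Return an untrained XGBClassifier (hypothetical helper)."""
    # Imported lazily so the parameter dictionary can be inspected
    # even when XGBoost is not installed.
    from xgboost import XGBClassifier

    return XGBClassifier(**(params or XGB_PARAMS))
```

With XGBoost installed, `build_model().fit(X_train, y_train)` would then train the classifier on a prepared training split.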

Uses

Identifying potential accident-prone or high-risk areas.
Enhancing preventive measures for traffic accidents and improving road safety.
Researching traffic safety.

Limitations

May produce implausible or inappropriate results when input features contain extreme outlier values.
May predict the likelihood of an accident inaccurately; interpret its outputs with caution.

Training Data

Dataset: Traffic Accident Prediction 💥🚗 dataset from Kaggle.
Source URL: https://www.kaggle.com/datasets/denkuznetz/traffic-accident-prediction
Content: Weather conditions, road types, time of day, and other factors, along with the occurrence or absence of an accident.
Size: 798 entries of traffic data.
Preprocessing: Mapped all string values to numeric values, dropped missing values, and applied SMOTE to correct the class imbalance.
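
The preprocessing steps could look roughly like the sketch below. It is an assumption-laden illustration, not the repository's code: the column names (`Weather`, `Road_Type`, `Accident`) and both helper functions are hypothetical, rows with missing values are dropped before encoding, and the SMOTE step assumes the `imbalanced-learn` package, which this card does not name.

```python
def encode_and_clean(rows, categorical_columns):
    """Drop rows with missing values, then map strings to integer codes.

    `rows` is a list of dicts. Returns (clean_rows, mappings).
    """
    # 1. Keep only rows where no value is missing (None or empty string).
    complete = [
        dict(row) for row in rows
        if all(value is not None and value != "" for value in row.values())
    ]

    # 2. Build one string -> integer mapping per categorical column,
    #    assigning codes in sorted order so the result is deterministic.
    mappings = {}
    for column in categorical_columns:
        categories = sorted({row[column] for row in complete})
        mappings[column] = {name: code for code, name in enumerate(categories)}

    # 3. Replace each string with its integer code.
    for row in complete:
        for column in categorical_columns:
            row[column] = mappings[column][row[column]]
    return complete, mappings


def balance_with_smote(X, y, seed=42):
    """Oversample the minority class with SMOTE (needs imbalanced-learn)."""
    from imblearn.over_sampling import SMOTE

    return SMOTE(random_state=seed).fit_resample(X, y)


# Tiny made-up example; the second row is dropped for its missing value.
sample_rows = [
    {"Weather": "Rainy", "Road_Type": "Highway", "Accident": 1},
    {"Weather": "Clear", "Road_Type": None, "Accident": 0},
    {"Weather": "Clear", "Road_Type": "City Road", "Accident": 0},
]
clean_rows, mappings = encode_and_clean(sample_rows, ["Weather", "Road_Type"])
print(clean_rows)
print(mappings)
```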

Training Procedure

Metrics: accuracy, precision, recall, F1, ROC-AUC
Train/Test Split: 80% training, 20% testing.
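
An 80/20 split can be sketched in plain Python as shown below. The card does not say which splitting function, random seed, or stratification setting was used, so the helper name, the seed of 42, and the row count of 1000 are all assumptions chosen for illustration; scikit-learn's `train_test_split` with `test_size=0.2` is a common alternative.

```python
import random


def train_test_split_indices(n_rows, test_fraction=0.2, seed=42):
    """Shuffle row indices and split them into (train, test) lists."""
    indices = list(range(n_rows))
    # A dedicated Random instance keeps the shuffle reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(indices)
    n_test = round(n_rows * test_fraction)
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = train_test_split_indices(1000)
print(len(train_idx), len(test_idx))
```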

Evaluation Results

Metric	Value
Testing Accuracy	85.2%
Testing Weighted Average Precision	87%
Testing Weighted Average Recall	85%
Testing Weighted Average F1	85%
Testing ROC-AUC	82.5%
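
For readers unfamiliar with the "weighted average" figures in the table, the sketch below shows one way such scores are computed: each class's precision, recall, and F1 is weighted by that class's share of the true labels. The function name and the toy labels are illustrative only and are not the evaluation code or data behind the reported numbers.

```python
from collections import Counter


def weighted_scores(y_true, y_pred):
    """Return support-weighted (precision, recall, F1)."""
    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    precision = recall = f1 = 0.0
    for label, count in support.items():
        true_pos = sum(
            1 for t, p in zip(y_true, y_pred) if t == label and p == label
        )
        predicted = sum(1 for p in y_pred if p == label)
        p_label = true_pos / predicted if predicted else 0.0
        r_label = true_pos / count
        denom = p_label + r_label
        f_label = 2 * p_label * r_label / denom if denom else 0.0
        # Weight each class by its share of the true labels.
        weight = count / total
        precision += weight * p_label
        recall += weight * r_label
        f1 += weight * f_label
    return precision, recall, f1


# Toy labels: 0 = no accident, 1 = accident.
precision, recall, f1 = weighted_scores([0, 0, 0, 1, 1], [0, 0, 1, 1, 1])
print(precision, recall, f1)
```

Support-weighted recall always equals overall accuracy, which is why the two values in the table are nearly identical after rounding.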

How to Use

import random

def test_random_samples(model, X_test, y_test, n_samples=5):
    """
    Selects random samples from the test set, makes predictions, and compares with actual values.
    
    Parameters:
    - model: Trained XGBoost classifier.
    - X_test: Feature set for testing (pandas DataFrame).
    - y_test: True labels for testing (pandas Series).
    - n_samples: Number of random samples to test.
    
    Returns:
    None
    """
    # Reset the indices so rows can be addressed by position (expects pandas objects)
    X_test_df = X_test.reset_index(drop=True)
    y_test_df = y_test.reset_index(drop=True)

    # Pick random indices (never more than the test set holds)
    random_indices = random.sample(range(len(X_test)), min(n_samples, len(X_test)))
    
    print("Testing on Random Samples:")
    for idx in random_indices:
        sample = X_test_df.iloc[idx]
        true_label = y_test_df.iloc[idx]
        
        # Predict on a one-row DataFrame so feature names and dtypes are preserved
        prediction = model.predict(X_test_df.iloc[[idx]])
        
        # Output results
        print(f"Sample Index: {idx}")
        print(f"Features: {sample.values}")
        print(f"True Label: {true_label}, Predicted Label: {prediction[0]}")
        print("-" * 40)

# Example usage: `xgb` is the trained XGBClassifier; X_test and y_test are the pandas test split
test_random_samples(xgb, X_test, y_test)
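
Because the model is meant to assess accident probability, it may also be useful to read class probabilities instead of hard labels. The sketch below assumes that label 1 means "accident" and that the model exposes the scikit-learn style `predict_proba` method, as XGBClassifier does. Both helper names and the 0.5 threshold are hypothetical.

```python
def accident_probabilities(model, X):
    """Return the predicted probability of an accident for each row of X."""
    probabilities = model.predict_proba(X)
    # Column 1 holds the probability of the positive class (assumed: accident).
    return [float(row[1]) for row in probabilities]


def flag_high_risk(probabilities, threshold=0.5):
    """Mark each probability at or above the threshold as high risk."""
    return [p >= threshold for p in probabilities]
```

For example, `flag_high_risk(accident_probabilities(xgb, X_test))` would give one boolean per test row; the threshold should be tuned to the cost of false alarms versus missed accidents.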

Contact

For questions or issues, open a GitHub issue or reach out at https://infinitode.netlify.app/forms/contact.