---
title: K-Nearest Neighbors Visualizer
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
license: apache-2.0
short_description: Interactive KNN algorithm visualization
---

# 🎯 K-Nearest Neighbors (KNN) Algorithm Visualizer

An interactive educational tool for visualizing and understanding the K-Nearest Neighbors algorithm.

## Features

- **Interactive Visualization**: Place test points anywhere in the feature space and see real-time KNN classification
- **Multiple Distance Metrics**: Support for Euclidean, Manhattan, and Chebyshev distance calculations
- **Comprehensive Display**:
  - Main scatter plot showing the training data and the k nearest neighbors
  - Distance calculations table with sorted results
  - Detailed statistics and algorithm breakdown
  - Step-by-step distance formula calculations
- **Educational Design**: Learn how KNN works through immediate visual feedback

## How to Use

1. **Set Test Point Coordinates**: Adjust the X and Y sliders to position your test point
2. **Choose K Value**: Select the number of neighbors (1–20) to consider
3. **Select Distance Metric**:
   - **Euclidean**: Standard straight-line distance
   - **Manhattan**: Sum of absolute differences (city-block distance)
   - **Chebyshev**: Maximum absolute difference along any single coordinate
4. **Visualize**: The app automatically updates to show classification results
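The three metrics above can be sketched in a few lines of NumPy. This is a minimal illustration, not the app's actual code; the function name and vectorized layout are assumptions:

```python
import numpy as np

def distances(test_point, train_points, metric="euclidean"):
    """Distances from one test point to every training point (hypothetical helper)."""
    diff = np.abs(train_points - test_point)
    if metric == "euclidean":
        return np.sqrt((diff ** 2).sum(axis=1))  # straight-line distance
    if metric == "manhattan":
        return diff.sum(axis=1)                  # city-block distance
    if metric == "chebyshev":
        return diff.max(axis=1)                  # largest single-coordinate gap
    raise ValueError(f"unknown metric: {metric}")

test = np.array([0.0, 0.0])
train = np.array([[3.0, 4.0]])
print(distances(test, train, "euclidean"))  # [5.]
print(distances(test, train, "manhattan"))  # [7.]
print(distances(test, train, "chebyshev"))  # [4.]
```

Comparing the three values for the same point pair shows why the choice of metric can change which neighbors count as "nearest".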

## Installation

### Local Setup

```bash
# Clone the repository
git clone <your-repo-url>
cd knn-vis

# Install dependencies
pip install -r requirements.txt

# Run the app
python app.py
```

## Requirements

- Python 3.8+
- gradio
- numpy
- matplotlib
- pillow
- scipy

## Educational Use

This tool is designed for:

- Machine learning students learning the KNN algorithm
- Instructors teaching classification algorithms
- Anyone wanting to understand how distance metrics affect predictions
- Exploring the impact of the K value on classification decisions

## Dataset

The visualizer uses a synthetic 3-class dataset with:

- **Class 0 (Blue)**: Clustered in the bottom-left region
- **Class 1 (Red)**: Clustered in the top-right region
- **Class 2 (Green)**: Clustered in the top-left region

Each class contains 30 samples drawn from a Gaussian distribution.
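A dataset like this can be generated in a few lines. The cluster centers, spread, and seed below are illustrative assumptions, not the app's exact values:

```python
import numpy as np

rng = np.random.default_rng(42)  # fixed seed for reproducibility (assumed)
centers = {
    0: (2.5, 2.5),  # Class 0 (blue): bottom-left region
    1: (7.5, 7.5),  # Class 1 (red): top-right region
    2: (2.5, 7.5),  # Class 2 (green): top-left region
}
# 30 Gaussian samples per class, stacked into one (90, 2) feature matrix
X = np.vstack([rng.normal(loc=c, scale=0.8, size=(30, 2)) for c in centers.values()])
y = np.repeat(list(centers.keys()), 30)  # labels: 30 zeros, 30 ones, 30 twos
```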

## Algorithm Steps Visualized

1. Calculate the distance from the test point to all training points
2. Sort all points by distance (ascending order)
3. Select the K nearest points
4. Count the class labels among the K neighbors
5. Predict the class with the majority vote
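The five steps above map directly onto a short function. This is a minimal sketch using Euclidean distance, not the app's implementation, and the function name is an assumption:

```python
import numpy as np
from collections import Counter

def knn_predict(test_point, X, y, k=5):
    # Step 1: distance from the test point to every training point (Euclidean)
    d = np.sqrt(((X - test_point) ** 2).sum(axis=1))
    # Steps 2-3: sort ascending and keep the indices of the k nearest points
    nearest = np.argsort(d)[:k]
    # Steps 4-5: count labels among the k neighbors and take the majority vote
    votes = Counter(y[nearest])
    return votes.most_common(1)[0][0]

X = np.array([[1, 1], [1, 2], [8, 8], [9, 8], [8, 9]])
y = np.array([0, 0, 1, 1, 1])
print(knn_predict(np.array([2, 2]), X, y, k=3))  # 0
```

With k=3, the two class-0 points outvote the single class-1 neighbor; raising k to 5 would flip the prediction, which is exactly the effect the visualizer lets you explore.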

## Try These Examples

- **Balanced Classification**: Test point (5, 5), K=5
- **Near Class 0**: Test point (2, 2), K=3
- **Near Class 1**: Test point (8, 8), K=7
- **Different Metrics**: Compare predictions across the different distance metrics

## License

Apache 2.0

## Author

Created for AI/Machine Learning education (MSAI 515)


Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference