Model Card for keerthikoganti/architecture-design-stages-compact-cnn

Model Details

Model Description

ArchiTutor is a compact convolutional neural network (CNN) that classifies images of architecture projects into discrete design stages commonly seen in studio workflows: Brainstorm, Design Iteration, Optimization/Detailing, and Final Review/Presentation (class names configurable). The goal is to support design pedagogy and analytics by tagging studio artifacts over time.

Task: Image classification (multi-class)

Inputs: RGB images of architecture artifacts (sketches, diagrams, renders, boards, screenshots)

Outputs: One of the design-stage labels, with class probabilities

Intended audience: Architecture students, instructors, design researchers, education tech tools

  • Developed by: Keerthi Koganti (Carnegie Mellon University)
  • Model type: Compact Convolutional Neural Network (CNN)
  • Language(s) (NLP): English
  • License: MIT

Uses

Direct Use

Auto-tagging student submissions by stage for feedback dashboards

Curating datasets of process images for research on studio workflows

Searching/filtering large archives by stage

Out-of-Scope Use

Not a critique engine; it does not assess design quality

May struggle with ambiguous mixed-stage boards or atypical media (e.g., code screenshots)

Performance depends on domain similarity (studio imagery vs. unrelated graphics)

Bias, Risks, and Limitations

Data imbalance: The dataset may contain more examples of final presentation boards than early sketches or optimization models, biasing predictions toward later stages.

Style bias: If most training images come from specific software (e.g., Rhino/Grasshopper or Revit renderings), the model may underperform on hand drawings, mixed-media collages, or atypical workflows.
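One way to check for the data imbalance described above is to count labels per stage and derive inverse-frequency weights. The sketch below is only an illustration: the stage names are taken from the model description, while the function name and the example label list are made up and are not part of the released training code.

```python
from collections import Counter

# Stage names from the model description; the card notes they are configurable.
STAGES = [
    "Brainstorm",
    "Design Iteration",
    "Optimization/Detailing",
    "Final Review/Presentation",
]


def class_weights(labels, classes=STAGES):
    """Inverse-frequency weights; a perfectly balanced dataset gives 1.0 per class."""
    counts = Counter(labels)
    total = len(labels)
    n_classes = len(classes)
    weights = {}
    for name in classes:
        n = counts.get(name, 0)
        # A class with no examples gets weight 0.0 instead of dividing by zero.
        weights[name] = total / (n_classes * n) if n else 0.0
    return weights


# Made-up label list, skewed toward final presentation boards.
example_labels = (
    ["Final Review/Presentation"] * 6
    + ["Design Iteration"] * 3
    + ["Brainstorm"] * 2
    + ["Optimization/Detailing"] * 1
)
weights = class_weights(example_labels)
print(weights)
```

Weights like these could be passed, in class-index order, as the weight tensor of torch.nn.CrossEntropyLoss so that rare stages count more during training. Whether that helps on this dataset is untested here.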

Recommendations

Diversify training data: Expand datasets to include hand sketches, BIM screenshots, and diverse cultural/academic styles to reduce bias.

Apply fairness checks: Periodically measure accuracy per class and per visual style to catch overfitting to dominant visual tropes.

Document provenance: Keep metadata on dataset sources, creators, and usage consent for transparency.

Avoid high-stakes use: The model should not be used for academic assessment, admissions, or publication decisions.
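The per-class and per-style check recommended above can be sketched in plain Python. The function name, the shortened labels, and the style tags below are hypothetical; the card does not define a style taxonomy, so treat the tags as placeholders for whatever metadata is available.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return {group: accuracy}, where each sample is assigned to one group."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        if truth == pred:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


# Made-up evaluation results for four images.
y_true = ["Brainstorm", "Brainstorm", "Final Review", "Final Review"]
y_pred = ["Brainstorm", "Final Review", "Final Review", "Final Review"]
styles = ["hand sketch", "hand sketch", "render", "render"]

# Grouping by the true label gives per-class accuracy (recall per stage).
per_class = accuracy_by_group(y_true, y_pred, groups=y_true)
# Grouping by a style tag gives per-style accuracy.
per_style = accuracy_by_group(y_true, y_pred, groups=styles)
print(per_class)
print(per_style)
```

A large gap between groups, for example renders scoring far above hand sketches, is the kind of signal this check is meant to surface.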

How to Get Started with the Model

Use the code below to get started with the model.

import torch
from PIL import Image
from torchvision import transforms

from model import load_model    # your helper
from labels import IDX2LABEL    # list or dict mapping

device = "cuda" if torch.cuda.is_available() else "cpu"
model = load_model(checkpoint_path="checkpoints/best.pt").to(device).eval()

tfm = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

img = Image.open("example.jpg").convert("RGB")
with torch.no_grad():
    logits = model(tfm(img).unsqueeze(0).to(device))
    probs = torch.softmax(logits, dim=1).squeeze().cpu().tolist()

pred_idx = int(torch.argmax(logits, dim=1).item())
print(IDX2LABEL[pred_idx], probs[pred_idx])

Training Details

Training Data

Training Procedure

Training Hyperparameters

  • Training regime: Framework: PyTorch

Backbone: Compact CNN (e.g., MobileNetV3-Small or custom ~1–3M params)

Head: Global pooling → Dropout → Linear (num_classes)

Loss: Cross-entropy

Optimizer: AdamW (lr=3e-4, wd=1e-4)

Scheduler: Cosine decay with warmup (e.g., 5 epochs)

Augmentations: RandomResizedCrop(224), RandomHorizontalFlip, small ColorJitter

Batch size / Epochs: [e.g., 64 / 30] (early stopping on val loss)

Mixed precision: Recommended (AMP)

Hardware: [e.g., 1× A100 / 1× RTX 3060]

Reproducibility: Set seeds, log versions (torch, cuda), save train/val metrics
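The settings listed above can be wired together roughly as follows. This is a minimal sketch under stated assumptions, not the released training script: TinyStageCNN is a hypothetical stand-in backbone (far smaller than the ~1–3M parameters mentioned), the dropout rate of 0.2 is a guess, and the 30 epochs and 5 warmup epochs are taken from the card's "e.g." values.

```python
import math
import random

import torch
from torch import nn


def set_seed(seed: int = 0) -> None:
    """Seed Python and PyTorch RNGs for more repeatable runs."""
    random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # silently ignored when CUDA is unavailable


class TinyStageCNN(nn.Module):
    """Hypothetical compact backbone followed by the head described in the card."""

    def __init__(self, num_classes: int = 4, dropout: float = 0.2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.BatchNorm2d(32), nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.BatchNorm2d(64), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.BatchNorm2d(128), nn.ReLU(inplace=True),
        )
        # Head: global pooling -> dropout -> linear (num_classes)
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Dropout(dropout),
            nn.Linear(128, num_classes),
        )

    def forward(self, x):
        return self.head(self.features(x))


def warmup_cosine(epoch: int, warmup_epochs: int = 5, total_epochs: int = 30) -> float:
    """LR multiplier: linear warmup, then cosine decay to zero at total_epochs."""
    if epoch < warmup_epochs:
        return (epoch + 1) / warmup_epochs
    progress = (epoch - warmup_epochs) / max(1, total_epochs - warmup_epochs)
    return 0.5 * (1.0 + math.cos(math.pi * progress))


set_seed(0)
model = TinyStageCNN(num_classes=4)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4, weight_decay=1e-4)
# Call scheduler.step() once per epoch, after that epoch's optimizer steps.
scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=warmup_cosine)

# Smoke test on a random batch shaped like the 224x224 RGB crops.
logits = model(torch.randn(2, 3, 224, 224))
loss = criterion(logits, torch.tensor([0, 3]))
print(logits.shape, float(loss))
```

Swapping the stand-in backbone for torchvision's MobileNetV3-Small, the other option the card mentions, would leave the loss, optimizer, and scheduler lines unchanged.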

Citation

Generative AI tools used to make this: ChatGPT and Google Colab

Model Card Contact

Maintainer: Keerthi Koganti
