MIR (Machine Intelligence Resource)

A naming schema for AIGC/ML work.

The MIR classification format seeks to standardize and complete a hyperlinked network of model information, improving accessibility and reproducibility across the AI community.

Example:

mir : model . transformer . clip-l : stable-diffusion-xl

 mir : model .    lora      .    hyper    :   flux-1
  ↑      ↑         ↑               ↑            ↑
 [URI]:[Domain].[Architecture].[Series]:[Compatibility]
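
To make the layout concrete, here is a minimal parsing sketch in Python. It is illustrative only; the class and function names are assumptions for demonstration, not the API of the darkshapes/MIR codebase.

```python
# Illustrative only: a minimal parser for the MIR layout shown above.
# Class and function names are assumptions, not the darkshapes/MIR API.
from dataclasses import dataclass


@dataclass
class MirId:
    domain: str
    architecture: str
    series: str
    compatibility: str


def parse_mir(tag: str) -> MirId:
    """Split 'mir:domain.architecture.series:compatibility' into its parts."""
    scheme, body, compatibility = (part.strip() for part in tag.split(":"))
    if scheme != "mir":
        raise ValueError(f"not a MIR identifier: {tag!r}")
    domain, architecture, series = (part.strip() for part in body.split("."))
    return MirId(domain, architecture, series, compatibility)


print(parse_mir("mir : model . transformer . clip-l : stable-diffusion-xl"))
# MirId(domain='model', architecture='transformer', series='clip-l', compatibility='stable-diffusion-xl')
```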

Code for this project can be found at darkshapes/MIR on GitHub

Definitions:

Like other URI schemas, the order of the identifiers roughly indicates their specificity, from left (broad) to right (narrow).

Domains

  • dev: Varying local neural network layers, in-training, pre-release, items under evaluation, likely in unexpected formats
  • model: Static local neural network layers. Publicly released machine learning models with an identifier in the database
  • operations: Varying global neural network attributes, algorithms, optimizations and procedures on models
  • info: Static global neural network attributes, metadata with an identifier in the database

Architecture

Broad and general terms for system architectures.

  • dit: Diffusion transformer, typically vision synthesis
  • unet: U-Net diffusion structure
  • art: Autoregressive transformer, typically LLMs
  • lora: Low-Rank Adapter (may work with dit or transformer)
  • vae: Variational Autoencoder
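
As a rough illustration of how the domain and architecture vocabularies above could back a quick well-formedness check; the sets and helper below are assumptions for demonstration, not the project's actual tables.

```python
# Illustrative check against the terms listed above; these sets and the
# helper are assumptions for demonstration, not the MIR registry itself.
KNOWN_DOMAINS = {"dev", "model", "operations", "info"}
KNOWN_ARCHITECTURES = {"dit", "unet", "art", "lora", "vae"}


def is_well_formed(domain: str, architecture: str) -> bool:
    """True when both segments use terms from the controlled vocabularies."""
    return domain in KNOWN_DOMAINS and architecture in KNOWN_ARCHITECTURES


assert is_well_formed("model", "lora")      # e.g. mir:model.lora.hyper:flux-1
assert not is_well_formed("model", "sgm")   # 'sgm' is not an architecture term here
```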

Series

Foundational network and technique names; in the examples above, clip-l and hyper occupy this position.

Compatibility

Implementation details based on version-breaking changes, configuration inconsistencies, or other conflicting indicators that have practical implications; in the examples above, stable-diffusion-xl and flux-1 occupy this position.
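
For illustration, a naive pairing check could compare this segment directly. The tag strings and helper below are hypothetical examples, not MIR-registered entries.

```python
# Illustrative only: a naive pairing check keyed on the compatibility segment.
# The tag strings and helpers are hypothetical examples, not MIR-registered entries.
def compatibility_of(tag: str) -> str:
    """Return the final ':'-delimited segment of a MIR tag."""
    return tag.rsplit(":", 1)[-1].strip()


def compatible(adapter_tag: str, base_tag: str) -> bool:
    """Naive rule: matching compatibility segments mean the pairing is expected to work."""
    return compatibility_of(adapter_tag) == compatibility_of(base_tag)


assert compatible("mir:model.lora.hyper:flux-1",
                  "mir:model.dit.flux-1:flux-1")  # second tag is hypothetical
```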

Goals

  • Standard identification scheme for ALL fields of ML-related development
  • Simplification of code for model-related logistics
  • Rapid retrieval of resources and metadata
  • Efficient and reliable compatibility checks
  • Organized hyperparameter management

Why not use `diffusion`/`sgm`/`ldm`/`text`, hf.co folder structure, brand or trade names, preprint papers, development houses, or algorithms?

  • The format here isn't finalized, but overlapping resource definitions and categories that are too complicated to narrow down have been pruned
  • Likewise, definitions that are too specific have also been trimmed
  • HF.CO folder structures become inconsistent across folders and files, and metadata enforcement for many important developments is often neglected
  • Development credit is often shared, and the heredity tree of preprint papers is super complicated
  • Algorithms (especially their application) are less common knowledge, vague, and I'm too smooth-brain
  • Overall, an attempt at impartiality and neutrality with regard to brand and territorial origins

Why `unet`, `dit`, `lora` over alternatives?

  • UNET/DiT/Transformer are shared widely enough to be genre-ish without being too narrowly specific
  • The technical process at this level is very similar across models
  • Functional and efficient for random lookups
  • Short to type

Roadmap

  • Decide on @ or : delimiters (e.g. @8cfg for an otherwise indistinguishable 8-step LoRA that requires CFG); see the sketch after this list
  • Is this qualifier a crucial spec element, or an optional, MIR-app-determined feature?
  • Proof of concept generative model registry
  • Ensure compatibility/integration/cross-pollination with OECD AI Classifications
  • Ensure compatibility/integration/cross-pollination with NIST AI 200-1 (Trustworthy and Responsible AI)
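
To make the delimiter question above concrete, here is a rough sketch of splitting an optional @-qualifier off the compatibility segment; both the syntax and the helper are assumptions pending that roadmap decision.

```python
# Sketch of the still-undecided '@' qualifier from the roadmap above;
# the syntax and this helper are assumptions, not settled MIR spec.
def split_qualifier(compatibility: str) -> tuple[str, str | None]:
    """Split 'flux-1@8cfg' into ('flux-1', '8cfg'); the qualifier is None when absent."""
    base, _, qualifier = compatibility.partition("@")
    return base, (qualifier or None)


print(split_qualifier("flux-1@8cfg"))  # ('flux-1', '8cfg')
print(split_qualifier("flux-1"))       # ('flux-1', None)
```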

Massive thank you to @silveroxides for phenomenal work collecting pristine state dicts and related information.
