Silent Lip Reader — VSR weights

This repository hosts the visual speech recognition (lip-reading) model weights used by the Silent Lip Reader Space. They are re-hosted here so that the open-source Space stays self-contained and does not break if the upstream repos move.

Architecture: Auto-AVSR (ResNet-3D + Conformer encoder, Transformer decoder, joint CTC/attention). Input: 88×88 grayscale mouth crops at 25 fps. Output: text via a 5000-unit SentencePiece (unigram5000) vocabulary. Video-only (no audio path).
Files: pytorch_model.pt (state dict), unigram5000.model, unigram5000_units.txt.
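
A minimal sketch of fetching the three files listed above and opening them, assuming the repo id `aaahmet/silent-lip-reader-model` (inferred from this model's Hub location) and that `huggingface_hub`, `torch`, and `sentencepiece` are installed. The helper names are hypothetical. Building the network itself requires the Auto-AVSR model code, which is not shown here, so this stops at the raw state dict and the tokenizer.

```python
from pathlib import Path

# Assumption: repo id inferred from this model's Hub location.
REPO_ID = "aaahmet/silent-lip-reader-model"
# File names as listed in this model card.
FILES = ("pytorch_model.pt", "unigram5000.model", "unigram5000_units.txt")


def fetch_files(repo_id: str = REPO_ID) -> dict[str, Path]:
    """Download each listed file into the local Hub cache and return its path."""
    # Imported lazily so the module can be imported without network access.
    from huggingface_hub import hf_hub_download

    return {
        name: Path(hf_hub_download(repo_id=repo_id, filename=name))
        for name in FILES
    }


def load_state_dict_and_tokenizer(paths: dict[str, Path]):
    """Load the checkpoint on CPU and open the SentencePiece model."""
    import sentencepiece as spm
    import torch

    # weights_only=True assumes the file is a plain state dict of tensors.
    state_dict = torch.load(
        paths["pytorch_model.pt"], map_location="cpu", weights_only=True
    )
    tokenizer = spm.SentencePieceProcessor(
        model_file=str(paths["unigram5000.model"])
    )
    return state_dict, tokenizer
```

Calling `load_state_dict_and_tokenizer(fetch_files())` returns the parameter tensors and a tokenizer. If the vocabulary matches the description above, `tokenizer.get_piece_size()` should report 5000 units.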

Credits / provenance (please read)

This checkpoint was not trained by the re-hoster. Attribution:

Model architecture & training: Auto-AVSR (Pingchuan Ma et al., "Auto-AVSR: Audio-Visual Speech Recognition with Automatic Labels"). All model credit to the original authors.
Checkpoint source: mirrored from AD1TEYA/lip-reading-model on the Hub.
Re-host + the surrounding system, demo, visual-VAD pipeline, evaluation and research: Ahmet Dedeler (🤗 aaahmet).

Intended use

Research and demos of silent visual speech recognition. The weights were trained on LRS3-derived data; treat them as research-use only. They work best on clear, frontal, well-articulated English. Expect roughly 25–30% word error rate (WER) on clean speech and higher on casual speech: lip reading is inherently ambiguous because many phonemes look identical on the lips.
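
To make the WER (word error rate) figure concrete, here is a small sketch of the standard definition: word-level edit distance (substitutions, insertions, deletions) divided by the number of reference words. The function name and the lowercase, whitespace-split normalisation are assumptions for illustration; the evaluation behind the numbers above may normalise text differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

For example, `word_error_rate("a b c d", "a x c d")` is 0.25: one wrong word in four.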

Usage

These weights are used by the Silent Lip Reader Space: you record a (silent) video, and the Space crops the mouth region, chunks utterances by lip motion, and decodes text. See the Space for the full pipeline and research log.
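
The "chunks utterances by lip motion" step can be pictured with a simple threshold-based segmenter. This is an illustrative sketch only, not the Space's actual visual-VAD code. The function name, the per-frame `motion` score, and the `threshold`, `min_gap`, and `min_len` parameters are all hypothetical.

```python
def chunk_by_motion(motion, threshold=0.5, min_gap=10, min_len=5):
    """Return (start, end) frame ranges, end exclusive, of sustained lip motion.

    motion:    one score per frame (hypothetical, e.g. change in mouth opening)
    threshold: scores at or above this count as "moving"
    min_gap:   still frames needed to close an utterance (10 = 0.4 s at 25 fps)
    min_len:   utterances shorter than this many frames are dropped
    """
    segments = []
    start = None        # first frame of the utterance being built
    last_active = None  # most recent frame that counted as moving
    for i, score in enumerate(motion):
        if score >= threshold:
            if start is None:
                start = i
            last_active = i
        elif start is not None and i - last_active >= min_gap:
            # Lips have been still long enough: close the utterance.
            segments.append((start, last_active + 1))
            start = None
    if start is not None:
        segments.append((start, last_active + 1))
    return [(s, e) for s, e in segments if e - s >= min_len]
```

With these defaults, a 3-frame pause inside a stretch of motion does not split the utterance, while a 2-frame blip of motion on its own is discarded. Each returned range would then be cut from the mouth-crop sequence and decoded separately.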

Built / curated by Ahmet Dedeler (https://ahmetdedeler.com). A credit or link back is appreciated if you use this. License: MIT (follows the upstream Space).
