arxiv:2508.04665

Perch 2.0: The Bittern Lesson for Bioacoustics

Published on Aug 6

Abstract

AI-generated summary: Perch 2.0, a pre-trained model for bioacoustics, expands training from avian species to a multi-taxa dataset and is trained with self-distillation and a source-prediction criterion. It achieves state-of-the-art performance on the BirdSet and BEANS benchmarks and outperforms specialized marine models on marine transfer learning tasks.

Perch is a performant pre-trained model for bioacoustics. It was trained in a supervised fashion and provides both off-the-shelf classification scores for thousands of vocalizing species and strong embeddings for transfer learning. In this new release, Perch 2.0, we expand from training exclusively on avian species to a large multi-taxa dataset. The model is trained with self-distillation using a prototype-learning classifier as well as a new source-prediction training criterion. Perch 2.0 obtains state-of-the-art performance on the BirdSet and BEANS benchmarks. It also outperforms specialized marine models on marine transfer learning tasks, despite having almost no marine training data. We present hypotheses as to why fine-grained species classification is a particularly robust pre-training task for bioacoustics.
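
To make the idea of transfer learning on fixed embeddings concrete, here is a minimal sketch of one common approach: a nearest-centroid classifier fitted on embeddings from a frozen model. It is not the paper's training or evaluation procedure and it does not call the Perch API. The embeddings are synthetic stand-ins, and the names `class_centroids`, `predict`, and the `species_*` labels are hypothetical, chosen only for illustration.

```python
import random


def class_centroids(embeddings, labels):
    """Average the embeddings of each class into one centroid per label."""
    sums, counts = {}, {}
    for vec, label in zip(embeddings, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(vec, centroids):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(vec, centroids[label]))


# Assumption: synthetic vectors stand in for embeddings from a frozen model.
rng = random.Random(0)
dim, per_class = 16, 30
centers = {
    name: [rng.gauss(0, 3) for _ in range(dim)]
    for name in ("species_a", "species_b", "species_c")
}


def sample(name):
    """Draw one noisy embedding around the given class center."""
    return [c + rng.gauss(0, 1) for c in centers[name]]


train = [(sample(name), name) for name in centers for _ in range(per_class)]
held_out = [(sample(name), name) for name in centers for _ in range(per_class)]

# Fit on the training embeddings only, then score on held-out embeddings.
centroids = class_centroids([v for v, _ in train], [l for _, l in train])
accuracy = sum(predict(v, centroids) == l for v, l in held_out) / len(held_out)
print(f"held-out accuracy: {accuracy:.2f}")
```

In practice the synthetic vectors would be replaced by embeddings computed once from labeled audio clips. Because the embedding model stays frozen, only the small classifier on top needs fitting, which is why strong embeddings make transfer to new taxa cheap.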
