arxiv:2504.08222

F^3Set: Towards Analyzing Fast, Frequent, and Fine-grained Events from Videos

Published on Apr 11

Abstract

Analyzing Fast, Frequent, and Fine-grained (F^3) events presents a significant challenge in video analytics and for multi-modal LLMs. Current methods struggle to identify events that satisfy all the F^3 criteria with high accuracy due to challenges such as motion blur and subtle visual discrepancies. To advance research in video understanding, we introduce F^3Set, a benchmark that consists of video datasets for precise F^3 event detection. Datasets in F^3Set are characterized by their extensive scale and comprehensive detail, usually encompassing over 1,000 event types with precise timestamps and supporting multi-level granularity. Currently, F^3Set contains several sports datasets, and the framework may be extended to other applications. We evaluate popular temporal action understanding methods on F^3Set, revealing substantial challenges for existing techniques. Additionally, we propose a new method, F^3ED, for F^3 event detection, which achieves superior performance. The dataset, model, and benchmark code are available at https://github.com/F3Set/F3Set.
