Objects do not disappear

Video object detection by single-frame object location anticipation

Conference Paper (2023)

Author(s)

Xin Liu (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Jan C. van Gemert (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Fatemeh Karimi Nejadasl (Universiteit van Amsterdam)

Olaf Booij (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Silvia L. Pintea (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Research Group

Pattern Recognition and Bioinformatics

DOI related publication

https://doi.org/10.1109/ICCV51070.2023.00640 (final published version)

To reference this document use

https://resolver.tudelft.nl/uuid:8b66a4c1-eabe-49a2-8af1-24ddd275ebbb

Publication Year

2023

Language

English

Pages (from-to)

6927-6938

ISBN (print)

979-8-3503-0719-1

ISBN (electronic)

979-8-3503-0718-4

Event

2023 IEEE/CVF International Conference on Computer Vision (ICCV) (2023-10-01 - 2023-10-06), Paris, France

Collections

Institutional Repository

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Objects in videos are typically characterized by continuous smooth motion. We exploit continuous smooth motion in three ways. 1) Improved accuracy by using object motion as an additional source of supervision, which we obtain by anticipating object locations from a static keyframe. 2) Improved efficiency by only doing the expensive feature computations on a small subset of all frames. Because neighboring video frames are often redundant, we only compute features for a single static keyframe and predict object locations in subsequent frames. 3) Reduced annotation cost, where we only annotate the keyframe and use smooth pseudo-motion between keyframes. We demonstrate computational efficiency, annotation efficiency, and improved mean average precision compared to the state-of-the-art on four datasets: ImageNet VID, EPIC-KITCHENS-55, YouTube-BoundingBoxes and Waymo Open Dataset. Our source code is available at https://github.com/L-KID/Video-object-detection-by-location-anticipation.
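
As a rough illustration of the third point, smooth pseudo-motion between two annotated keyframes could be approximated by interpolating the box coordinates. The sketch below assumes plain linear interpolation and an (x_min, y_min, x_max, y_max) box format; the abstract specifies neither, and the function name `interpolate_boxes` is hypothetical and not taken from the authors' repository.

```python
from typing import List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def interpolate_boxes(start: Box, end: Box, num_frames: int) -> List[Box]:
    """Linearly interpolate a box from one keyframe to the next.

    Returns one box per frame, including both keyframes. Linear motion is
    an assumption made for illustration only.
    """
    if num_frames < 2:
        raise ValueError("num_frames must be at least 2")
    boxes: List[Box] = []
    for t in range(num_frames):
        alpha = t / (num_frames - 1)  # 0.0 at the first keyframe, 1.0 at the second
        boxes.append(tuple((1 - alpha) * s + alpha * e for s, e in zip(start, end)))
    return boxes


# Two annotated keyframes, with three unlabeled frames in between.
pseudo_labels = interpolate_boxes((0, 0, 10, 10), (4, 8, 14, 18), num_frames=5)
```

Here only the first and last boxes come from annotation; the three intermediate boxes are pseudo-labels that could supervise a detector on the frames in between.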

Files

Objects_do_not_disappear_Video... (pdf)

(pdf | 5.06 Mb)

- Embargo expired on 15-07-2024

License info not available