Details
| Original language | English |
|---|---|
| Title of host publication | 2025 IEEE International Conference on Image Processing, ICIP 2025 - Proceedings |
| Publisher | IEEE Computer Society |
| Pages | 1996-2001 |
| Number of pages | 6 |
| ISBN (electronic) | 9798331523794 |
| ISBN (print) | 979-8-3315-2380-0 |
| Publication status | Published - 14 Sept 2025 |
| Event | 32nd IEEE International Conference on Image Processing, ICIP 2025 - Anchorage, United States. Duration: 14 Sept 2025 → 17 Sept 2025 |
Publication series
| Name | Proceedings - International Conference on Image Processing, ICIP |
|---|---|
| ISSN (Print) | 1522-4880 |
Abstract
In this work, we present a learned multi-task video codec that is optimized for human and machine vision. The codec consists of an encoder that maps images from the pixel domain to a latent representation and multiple decoders that map the latent to either an image for human consumption or multiple task-specific features for different machine vision tasks. This allows a single bitstream to be used for multiple tasks while also reducing the decoder complexity for machine vision tasks. Unlike most learned codecs, our method performs inter-coding at the latent level instead of the pixel domain. Experiments show that the proposed method achieves a compression performance for machine vision tasks comparable to other multi-task codecs designed for machine vision only, while also providing video reconstruction.
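The architecture described in the abstract (one shared encoder, inter-coding on the latent, and separate decoders for human viewing and for machine tasks) can be illustrated with a deliberately simplified sketch. Everything below is an assumption made for illustration: the function names, the pair-averaging "encoder", the uniform quantizer, and the two toy task features are hypothetical stand-ins, not the learned networks or entropy coding used in the paper.

```python
from typing import Dict, List

Latent = List[float]


def encode(frame: List[float]) -> Latent:
    """Toy stand-in for a learned encoder: average neighbouring pixel pairs."""
    return [(frame[i] + frame[i + 1]) / 2.0 for i in range(0, len(frame) - 1, 2)]


def quantize(values: Latent, step: float) -> List[int]:
    """Uniform scalar quantization to integer symbols."""
    return [round(v / step) for v in values]


def dequantize(symbols: List[int], step: float) -> Latent:
    """Inverse of quantize, up to the quantization error."""
    return [s * step for s in symbols]


def code_sequence(frames: List[List[float]], step: float = 0.5) -> List[List[int]]:
    """Inter-code in the latent domain: each frame's latent is coded as a
    quantized residual against the previously reconstructed latent."""
    bitstream: List[List[int]] = []
    prev: Latent = []
    for frame in frames:
        latent = encode(frame)
        reference = prev if prev else [0.0] * len(latent)
        residual = [l - r for l, r in zip(latent, reference)]
        symbols = quantize(residual, step)
        bitstream.append(symbols)
        # Track the decoder-side reconstruction so encoder and decoder stay in sync.
        prev = [r + d for r, d in zip(reference, dequantize(symbols, step))]
    return bitstream


def decode_latents(bitstream: List[List[int]], step: float = 0.5) -> List[Latent]:
    """Rebuild every latent from the residual symbols (shared by all decoders)."""
    latents: List[Latent] = []
    prev: Latent = []
    for symbols in bitstream:
        reference = prev if prev else [0.0] * len(symbols)
        prev = [r + d for r, d in zip(reference, dequantize(symbols, step))]
        latents.append(prev)
    return latents


def human_decoder(latent: Latent) -> List[float]:
    """Toy pixel decoder: upsample the latent by repeating each value."""
    return [v for v in latent for _ in range(2)]


def task_decoders(latent: Latent) -> Dict[str, float]:
    """Toy machine-vision heads that read the latent directly,
    without first reconstructing pixels."""
    return {
        "mean_activation": sum(latent) / len(latent),
        "peak_activation": max(latent),
    }


frames = [[1.0, 1.0, 3.0, 3.0], [1.0, 1.0, 4.0, 4.0]]
bitstream = code_sequence(frames)
latents = decode_latents(bitstream)
print(human_decoder(latents[1]))
print(task_decoders(latents[1]))
```

In this sketch the same `bitstream` feeds both `human_decoder` and `task_decoders`, which mirrors the single-bitstream idea; the task heads skip pixel reconstruction entirely, which is the sense in which decoder complexity for machine tasks can be lower.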
ASJC Scopus subject areas
- Computer Science (all)
  - Software
  - Signal Processing
  - Computer Vision and Pattern Recognition
Cite this
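Benjak, Martin; Khan, Saifullah; Chen, Yi Hsin; Peng, Wen Hsiao; Ostermann, Jörn. Learned Hybrid Video Coding for Human Perception and Multiple Machine Vision Tasks. In: 2025 IEEE International Conference on Image Processing, ICIP 2025 - Proceedings. IEEE Computer Society, 2025. pp. 1996-2001 (Proceedings - International Conference on Image Processing, ICIP).
Publication: Contribution to book/report/anthology/conference proceedings › Conference paper › Research › Peer-reviewed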
TY - GEN
T1 - Learned Hybrid Video Coding for Human Perception and Multiple Machine Vision Tasks
AU - Benjak, Martin
AU - Khan, Saifullah
AU - Chen, Yi Hsin
AU - Peng, Wen Hsiao
AU - Ostermann, Jörn
N1 - Publisher Copyright: ©2025 IEEE.
PY - 2025/9/14
Y1 - 2025/9/14
AB - In this work, we present a learned multi-task video codec that is optimized for human and machine vision. The codec consists of an encoder that maps images from the pixel domain to a latent representation and multiple decoders that map the latent to either an image for human consumption or multiple task-specific features for different machine vision tasks. This allows a single bitstream to be used for multiple tasks while also reducing the decoder complexity for machine vision tasks. Unlike most learned codecs, our method performs inter-coding at the latent level instead of the pixel domain. Experiments show that the proposed method achieves a compression performance for machine vision tasks comparable to other multi-task codecs designed for machine vision only, while also providing video reconstruction.
KW - feature compression
KW - video coding
KW - Video coding for machines
UR - http://www.scopus.com/inward/record.url?scp=105028627594&partnerID=8YFLogxK
U2 - 10.1109/ICIP55913.2025.11084300
DO - 10.1109/ICIP55913.2025.11084300
M3 - Conference contribution
AN - SCOPUS:105028627594
SN - 979-8-3315-2380-0
T3 - Proceedings - International Conference on Image Processing, ICIP
SP - 1996
EP - 2001
BT - 2025 IEEE International Conference on Image Processing, ICIP 2025 - Proceedings
PB - IEEE Computer Society
T2 - 32nd IEEE International Conference on Image Processing, ICIP 2025
Y2 - 14 September 2025 through 17 September 2025
ER -