SparseAlign: A Fully Sparse Framework for Cooperative Object Detection

Publication: Contribution to book/report/anthology/conference proceedings › Conference paper › Research › Peer-reviewed

Authorship

External organisations

  • Technische Universität München (TUM)
  • Munich Center for Machine Learning (MCML)

Details

Original language: English
Title of host publication: 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Pages: 22296-22305
Number of pages: 10
ISBN (electronic): 979-8-3315-4365-5
Publication status: Published - 10 June 2025

Publication series

Name: CVPR
ISSN (electronic): 2575-7075

Abstract

Cooperative perception can enlarge the field of view and reduce the occlusion of an ego vehicle, hence improving the perception performance and safety of autonomous driving. Despite the success of previous works on cooperative object detection, they mostly operate on dense Bird's Eye View (BEV) feature maps, which are computationally demanding and can hardly be extended to long-range detection problems. More efficient fully sparse frameworks are rarely explored. In this work, we design a fully sparse framework, SparseAlign, with three key features: an enhanced sparse 3D backbone, a query-based temporal context learning module, and a robust detection head specially tailored for sparse features. Extensive experimental results on both the OPV2V and DairV2X datasets show that our framework, despite its sparsity, outperforms the state of the art with lower communication bandwidth requirements. In addition, experiments on the OPV2Vt and DairV2Xt datasets for time-aligned cooperative object detection also show a significant performance gain compared to the baseline works.
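The abstract's bandwidth argument can be sketched numerically: a dense BEV map stores (and transmits) a feature vector for every grid cell, whereas a fully sparse representation keeps only occupied cells as (coordinate, feature) pairs. The grid size, channel count, and occupancy rate below are illustrative assumptions, not figures from the paper.

```python
# Illustrative comparison of dense BEV vs. fully sparse feature storage.
# All numbers are hypothetical, chosen only to show the scaling behaviour.
grid = 500          # e.g. a 200 m x 200 m area at 0.4 m cell resolution
channels = 128      # feature channels per cell
occupancy = 0.03    # LiDAR scans typically occupy only a few percent of BEV cells

# Dense BEV: every cell carries a feature vector, occupied or not.
dense_floats = grid * grid * channels

# Fully sparse: only occupied cells are kept, each with its (x, y) coordinate.
n_occupied = int(grid * grid * occupancy)
sparse_floats = n_occupied * (channels + 2)

print(f"dense:  {dense_floats:,} floats")
print(f"sparse: {sparse_floats:,} floats ({sparse_floats / dense_floats:.1%} of dense)")
```

Because the dense cost grows quadratically with detection range while the sparse cost grows only with the number of occupied cells, the gap widens further in the long-range setting the abstract mentions.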

ASJC Scopus subject areas

Cite

SparseAlign: A Fully Sparse Framework for Cooperative Object Detection. / Yuan, Yunshuang; Xia, Yan; Cremers, Daniel et al.
2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2025. pp. 22296-22305 (CVPR).


Yuan, Y, Xia, Y, Cremers, D & Sester, M 2025, SparseAlign: A Fully Sparse Framework for Cooperative Object Detection. in 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). CVPR, pp. 22296-22305. https://doi.org/10.1109/CVPR52734.2025.02077, https://doi.org/10.48550/arXiv.2503.12982
Yuan, Y., Xia, Y., Cremers, D., & Sester, M. (2025). SparseAlign: A Fully Sparse Framework for Cooperative Object Detection. In 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 22296-22305). (CVPR). https://doi.org/10.1109/CVPR52734.2025.02077, https://doi.org/10.48550/arXiv.2503.12982
Yuan Y, Xia Y, Cremers D, Sester M. SparseAlign: A Fully Sparse Framework for Cooperative Object Detection. In 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2025. pp. 22296-22305. (CVPR). doi: 10.1109/CVPR52734.2025.02077, 10.48550/arXiv.2503.12982
Yuan, Yunshuang ; Xia, Yan ; Cremers, Daniel et al. / SparseAlign : A Fully Sparse Framework for Cooperative Object Detection. 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2025. pp. 22296-22305 (CVPR).
@inproceedings{c10f90057d224336a895bd0f17273500,
title = "SparseAlign: A Fully Sparse Framework for Cooperative Object Detection",
abstract = "Cooperative perception can enlarge the field of view and reduce the occlusion of an ego vehicle, hence improving the perception performance and safety of autonomous driving. Despite the success of previous works on cooperative object detection, they mostly operate on dense Bird's Eye View (BEV) feature maps, which are computationally demanding and can hardly be extended to long-range detection problems. More efficient fully sparse frameworks are rarely explored. In this work, we design a fully sparse framework, SparseAlign, with three key features: an enhanced sparse 3D backbone, a query-based temporal context learning module, and a robust detection head specially tailored for sparse features. Extensive experimental results on both the OPV2V and DairV2X datasets show that our framework, despite its sparsity, outperforms the state of the art with lower communication bandwidth requirements. In addition, experiments on the OPV2Vt and DairV2Xt datasets for time-aligned cooperative object detection also show a significant performance gain compared to the baseline works.",
keywords = "cs.CV, data fusion, cooperative perception, point cloud, autonomous driving",
author = "Yunshuang Yuan and Yan Xia and Daniel Cremers and Monika Sester",
note = "Publisher Copyright: {\textcopyright} 2025 IEEE.",
year = "2025",
month = jun,
day = "10",
doi = "10.1109/CVPR52734.2025.02077",
language = "English",
isbn = "979-8-3315-4364-8",
series = "CVPR",
pages = "22296--22305",
booktitle = "2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)",

}


TY - GEN

T1 - SparseAlign

T2 - A Fully Sparse Framework for Cooperative Object Detection

AU - Yuan, Yunshuang

AU - Xia, Yan

AU - Cremers, Daniel

AU - Sester, Monika

N1 - Publisher Copyright: © 2025 IEEE.

PY - 2025/6/10

Y1 - 2025/6/10

N2 - Cooperative perception can enlarge the field of view and reduce the occlusion of an ego vehicle, hence improving the perception performance and safety of autonomous driving. Despite the success of previous works on cooperative object detection, they mostly operate on dense Bird's Eye View (BEV) feature maps, which are computationally demanding and can hardly be extended to long-range detection problems. More efficient fully sparse frameworks are rarely explored. In this work, we design a fully sparse framework, SparseAlign, with three key features: an enhanced sparse 3D backbone, a query-based temporal context learning module, and a robust detection head specially tailored for sparse features. Extensive experimental results on both the OPV2V and DairV2X datasets show that our framework, despite its sparsity, outperforms the state of the art with lower communication bandwidth requirements. In addition, experiments on the OPV2Vt and DairV2Xt datasets for time-aligned cooperative object detection also show a significant performance gain compared to the baseline works.

AB - Cooperative perception can enlarge the field of view and reduce the occlusion of an ego vehicle, hence improving the perception performance and safety of autonomous driving. Despite the success of previous works on cooperative object detection, they mostly operate on dense Bird's Eye View (BEV) feature maps, which are computationally demanding and can hardly be extended to long-range detection problems. More efficient fully sparse frameworks are rarely explored. In this work, we design a fully sparse framework, SparseAlign, with three key features: an enhanced sparse 3D backbone, a query-based temporal context learning module, and a robust detection head specially tailored for sparse features. Extensive experimental results on both the OPV2V and DairV2X datasets show that our framework, despite its sparsity, outperforms the state of the art with lower communication bandwidth requirements. In addition, experiments on the OPV2Vt and DairV2Xt datasets for time-aligned cooperative object detection also show a significant performance gain compared to the baseline works.

KW - cs.CV

KW - data fusion

KW - cooperative perception

KW - point cloud

KW - autonomous driving

UR - http://www.scopus.com/inward/record.url?scp=105017064770&partnerID=8YFLogxK

U2 - 10.1109/CVPR52734.2025.02077

DO - 10.1109/CVPR52734.2025.02077

M3 - Conference contribution

SN - 979-8-3315-4364-8

T3 - CVPR

SP - 22296

EP - 22305

BT - 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

ER -
