StreamLTS: Query-Based Temporal-Spatial LiDAR Fusion for Cooperative Object Detection

Publication: Contribution to book/report/anthology/conference proceedings › Conference paper › Research › Peer-reviewed

Authors

Yunshuang Yuan, Monika Sester

Details

Original language: English
Title of host publication: Computer Vision – ECCV 2024 Workshops, Proceedings
Editors: Alessio Del Bue, Cristian Canton, Jordi Pont-Tuset, Tatiana Tommasi
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 34-51
Number of pages: 18
ISBN (Print): 9783031918124
Publication status: Published - 2025
Event: 18th European Conference on Computer Vision, ECCV 2024 - Milan, Italy
Duration: 29 Sep 2024 – 4 Oct 2024

Publication series

Name: Lecture Notes in Computer Science
Volume: 15630 LNCS
ISSN (Print): 0302-9743
ISSN (electronic): 1611-3349

Abstract

Cooperative perception via communication among intelligent traffic agents has great potential to improve the safety of autonomous driving. However, limited communication bandwidth, localization errors and asynchronous capture times of sensor data all introduce difficulties to the data fusion of different agents. To some extent, previous works have attempted to reduce the shared data size and to mitigate the spatial feature misalignment caused by localization errors and communication delay. However, none of them have considered the asynchronous sensor ticking times, which can lead to dynamic object misplacement of more than one meter during data fusion. In this work, we propose Time-Aligned COoperative Object Detection (TA-COOD), for which we adapt the widely used datasets OPV2V and DairV2X to account for asynchronous LiDAR sensor ticking times, and we build an efficient fully sparse framework that models the temporal information of individual objects with query-based techniques. The experiment results confirm the superior efficiency of our fully sparse framework compared to the state-of-the-art dense models. More importantly, they show that the point-wise observation timestamps of the dynamic objects are crucial for accurately modeling the object temporal context and the predictability of their time-related locations. The official code is available at https://github.com/YuanYunshuang/CoSense3D.
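The abstract notes that asynchronous sensor ticking times can misplace a dynamic object by more than one meter during fusion. As an illustrative sketch only (the function and numbers below are hypothetical and not taken from the paper), the effect follows from simple kinematics:

```python
def fusion_misplacement(speed_mps: float, time_offset_s: float) -> float:
    """Spatial offset of a moving object between two point clouds
    whose capture (ticking) times differ by `time_offset_s` seconds."""
    return speed_mps * time_offset_s

# A vehicle at 20 m/s (72 km/h) observed by two LiDARs ticking
# 60 ms apart appears roughly 1.2 m apart in the fused frame.
offset = fusion_misplacement(20.0, 0.06)
assert abs(offset - 1.2) < 1e-9
```

This is the scale of error that motivates aligning observations to a common timestamp, e.g. via the point-wise observation timestamps the abstract describes, rather than fusing asynchronously captured features directly.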


Cite

StreamLTS: Query-Based Temporal-Spatial LiDAR Fusion for Cooperative Object Detection. / Yuan, Yunshuang; Sester, Monika.
Computer Vision – ECCV 2024 Workshops, Proceedings. Ed. / Alessio Del Bue; Cristian Canton; Jordi Pont-Tuset; Tatiana Tommasi. Springer Science and Business Media Deutschland GmbH, 2025. pp. 34-51 (Lecture Notes in Computer Science; Vol. 15630 LNCS).


Yuan, Y & Sester, M 2025, StreamLTS: Query-Based Temporal-Spatial LiDAR Fusion for Cooperative Object Detection. in A Del Bue, C Canton, J Pont-Tuset & T Tommasi (eds), Computer Vision – ECCV 2024 Workshops, Proceedings. Lecture Notes in Computer Science, vol. 15630 LNCS, Springer Science and Business Media Deutschland GmbH, pp. 34-51, 18th European Conference on Computer Vision, ECCV 2024, Milan, Italy, 29 September 2024. https://doi.org/10.1007/978-3-031-91813-1_3, https://doi.org/10.48550/arXiv.2407.03825
Yuan, Y., & Sester, M. (2025). StreamLTS: Query-Based Temporal-Spatial LiDAR Fusion for Cooperative Object Detection. In A. Del Bue, C. Canton, J. Pont-Tuset, & T. Tommasi (Eds.), Computer Vision – ECCV 2024 Workshops, Proceedings (pp. 34-51). (Lecture Notes in Computer Science; Vol. 15630 LNCS). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-91813-1_3, https://doi.org/10.48550/arXiv.2407.03825
Yuan Y, Sester M. StreamLTS: Query-Based Temporal-Spatial LiDAR Fusion for Cooperative Object Detection. In Del Bue A, Canton C, Pont-Tuset J, Tommasi T, editors, Computer Vision – ECCV 2024 Workshops, Proceedings. Springer Science and Business Media Deutschland GmbH. 2025. p. 34-51. (Lecture Notes in Computer Science). doi: 10.1007/978-3-031-91813-1_3, 10.48550/arXiv.2407.03825
Yuan, Yunshuang ; Sester, Monika. / StreamLTS : Query-Based Temporal-Spatial LiDAR Fusion for Cooperative Object Detection. Computer Vision – ECCV 2024 Workshops, Proceedings. Ed. / Alessio Del Bue ; Cristian Canton ; Jordi Pont-Tuset ; Tatiana Tommasi. Springer Science and Business Media Deutschland GmbH, 2025. pp. 34-51 (Lecture Notes in Computer Science).
BibTeX
@inproceedings{28d50601c65948998a9fbd9cc04bba22,
title = "StreamLTS: Query-Based Temporal-Spatial LiDAR Fusion for Cooperative Object Detection",
abstract = "Cooperative perception via communication among intelligent traffic agents has great potential to improve the safety of autonomous driving. However, limited communication bandwidth, localization errors and asynchronous capture times of sensor data all introduce difficulties to the data fusion of different agents. To some extent, previous works have attempted to reduce the shared data size and to mitigate the spatial feature misalignment caused by localization errors and communication delay. However, none of them have considered the asynchronous sensor ticking times, which can lead to dynamic object misplacement of more than one meter during data fusion. In this work, we propose Time-Aligned COoperative Object Detection (TA-COOD), for which we adapt the widely used datasets OPV2V and DairV2X to account for asynchronous LiDAR sensor ticking times, and we build an efficient fully sparse framework that models the temporal information of individual objects with query-based techniques. The experiment results confirm the superior efficiency of our fully sparse framework compared to the state-of-the-art dense models. More importantly, they show that the point-wise observation timestamps of the dynamic objects are crucial for accurately modeling the object temporal context and the predictability of their time-related locations. The official code is available at https://github.com/YuanYunshuang/CoSense3D.",
keywords = "Cooperative Perception, Data Fusion, Point Cloud",
author = "Yunshuang Yuan and Monika Sester",
note = "Publisher Copyright: {\textcopyright} The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.; 18th European Conference on Computer Vision, ECCV 2024, ECCV 2024 ; Conference date: 29-09-2024 Through 04-10-2024",
year = "2025",
doi = "10.1007/978-3-031-91813-1_3",
language = "English",
isbn = "9783031918124",
series = "Lecture Notes in Computer Science",
publisher = "Springer Science and Business Media Deutschland GmbH",
pages = "34--51",
editor = "{Del Bue}, Alessio and Cristian Canton and Jordi Pont-Tuset and Tatiana Tommasi",
booktitle = "Computer Vision – ECCV 2024 Workshops, Proceedings",
address = "Germany",

}

RIS

TY - GEN

T1 - StreamLTS: Query-Based Temporal-Spatial LiDAR Fusion for Cooperative Object Detection

T2 - 18th European Conference on Computer Vision, ECCV 2024

AU - Yuan, Yunshuang

AU - Sester, Monika

N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.

PY - 2025

Y1 - 2025

N2 - Cooperative perception via communication among intelligent traffic agents has great potential to improve the safety of autonomous driving. However, limited communication bandwidth, localization errors and asynchronous capture times of sensor data all introduce difficulties to the data fusion of different agents. To some extent, previous works have attempted to reduce the shared data size and to mitigate the spatial feature misalignment caused by localization errors and communication delay. However, none of them have considered the asynchronous sensor ticking times, which can lead to dynamic object misplacement of more than one meter during data fusion. In this work, we propose Time-Aligned COoperative Object Detection (TA-COOD), for which we adapt the widely used datasets OPV2V and DairV2X to account for asynchronous LiDAR sensor ticking times, and we build an efficient fully sparse framework that models the temporal information of individual objects with query-based techniques. The experiment results confirm the superior efficiency of our fully sparse framework compared to the state-of-the-art dense models. More importantly, they show that the point-wise observation timestamps of the dynamic objects are crucial for accurately modeling the object temporal context and the predictability of their time-related locations. The official code is available at https://github.com/YuanYunshuang/CoSense3D.

AB - Cooperative perception via communication among intelligent traffic agents has great potential to improve the safety of autonomous driving. However, limited communication bandwidth, localization errors and asynchronous capture times of sensor data all introduce difficulties to the data fusion of different agents. To some extent, previous works have attempted to reduce the shared data size and to mitigate the spatial feature misalignment caused by localization errors and communication delay. However, none of them have considered the asynchronous sensor ticking times, which can lead to dynamic object misplacement of more than one meter during data fusion. In this work, we propose Time-Aligned COoperative Object Detection (TA-COOD), for which we adapt the widely used datasets OPV2V and DairV2X to account for asynchronous LiDAR sensor ticking times, and we build an efficient fully sparse framework that models the temporal information of individual objects with query-based techniques. The experiment results confirm the superior efficiency of our fully sparse framework compared to the state-of-the-art dense models. More importantly, they show that the point-wise observation timestamps of the dynamic objects are crucial for accurately modeling the object temporal context and the predictability of their time-related locations. The official code is available at https://github.com/YuanYunshuang/CoSense3D.

KW - Cooperative Perception

KW - Data Fusion

KW - Point Cloud

UR - http://www.scopus.com/inward/record.url?scp=105006881323&partnerID=8YFLogxK

U2 - 10.1007/978-3-031-91813-1_3

DO - 10.1007/978-3-031-91813-1_3

M3 - Conference contribution

AN - SCOPUS:105006881323

SN - 9783031918124

T3 - Lecture Notes in Computer Science

SP - 34

EP - 51

BT - Computer Vision – ECCV 2024 Workshops, Proceedings

A2 - Del Bue, Alessio

A2 - Canton, Cristian

A2 - Pont-Tuset, Jordi

A2 - Tommasi, Tatiana

PB - Springer Science and Business Media Deutschland GmbH

Y2 - 29 September 2024 through 4 October 2024

ER -
