
StreamLTS: Query-Based Temporal-Spatial LiDAR Fusion for Cooperative Object Detection

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review

Authors

Yunshuang Yuan, Monika Sester

Details

Original language: English
Title of host publication: Computer Vision – ECCV 2024 Workshops, Proceedings
Editors: Alessio Del Bue, Cristian Canton, Jordi Pont-Tuset, Tatiana Tommasi
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 34-51
Number of pages: 18
ISBN (print): 9783031918124
Publication status: Published - 2025
Event: 18th European Conference on Computer Vision, ECCV 2024 - Milan, Italy
Duration: 29 Sept 2024 - 4 Oct 2024

Publication series

Name: Lecture Notes in Computer Science
Volume: 15630 LNCS
ISSN (print): 0302-9743
ISSN (electronic): 1611-3349

Abstract

Cooperative perception via communication among intelligent traffic agents has great potential to improve the safety of autonomous driving. However, limited communication bandwidth, localization errors and asynchronous capture times of sensor data all introduce difficulties to the data fusion of different agents. To some extent, previous works have attempted to reduce the shared data size and to mitigate the spatial feature misalignment caused by localization errors and communication delay. However, none of them has considered the asynchronous sensor ticking times, which can lead to dynamic-object misplacement of more than one meter during data fusion. In this work, we propose Time-Aligned COoperative Object Detection (TA-COOD), for which we adapt the widely used datasets OPV2V and DairV2X to account for asynchronous LiDAR sensor ticking times, and we build an efficient, fully sparse framework that models the temporal information of individual objects with query-based techniques. The experimental results confirm the superior efficiency of our fully sparse framework compared to state-of-the-art dense models. More importantly, they show that the point-wise observation timestamps of the dynamic objects are crucial for accurately modeling the object temporal context and the predictability of their time-related locations. The official code is available at https://github.com/YuanYunshuang/CoSense3D.
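As a back-of-the-envelope illustration of the misplacement figure the abstract mentions (a hypothetical sketch, not taken from the paper; the speed and offset values are assumptions chosen for illustration):

```python
# Why asynchronous LiDAR "ticking" times matter for cooperative fusion:
# if two agents' scans capture a moving object at slightly different
# times, naively fusing the two observations misplaces the object by
# roughly speed * time offset.

speed_mps = 15.0       # assumed dynamic-object speed (~54 km/h)
tick_offset_s = 0.08   # assumed capture-time offset between two LiDARs

misplacement_m = speed_mps * tick_offset_s
print(f"Misplacement without time alignment: {misplacement_m:.2f} m")
```

With these assumed values the offset already exceeds one meter, which matches the magnitude of error the abstract attributes to ignoring sensor ticking times.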

Keywords

    Cooperative Perception, Data Fusion, Point Cloud


Cite this

StreamLTS: Query-Based Temporal-Spatial LiDAR Fusion for Cooperative Object Detection. / Yuan, Yunshuang; Sester, Monika.
Computer Vision – ECCV 2024 Workshops, Proceedings. ed. / Alessio Del Bue; Cristian Canton; Jordi Pont-Tuset; Tatiana Tommasi. Springer Science and Business Media Deutschland GmbH, 2025. p. 34-51 (Lecture Notes in Computer Science; Vol. 15630 LNCS).


Yuan, Y & Sester, M 2025, StreamLTS: Query-Based Temporal-Spatial LiDAR Fusion for Cooperative Object Detection. in A Del Bue, C Canton, J Pont-Tuset & T Tommasi (eds), Computer Vision – ECCV 2024 Workshops, Proceedings. Lecture Notes in Computer Science, vol. 15630 LNCS, Springer Science and Business Media Deutschland GmbH, pp. 34-51, 18th European Conference on Computer Vision, ECCV 2024, Milan, Italy, 29 Sept 2024. https://doi.org/10.1007/978-3-031-91813-1_3, https://doi.org/10.48550/arXiv.2407.03825
Yuan, Y., & Sester, M. (2025). StreamLTS: Query-Based Temporal-Spatial LiDAR Fusion for Cooperative Object Detection. In A. Del Bue, C. Canton, J. Pont-Tuset, & T. Tommasi (Eds.), Computer Vision – ECCV 2024 Workshops, Proceedings (pp. 34-51). (Lecture Notes in Computer Science; Vol. 15630 LNCS). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-91813-1_3, https://doi.org/10.48550/arXiv.2407.03825
Yuan Y, Sester M. StreamLTS: Query-Based Temporal-Spatial LiDAR Fusion for Cooperative Object Detection. In Del Bue A, Canton C, Pont-Tuset J, Tommasi T, editors, Computer Vision – ECCV 2024 Workshops, Proceedings. Springer Science and Business Media Deutschland GmbH. 2025. p. 34-51. (Lecture Notes in Computer Science). doi: 10.1007/978-3-031-91813-1_3, 10.48550/arXiv.2407.03825
Yuan, Yunshuang ; Sester, Monika. / StreamLTS : Query-Based Temporal-Spatial LiDAR Fusion for Cooperative Object Detection. Computer Vision – ECCV 2024 Workshops, Proceedings. editor / Alessio Del Bue ; Cristian Canton ; Jordi Pont-Tuset ; Tatiana Tommasi. Springer Science and Business Media Deutschland GmbH, 2025. pp. 34-51 (Lecture Notes in Computer Science).
BibTeX
@inproceedings{28d50601c65948998a9fbd9cc04bba22,
title = "StreamLTS: Query-Based Temporal-Spatial LiDAR Fusion for Cooperative Object Detection",
abstract = "Cooperative perception via communication among intelligent traffic agents has great potential to improve the safety of autonomous driving. However, limited communication bandwidth, localization errors and asynchronous capture times of sensor data all introduce difficulties to the data fusion of different agents. To some extent, previous works have attempted to reduce the shared data size and to mitigate the spatial feature misalignment caused by localization errors and communication delay. However, none of them has considered the asynchronous sensor ticking times, which can lead to dynamic-object misplacement of more than one meter during data fusion. In this work, we propose Time-Aligned COoperative Object Detection (TA-COOD), for which we adapt the widely used datasets OPV2V and DairV2X to account for asynchronous LiDAR sensor ticking times, and we build an efficient, fully sparse framework that models the temporal information of individual objects with query-based techniques. The experimental results confirm the superior efficiency of our fully sparse framework compared to state-of-the-art dense models. More importantly, they show that the point-wise observation timestamps of the dynamic objects are crucial for accurately modeling the object temporal context and the predictability of their time-related locations. The official code is available at https://github.com/YuanYunshuang/CoSense3D.",
keywords = "Cooperative Perception, Data Fusion, Point Cloud",
author = "Yunshuang Yuan and Monika Sester",
note = "Publisher Copyright: {\textcopyright} The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.; 18th European Conference on Computer Vision, ECCV 2024, ECCV 2024 ; Conference date: 29-09-2024 Through 04-10-2024",
year = "2025",
doi = "10.1007/978-3-031-91813-1_3",
language = "English",
isbn = "9783031918124",
series = "Lecture Notes in Computer Science",
publisher = "Springer Science and Business Media Deutschland GmbH",
pages = "34--51",
editor = "{Del Bue}, Alessio and Cristian Canton and Jordi Pont-Tuset and Tatiana Tommasi",
booktitle = "Computer Vision – ECCV 2024 Workshops, Proceedings",
address = "Germany",

}

RIS

TY - GEN

T1 - StreamLTS

T2 - 18th European Conference on Computer Vision, ECCV 2024

AU - Yuan, Yunshuang

AU - Sester, Monika

N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.

PY - 2025

Y1 - 2025

N2 - Cooperative perception via communication among intelligent traffic agents has great potential to improve the safety of autonomous driving. However, limited communication bandwidth, localization errors and asynchronous capture times of sensor data all introduce difficulties to the data fusion of different agents. To some extent, previous works have attempted to reduce the shared data size and to mitigate the spatial feature misalignment caused by localization errors and communication delay. However, none of them has considered the asynchronous sensor ticking times, which can lead to dynamic-object misplacement of more than one meter during data fusion. In this work, we propose Time-Aligned COoperative Object Detection (TA-COOD), for which we adapt the widely used datasets OPV2V and DairV2X to account for asynchronous LiDAR sensor ticking times, and we build an efficient, fully sparse framework that models the temporal information of individual objects with query-based techniques. The experimental results confirm the superior efficiency of our fully sparse framework compared to state-of-the-art dense models. More importantly, they show that the point-wise observation timestamps of the dynamic objects are crucial for accurately modeling the object temporal context and the predictability of their time-related locations. The official code is available at https://github.com/YuanYunshuang/CoSense3D.

AB - Cooperative perception via communication among intelligent traffic agents has great potential to improve the safety of autonomous driving. However, limited communication bandwidth, localization errors and asynchronous capture times of sensor data all introduce difficulties to the data fusion of different agents. To some extent, previous works have attempted to reduce the shared data size and to mitigate the spatial feature misalignment caused by localization errors and communication delay. However, none of them has considered the asynchronous sensor ticking times, which can lead to dynamic-object misplacement of more than one meter during data fusion. In this work, we propose Time-Aligned COoperative Object Detection (TA-COOD), for which we adapt the widely used datasets OPV2V and DairV2X to account for asynchronous LiDAR sensor ticking times, and we build an efficient, fully sparse framework that models the temporal information of individual objects with query-based techniques. The experimental results confirm the superior efficiency of our fully sparse framework compared to state-of-the-art dense models. More importantly, they show that the point-wise observation timestamps of the dynamic objects are crucial for accurately modeling the object temporal context and the predictability of their time-related locations. The official code is available at https://github.com/YuanYunshuang/CoSense3D.

KW - Cooperative Perception

KW - Data Fusion

KW - Point Cloud

UR - http://www.scopus.com/inward/record.url?scp=105006881323&partnerID=8YFLogxK

U2 - 10.1007/978-3-031-91813-1_3

DO - 10.1007/978-3-031-91813-1_3

M3 - Conference contribution

AN - SCOPUS:105006881323

SN - 9783031918124

T3 - Lecture Notes in Computer Science

SP - 34

EP - 51

BT - Computer Vision – ECCV 2024 Workshops, Proceedings

A2 - Del Bue, Alessio

A2 - Canton, Cristian

A2 - Pont-Tuset, Jordi

A2 - Tommasi, Tatiana

PB - Springer Science and Business Media Deutschland GmbH

Y2 - 29 September 2024 through 4 October 2024

ER -
