
StreamLTS: Query-Based Temporal-Spatial LiDAR Fusion for Cooperative Object Detection

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review

Authors

Yunshuang Yuan, Monika Sester

Details

Original language: English
Title of host publication: Computer Vision – ECCV 2024 Workshops, Proceedings
Editors: Alessio Del Bue, Cristian Canton, Jordi Pont-Tuset, Tatiana Tommasi
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 34-51
Number of pages: 18
ISBN (print): 9783031918124
Publication status: Published - 2025
Event: 18th European Conference on Computer Vision, ECCV 2024 - Milan, Italy
Duration: 29 Sept 2024 - 4 Oct 2024

Publication series

Name: Lecture Notes in Computer Science
Volume: 15630 LNCS
ISSN (print): 0302-9743
ISSN (electronic): 1611-3349

Abstract

Cooperative perception via communication among intelligent traffic agents has great potential to improve the safety of autonomous driving. However, limited communication bandwidth, localization errors and asynchronous capture times of sensor data all introduce difficulties to the data fusion of different agents. To some extent, previous works have attempted to reduce the shared data size and to mitigate the spatial feature misalignment caused by localization errors and communication delay. However, none of them has considered the asynchronous sensor ticking times, which can lead to dynamic-object misplacement of more than one meter during data fusion. In this work, we propose Time-Aligned COoperative Object Detection (TA-COOD), for which we adapt the widely used datasets OPV2V and DairV2X to account for asynchronous LiDAR sensor ticking times, and we build an efficient, fully sparse framework that models the temporal information of individual objects with query-based techniques. The experimental results confirm the superior efficiency of our fully sparse framework compared to state-of-the-art dense models. More importantly, they show that the point-wise observation timestamps of the dynamic objects are crucial for accurately modeling the object temporal context and the predictability of their time-related locations. The official code is available at https://github.com/YuanYunshuang/CoSense3D.
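As a back-of-the-envelope illustration of the misplacement figure the abstract mentions (a hypothetical sketch, not taken from the paper; the speed and offset values are assumptions chosen for illustration):

```python
# Why asynchronous LiDAR "ticking" times matter for cooperative fusion:
# if two agents' scans capture a moving object at slightly different
# times, naively fusing the two observations misplaces the object by
# roughly speed * time offset.

speed_mps = 15.0       # assumed dynamic-object speed (~54 km/h)
tick_offset_s = 0.08   # assumed capture-time offset between two LiDARs

misplacement_m = speed_mps * tick_offset_s
print(f"Misplacement without time alignment: {misplacement_m:.2f} m")
```

With these assumed values the offset already exceeds one meter, which matches the magnitude of error the abstract attributes to ignoring sensor ticking times.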

Keywords

    Cooperative Perception, Data Fusion, Point Cloud


Cite this

StreamLTS: Query-Based Temporal-Spatial LiDAR Fusion for Cooperative Object Detection. / Yuan, Yunshuang; Sester, Monika.
Computer Vision – ECCV 2024 Workshops, Proceedings. ed. / Alessio Del Bue; Cristian Canton; Jordi Pont-Tuset; Tatiana Tommasi. Springer Science and Business Media Deutschland GmbH, 2025. p. 34-51 (Lecture Notes in Computer Science; Vol. 15630 LNCS).


Yuan, Y & Sester, M 2025, StreamLTS: Query-Based Temporal-Spatial LiDAR Fusion for Cooperative Object Detection. in A Del Bue, C Canton, J Pont-Tuset & T Tommasi (eds), Computer Vision – ECCV 2024 Workshops, Proceedings. Lecture Notes in Computer Science, vol. 15630 LNCS, Springer Science and Business Media Deutschland GmbH, pp. 34-51, 18th European Conference on Computer Vision, ECCV 2024, Milan, Italy, 29 Sept 2024. https://doi.org/10.1007/978-3-031-91813-1_3, https://doi.org/10.48550/arXiv.2407.03825
Yuan, Y., & Sester, M. (2025). StreamLTS: Query-Based Temporal-Spatial LiDAR Fusion for Cooperative Object Detection. In A. Del Bue, C. Canton, J. Pont-Tuset, & T. Tommasi (Eds.), Computer Vision – ECCV 2024 Workshops, Proceedings (pp. 34-51). (Lecture Notes in Computer Science; Vol. 15630 LNCS). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-91813-1_3, https://doi.org/10.48550/arXiv.2407.03825
Yuan Y, Sester M. StreamLTS: Query-Based Temporal-Spatial LiDAR Fusion for Cooperative Object Detection. In Del Bue A, Canton C, Pont-Tuset J, Tommasi T, editors, Computer Vision – ECCV 2024 Workshops, Proceedings. Springer Science and Business Media Deutschland GmbH. 2025. p. 34-51. (Lecture Notes in Computer Science). doi: 10.1007/978-3-031-91813-1_3, 10.48550/arXiv.2407.03825
Yuan, Yunshuang ; Sester, Monika. / StreamLTS : Query-Based Temporal-Spatial LiDAR Fusion for Cooperative Object Detection. Computer Vision – ECCV 2024 Workshops, Proceedings. editor / Alessio Del Bue ; Cristian Canton ; Jordi Pont-Tuset ; Tatiana Tommasi. Springer Science and Business Media Deutschland GmbH, 2025. pp. 34-51 (Lecture Notes in Computer Science).
BibTeX
@inproceedings{28d50601c65948998a9fbd9cc04bba22,
title = "StreamLTS: Query-Based Temporal-Spatial LiDAR Fusion for Cooperative Object Detection",
abstract = "Cooperative perception via communication among intelligent traffic agents has great potential to improve the safety of autonomous driving. However, limited communication bandwidth, localization errors and asynchronous capture times of sensor data all introduce difficulties to the data fusion of different agents. To some extent, previous works have attempted to reduce the shared data size and to mitigate the spatial feature misalignment caused by localization errors and communication delay. However, none of them has considered the asynchronous sensor ticking times, which can lead to dynamic-object misplacement of more than one meter during data fusion. In this work, we propose Time-Aligned COoperative Object Detection (TA-COOD), for which we adapt the widely used datasets OPV2V and DairV2X to account for asynchronous LiDAR sensor ticking times, and we build an efficient, fully sparse framework that models the temporal information of individual objects with query-based techniques. The experimental results confirm the superior efficiency of our fully sparse framework compared to state-of-the-art dense models. More importantly, they show that the point-wise observation timestamps of the dynamic objects are crucial for accurately modeling the object temporal context and the predictability of their time-related locations. The official code is available at https://github.com/YuanYunshuang/CoSense3D.",
keywords = "Cooperative Perception, Data Fusion, Point Cloud",
author = "Yunshuang Yuan and Monika Sester",
note = "Publisher Copyright: {\textcopyright} The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.; 18th European Conference on Computer Vision, ECCV 2024, ECCV 2024 ; Conference date: 29-09-2024 Through 04-10-2024",
year = "2025",
doi = "10.1007/978-3-031-91813-1_3",
language = "English",
isbn = "9783031918124",
series = "Lecture Notes in Computer Science",
publisher = "Springer Science and Business Media Deutschland GmbH",
pages = "34--51",
editor = "{Del Bue}, Alessio and Cristian Canton and Jordi Pont-Tuset and Tatiana Tommasi",
booktitle = "Computer Vision – ECCV 2024 Workshops, Proceedings",
address = "Germany",

}

RIS

TY - GEN

T1 - StreamLTS

T2 - 18th European Conference on Computer Vision, ECCV 2024

AU - Yuan, Yunshuang

AU - Sester, Monika

N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.

PY - 2025

Y1 - 2025

N2 - Cooperative perception via communication among intelligent traffic agents has great potential to improve the safety of autonomous driving. However, limited communication bandwidth, localization errors and asynchronous capture times of sensor data all introduce difficulties to the data fusion of different agents. To some extent, previous works have attempted to reduce the shared data size and to mitigate the spatial feature misalignment caused by localization errors and communication delay. However, none of them has considered the asynchronous sensor ticking times, which can lead to dynamic-object misplacement of more than one meter during data fusion. In this work, we propose Time-Aligned COoperative Object Detection (TA-COOD), for which we adapt the widely used datasets OPV2V and DairV2X to account for asynchronous LiDAR sensor ticking times, and we build an efficient, fully sparse framework that models the temporal information of individual objects with query-based techniques. The experimental results confirm the superior efficiency of our fully sparse framework compared to state-of-the-art dense models. More importantly, they show that the point-wise observation timestamps of the dynamic objects are crucial for accurately modeling the object temporal context and the predictability of their time-related locations. The official code is available at https://github.com/YuanYunshuang/CoSense3D.

AB - Cooperative perception via communication among intelligent traffic agents has great potential to improve the safety of autonomous driving. However, limited communication bandwidth, localization errors and asynchronous capture times of sensor data all introduce difficulties to the data fusion of different agents. To some extent, previous works have attempted to reduce the shared data size and to mitigate the spatial feature misalignment caused by localization errors and communication delay. However, none of them has considered the asynchronous sensor ticking times, which can lead to dynamic-object misplacement of more than one meter during data fusion. In this work, we propose Time-Aligned COoperative Object Detection (TA-COOD), for which we adapt the widely used datasets OPV2V and DairV2X to account for asynchronous LiDAR sensor ticking times, and we build an efficient, fully sparse framework that models the temporal information of individual objects with query-based techniques. The experimental results confirm the superior efficiency of our fully sparse framework compared to state-of-the-art dense models. More importantly, they show that the point-wise observation timestamps of the dynamic objects are crucial for accurately modeling the object temporal context and the predictability of their time-related locations. The official code is available at https://github.com/YuanYunshuang/CoSense3D.

KW - Cooperative Perception

KW - Data Fusion

KW - Point Cloud

UR - http://www.scopus.com/inward/record.url?scp=105006881323&partnerID=8YFLogxK

U2 - 10.1007/978-3-031-91813-1_3

DO - 10.1007/978-3-031-91813-1_3

M3 - Conference contribution

AN - SCOPUS:105006881323

SN - 9783031918124

T3 - Lecture Notes in Computer Science

SP - 34

EP - 51

BT - Computer Vision – ECCV 2024 Workshops, Proceedings

A2 - Del Bue, Alessio

A2 - Canton, Cristian

A2 - Pont-Tuset, Jordi

A2 - Tommasi, Tatiana

PB - Springer Science and Business Media Deutschland GmbH

Y2 - 29 September 2024 through 4 October 2024

ER -
