Details
| Original language | English |
| --- | --- |
| Pages (from-to) | 499-516 |
| Number of pages | 18 |
| Journal | PFG - Journal of Photogrammetry, Remote Sensing and Geoinformation Science |
| Volume | 92 |
| Issue number | 5 |
| Early online date | 16 Sept 2024 |
| Publication status | Published - Oct 2024 |
Abstract
Estimating the pose and shape of vehicles from aerial images is an important, yet challenging task. While there are many existing approaches that use stereo images from street-level perspectives to reconstruct objects in 3D, the majority of aerial configurations used for purposes like traffic surveillance are limited to monocular images. Addressing this challenge, a Convolutional Neural Network-based method is presented in this paper, which jointly performs detection, pose, type and 3D shape estimation for vehicles observed in monocular UAV imagery. For this purpose, a robust 3D object model is used following the concept of an Active Shape Model. In addition, different variants of loss functions for learning 3D shape estimation are presented, focusing on the height component, which is particularly challenging to estimate from monocular near-nadir images. We also introduce a UAV-based dataset to evaluate our model in addition to an augmented version of the publicly available Hessigheim benchmark dataset. Our method yields promising results in pose and shape estimation: utilising images with a ground sampling distance (GSD) of 3 cm, it achieves median errors of up to 4 cm in position and 3° in orientation. Additionally, it achieves root mean square (RMS) errors of cm in planimetry and cm in height for keypoints defining the car shape.
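The abstract describes two core ideas: a 3D vehicle representation following the Active Shape Model (ASM) concept, and loss-function variants that emphasise the height component of the estimated keypoints. The sketch below is purely illustrative and not the authors' implementation; the class and parameter names (`VehicleASM`, `mean_shape`, `modes`, `height_weight`) are assumptions. It shows how an ASM-style shape can be reconstructed as a mean shape plus weighted deformation modes, and how a keypoint loss could up-weight the height (z) residuals that are hardest to estimate from near-nadir monocular images.

```python
# Illustrative sketch (not the paper's code): an Active-Shape-Model-style
# vehicle representation and a keypoint loss that up-weights the height axis.
import numpy as np

class VehicleASM:
    """3D Active Shape Model: keypoints = mean shape + weighted deformation modes."""
    def __init__(self, mean_shape, modes):
        # mean_shape: (K, 3) mean 3D keypoint positions in the model frame
        # modes:      (M, K, 3) principal deformation modes, e.g. from PCA over CAD models
        self.mean_shape = mean_shape
        self.modes = modes

    def reconstruct(self, coeffs):
        """Return (K, 3) keypoints for a shape-coefficient vector of length M."""
        return self.mean_shape + np.tensordot(coeffs, self.modes, axes=1)

def keypoint_loss(pred, target, height_weight=2.0):
    """Squared keypoint loss with the z (height) residuals weighted separately,
    mimicking a loss variant that focuses on the height component."""
    diff = pred - target                    # (K, 3) residuals
    planar = np.sum(diff[:, :2] ** 2)       # x/y (planimetric) residuals
    height = np.sum(diff[:, 2] ** 2)        # z (height) residuals
    return planar + height_weight * height

# Minimal usage example with random placeholder data.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    K, M = 12, 4                            # number of keypoints and deformation modes
    asm = VehicleASM(rng.normal(size=(K, 3)), rng.normal(size=(M, K, 3)))
    shape = asm.reconstruct(rng.normal(size=M))
    noisy = shape + 0.01 * rng.normal(size=(K, 3))
    print("loss:", keypoint_loss(shape, noisy))
```

In such a setup, the network would regress the shape coefficients (and pose) rather than raw 3D coordinates, which keeps the reconstructed shape within the space spanned by plausible vehicle geometries.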
Keywords
- Autonomous driving
- Object detection
- Object reconstruction
- Pose estimation
- Shape estimation
ASJC Scopus subject areas
- Social Sciences(all)
- Geography, Planning and Development
- Physics and Astronomy(all)
- Instrumentation
- Earth and Planetary Sciences(all)
- Earth and Planetary Sciences (miscellaneous)
Cite this
El Amrani Abouelassad, S., Mehltretter, M., & Rottensteiner, F. (2024). Monocular Pose and Shape Reconstruction of Vehicles in UAV imagery using a Multi-task CNN. In: PFG - Journal of Photogrammetry, Remote Sensing and Geoinformation Science, Vol. 92, No. 5, 10.2024, p. 499-516.
Research output: Contribution to journal › Article › Research › peer review
TY - JOUR
T1 - Monocular Pose and Shape Reconstruction of Vehicles in UAV imagery using a Multi-task CNN
AU - El Amrani Abouelassad, S.
AU - Mehltretter, M.
AU - Rottensteiner, F.
N1 - Publisher Copyright: © The Author(s) 2024.
PY - 2024/10
Y1 - 2024/10
N2 - Estimating the pose and shape of vehicles from aerial images is an important, yet challenging task. While there are many existing approaches that use stereo images from street-level perspectives to reconstruct objects in 3D, the majority of aerial configurations used for purposes like traffic surveillance are limited to monocular images. Addressing this challenge, a Convolutional Neural Network-based method is presented in this paper, which jointly performs detection, pose, type and 3D shape estimation for vehicles observed in monocular UAV imagery. For this purpose, a robust 3D object model is used following the concept of an Active Shape Model. In addition, different variants of loss functions for learning 3D shape estimation are presented, focusing on the height component, which is particularly challenging to estimate from monocular near-nadir images. We also introduce a UAV-based dataset to evaluate our model in addition to an augmented version of the publicly available Hessigheim benchmark dataset. Our method yields promising results in pose and shape estimation: utilising images with a ground sampling distance (GSD) of 3 cm, it achieves median errors of up to 4 cm in position and 3° in orientation. Additionally, it achieves root mean square (RMS) errors of cm in planimetry and cm in height for keypoints defining the car shape.
AB - Estimating the pose and shape of vehicles from aerial images is an important, yet challenging task. While there are many existing approaches that use stereo images from street-level perspectives to reconstruct objects in 3D, the majority of aerial configurations used for purposes like traffic surveillance are limited to monocular images. Addressing this challenge, a Convolutional Neural Network-based method is presented in this paper, which jointly performs detection, pose, type and 3D shape estimation for vehicles observed in monocular UAV imagery. For this purpose, a robust 3D object model is used following the concept of an Active Shape Model. In addition, different variants of loss functions for learning 3D shape estimation are presented, focusing on the height component, which is particularly challenging to estimate from monocular near-nadir images. We also introduce a UAV-based dataset to evaluate our model in addition to an augmented version of the publicly available Hessigheim benchmark dataset. Our method yields promising results in pose and shape estimation: utilising images with a ground sampling distance (GSD) of 3 cm, it achieves median errors of up to 4 cm in position and 3° in orientation. Additionally, it achieves root mean square (RMS) errors of cm in planimetry and cm in height for keypoints defining the car shape.
KW - Autonomous driving
KW - Object detection
KW - Object reconstruction
KW - Pose estimation
KW - Shape estimation
UR - http://www.scopus.com/inward/record.url?scp=85204012518&partnerID=8YFLogxK
U2 - 10.1007/s41064-024-00311-0
DO - 10.1007/s41064-024-00311-0
M3 - Article
AN - SCOPUS:85204012518
VL - 92
SP - 499
EP - 516
JO - PFG - Journal of Photogrammetry, Remote Sensing and Geoinformation Science
JF - PFG - Journal of Photogrammetry, Remote Sensing and Geoinformation Science
SN - 2512-2789
IS - 5
ER -