Details
Originalsprache | Englisch |
---|---|
Seiten (von - bis) | 6306-6325 |
Seitenumfang | 20 |
Fachzeitschrift | IEEE Transactions on Pattern Analysis and Machine Intelligence |
Jahrgang | 46 |
Ausgabenummer | 9 |
Frühes Online-Datum | 19 März 2024 |
Publikationsstatus | Veröffentlicht - Sept. 2024 |
Abstract
Humans perceive and construct the world as an arrangement of simple parametric models. In particular, we can often describe man-made environments using volumetric primitives such as cuboids or cylinders. Inferring these primitives is important for attaining high-level, abstract scene descriptions. Previous approaches for primitive-based abstraction estimate shape parameters directly and are only able to reproduce simple objects. In contrast, we propose a robust estimator for primitive fitting, which meaningfully abstracts complex real-world environments using cuboids. A RANSAC estimator guided by a neural network fits these primitives to a depth map. We condition the network on previously detected parts of the scene, parsing it one-by-one. To obtain cuboids from single RGB images, we additionally optimise a depth estimation CNN end-to-end. Naively minimising point-to-primitive distances leads to large or spurious cuboids occluding parts of the scene. We thus propose an improved occlusion-aware distance metric correctly handling opaque scenes. Furthermore, we present a neural network based cuboid solver which provides more parsimonious scene abstractions while also reducing inference time. The proposed algorithm does not require labour-intensive labels, such as cuboid annotations, for training. Results on the NYU Depth v2 dataset demonstrate that the proposed algorithm successfully abstracts cluttered real-world 3D scene layouts.
ASJC Scopus Sachgebiete
- Informatik (insg.)
- Software
- Informatik (insg.)
- Maschinelles Sehen und Mustererkennung
- Informatik (insg.)
- Theoretische Informatik und Mathematik
- Informatik (insg.)
- Artificial intelligence
- Mathematik (insg.)
- Angewandte Mathematik
Zitieren
- Standard
- Harvard
- Apa
- Vancouver
- BibTex
- RIS
in: IEEE Transactions on Pattern Analysis and Machine Intelligence, Jahrgang 46, Nr. 9, 09.2024, S. 6306-6325.
Publikation: Beitrag in Fachzeitschrift › Artikel › Forschung › Peer-Review
}
TY - JOUR
T1 - Robust Shape Fitting for 3D Scene Abstraction
AU - Kluger, Florian
AU - Brachmann, Eric
AU - Yang, Michael Ying
AU - Rosenhahn, Bodo
N1 - Publisher Copyright: © 1979-2012 IEEE.
PY - 2024/9
Y1 - 2024/9
N2 - Humans perceive and construct the world as an arrangement of simple parametric models. In particular, we can often describe man-made environments using volumetric primitives such as cuboids or cylinders. Inferring these primitives is important for attaining high-level, abstract scene descriptions. Previous approaches for primitive-based abstraction estimate shape parameters directly and are only able to reproduce simple objects. In contrast, we propose a robust estimator for primitive fitting, which meaningfully abstracts complex real-world environments using cuboids. A RANSAC estimator guided by a neural network fits these primitives to a depth map. We condition the network on previously detected parts of the scene, parsing it one-by-one. To obtain cuboids from single RGB images, we additionally optimise a depth estimation CNN end-to-end. Naively minimising point-to-primitive distances leads to large or spurious cuboids occluding parts of the scene. We thus propose an improved occlusion-aware distance metric correctly handling opaque scenes. Furthermore, we present a neural network based cuboid solver which provides more parsimonious scene abstractions while also reducing inference time. The proposed algorithm does not require labour-intensive labels, such as cuboid annotations, for training. Results on the NYU Depth v2 dataset demonstrate that the proposed algorithm successfully abstracts cluttered real-world 3D scene layouts.
AB - Humans perceive and construct the world as an arrangement of simple parametric models. In particular, we can often describe man-made environments using volumetric primitives such as cuboids or cylinders. Inferring these primitives is important for attaining high-level, abstract scene descriptions. Previous approaches for primitive-based abstraction estimate shape parameters directly and are only able to reproduce simple objects. In contrast, we propose a robust estimator for primitive fitting, which meaningfully abstracts complex real-world environments using cuboids. A RANSAC estimator guided by a neural network fits these primitives to a depth map. We condition the network on previously detected parts of the scene, parsing it one-by-one. To obtain cuboids from single RGB images, we additionally optimise a depth estimation CNN end-to-end. Naively minimising point-to-primitive distances leads to large or spurious cuboids occluding parts of the scene. We thus propose an improved occlusion-aware distance metric correctly handling opaque scenes. Furthermore, we present a neural network based cuboid solver which provides more parsimonious scene abstractions while also reducing inference time. The proposed algorithm does not require labour-intensive labels, such as cuboid annotations, for training. Results on the NYU Depth v2 dataset demonstrate that the proposed algorithm successfully abstracts cluttered real-world 3D scene layouts.
KW - cuboid fitting
KW - Estimation
KW - Image reconstruction
KW - minimal solver
KW - multi-model fitting
KW - Scene abstraction
KW - Shape
KW - shape decomposition
KW - Solid modeling
KW - Surface reconstruction
KW - Three-dimensional displays
KW - Training
UR - http://www.scopus.com/inward/record.url?scp=85188527432&partnerID=8YFLogxK
U2 - 10.48550/arXiv.2403.10452
DO - 10.48550/arXiv.2403.10452
M3 - Article
AN - SCOPUS:85188527432
VL - 46
SP - 6306
EP - 6325
JO - IEEE Transactions on Pattern Analysis and Machine Intelligence
JF - IEEE Transactions on Pattern Analysis and Machine Intelligence
SN - 0162-8828
IS - 9
ER -