Details
Original language | English |
---|---|
Title of host publication | Proceedings - 2025 IEEE Winter Conference on Applications of Computer Vision, WACV 2025 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 5852-5862 |
Number of pages | 11 |
ISBN (electronic) | 979-8-3315-1083-1 |
ISBN (print) | 979-8-3315-1084-8 |
Publication status | Published - 26 Feb 2025 |
Event | 2025 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2025 - Tucson, United States |
Duration | 28 Feb 2025 → 4 Mar 2025 |
Publication series
Name | IEEE Winter Conference on Applications of Computer Vision |
---|---|
ISSN (print) | 2472-6737 |
ISSN (electronic) | 2642-9381 |
Abstract
Monocular 3D human pose and shape estimation is an inherently ill-posed problem due to depth ambiguities, occlusions, and truncations. Recent probabilistic approaches learn a distribution over plausible 3D human meshes by maximizing the likelihood of the ground-truth pose given an image. We show that this objective function alone is not sufficient to best capture the full distributions. Instead, we propose to additionally supervise the learned distributions by minimizing the distance to distributions encoded in heatmaps of a 2D pose detector. Moreover, we reveal that current methods often generate incorrect hypotheses for invisible joints, which is not detected by the evaluation protocols. We demonstrate that person segmentation masks can be utilized during training to significantly decrease the number of invalid samples and introduce two metrics to evaluate this. Our normalizing flow-based approach predicts plausible 3D human mesh hypotheses that are consistent with the image evidence while maintaining high diversity for ambiguous body parts. Experiments on 3DPW and EMDB show that we outperform other state-of-the-art probabilistic methods. Code is available for research purposes at https://github.com/twehrbein/humr.
ASJC Scopus subject areas
- Computer Science(all)
  - Artificial Intelligence
  - Computer Science Applications
  - Computer Vision and Pattern Recognition
  - Human-Computer Interaction
- Mathematics(all)
  - Modelling and Simulation
- Medicine(all)
  - Radiology, Nuclear Medicine and Imaging
Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review
TY - GEN
T1 - Utilizing Uncertainty in 2D Pose Detectors for Probabilistic 3D Human Mesh Recovery
AU - Wehrbein, Tom
AU - Rudolph, Marco
AU - Rosenhahn, Bodo
AU - Wandt, Bastian
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025/2/26
Y1 - 2025/2/26
UR - http://www.scopus.com/inward/record.url?scp=105003627374&partnerID=8YFLogxK
U2 - 10.1109/WACV61041.2025.00571
DO - 10.1109/WACV61041.2025.00571
M3 - Conference contribution
AN - SCOPUS:105003627374
SN - 979-8-3315-1084-8
T3 - IEEE Winter Conference on Applications of Computer Vision
SP - 5852
EP - 5862
BT - Proceedings - 2025 IEEE Winter Conference on Applications of Computer Vision, WACV 2025
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2025 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2025
Y2 - 28 February 2025 through 4 March 2025
ER -