Details
| Original language | English |
|---|---|
| Article number | 250 |
| Journal | ACM Transactions on Multimedia Computing, Communications, and Applications |
| Volume | 21 |
| Issue number | 9 |
| Publication status | Published - 11 Sept 2025 |
Abstract
Dynamic Point Clouds are a representation for three-dimensional (3D) immersive media that allows users to freely navigate a scene while consuming the content. However, this comes at the cost of substantial data size, requiring efficient compression techniques to make point cloud videos accessible. Addressing this, Video-based Point Cloud Compression (V-PCC) projects points into 2D patches to compress video frames, leveraging the high compression efficiency of legacy video codecs and exploiting temporal correlations in the two-dimensional (2D) images. However, clustering and projecting points into meaningful 2D patches is computationally intensive, leading to high encoding latency in V-PCC. Applying adaptive streaming techniques, originating from traditional video streaming, multiplies the computational effort as multiple encodings of the same content are required. In this light, transcoding a compressed representation into lower qualities for dynamic adaptation to user requirements is gaining popularity. To address the high latency when employing the full decoder-encoder stack of V-PCC during transcoding, we propose RABBIT, a novel technique that only re-encodes the underlying video sub-streams. This is in contrast to slow V-PCC transcoding that reconstructs and re-encodes the raw point cloud at a new quality setting. By eliminating expensive overhead resulting from calculations based on the 3D space representation, the latency of RABBIT is bounded by the latency of transcoding the underlying video streams, allowing optimized video codec implementations to be used to meet the real-time requirements of adaptive streaming systems. Our evaluations of RABBIT, using various optimized video codec implementations, show on-par quality with the baseline V-PCC transcoding given a high-quality representation. Given unicast or multicast distribution of a point cloud stream and in-network or edge transcoders, our evaluations show the tradeoff between rate-distortion performance and the required network bandwidth.
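The core idea in the abstract, re-encoding only the video sub-streams while leaving the rest of the compressed representation untouched, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `VpccStream` container, its field names, the `transcode_substreams_only` function, and the `fake_transcoder` stand-in are all hypothetical, and the choice to re-encode every sub-stream in the container is an assumption. A real system would plug an optimized video codec in place of the stand-in.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict

# Hypothetical container for a V-PCC bitstream: atlas (patch) metadata plus
# named video sub-streams. Field names are illustrative, not from the paper.
@dataclass(frozen=True)
class VpccStream:
    atlas_metadata: bytes
    video_substreams: Dict[str, bytes]

# A video transcoder maps (compressed sub-stream, target QP) to a new
# compressed sub-stream. Any optimized codec wrapper could fill this role.
VideoTranscoder = Callable[[bytes, int], bytes]

def transcode_substreams_only(
    stream: VpccStream,
    transcode_video: VideoTranscoder,
    target_qp: int,
) -> VpccStream:
    """Re-encode each video sub-stream; pass the atlas metadata through as-is.

    No point cloud is reconstructed and no patches are regenerated, so the
    cost is that of the video transcodes alone.
    """
    new_videos = {
        name: transcode_video(payload, target_qp)
        for name, payload in stream.video_substreams.items()
    }
    return replace(stream, video_substreams=new_videos)

def fake_transcoder(payload: bytes, target_qp: int) -> bytes:
    # Stand-in for a real codec (assumption): truncate the payload to mimic
    # a lower bitrate at a higher quantization parameter (0..51).
    keep = max(1, len(payload) * (51 - target_qp) // 51)
    return payload[:keep]

if __name__ == "__main__":
    source = VpccStream(
        atlas_metadata=b"atlas",
        video_substreams={
            "occupancy": b"o" * 10,
            "geometry": b"g" * 100,
            "attribute": b"a" * 200,
        },
    )
    lower = transcode_substreams_only(source, fake_transcoder, target_qp=40)
    for name, payload in lower.video_substreams.items():
        print(name, len(source.video_substreams[name]), "->", len(payload))
```

Because the sub-streams are independent inputs to `transcode_video`, they could also be processed in parallel, in which case the slowest video transcode would dominate the latency.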
Keywords
- 6DoF, Adaptive Streaming, Point Cloud, Transcoding, Virtual Reality
ASJC Scopus subject areas
- Computer Science(all)
- Hardware and Architecture
- Computer Networks and Communications
In: ACM Transactions on Multimedia Computing, Communications, and Applications, Vol. 21, No. 9, 250, 11.09.2025.
Research output: Contribution to journal › Article › Research › peer review
TY - JOUR
T1 - Transcoding V-PCC Point Cloud Streams in Real-time
AU - Rudolph, Michael
AU - Schneegass, Stefan
AU - Rizk, Amr
N1 - Publisher Copyright: © 2025 Copyright held by the owner/author(s). Publication rights licensed to ACM.
PY - 2025/9/11
Y1 - 2025/9/11
AB - Dynamic Point Clouds are a representation for three-dimensional (3D) immersive media that allows users to freely navigate a scene while consuming the content. However, this comes at the cost of substantial data size, requiring efficient compression techniques to make point cloud videos accessible. Addressing this, Video-based Point Cloud Compression (V-PCC) projects points into 2D patches to compress video frames, leveraging the high compression efficiency of legacy video codecs and exploiting temporal correlations in the two-dimensional (2D) images. However, clustering and projecting points into meaningful 2D patches is computationally intensive, leading to high encoding latency in V-PCC. Applying adaptive streaming techniques, originating from traditional video streaming, multiplies the computational effort as multiple encodings of the same content are required. In this light, transcoding a compressed representation into lower qualities for dynamic adaptation to user requirements is gaining popularity. To address the high latency when employing the full decoder-encoder stack of V-PCC during transcoding, we propose RABBIT, a novel technique that only re-encodes the underlying video sub-streams. This is in contrast to slow V-PCC transcoding that reconstructs and re-encodes the raw point cloud at a new quality setting. By eliminating expensive overhead resulting from calculations based on the 3D space representation, the latency of RABBIT is bounded by the latency of transcoding the underlying video streams, allowing optimized video codec implementations to be used to meet the real-time requirements of adaptive streaming systems. Our evaluations of RABBIT, using various optimized video codec implementations, show on-par quality with the baseline V-PCC transcoding given a high-quality representation. Given unicast or multicast distribution of a point cloud stream and in-network or edge transcoders, our evaluations show the tradeoff between rate-distortion performance and the required network bandwidth.
KW - 6DoF
KW - Adaptive Streaming
KW - Point Cloud
KW - Transcoding
KW - Virtual Reality
UR - http://www.scopus.com/inward/record.url?scp=105019089282&partnerID=8YFLogxK
U2 - 10.1145/3682062
DO - 10.1145/3682062
M3 - Article
VL - 21
JO - ACM Transactions on Multimedia Computing, Communications, and Applications
JF - ACM Transactions on Multimedia Computing, Communications, and Applications
SN - 1551-6865
IS - 9
M1 - 250
ER -