Transcoding V-PCC Point Cloud Streams in Real-time

Research output: Contribution to journal › Article › Research › peer review

Authors

  • Michael Rudolph
  • Stefan Schneegass
  • Amr Rizk

Details

Original language: English
Article number: 250
Journal: ACM Transactions on Multimedia Computing, Communications, and Applications
Volume: 21
Issue number: 9
Publication status: Published - 11 Sept 2025

Abstract

Dynamic Point Clouds are a representation for three-dimensional (3D) immersive media that allows users to freely navigate a scene while consuming the content. However, this comes at the cost of substantial data size, requiring efficient compression techniques to make point cloud videos accessible. Addressing this, Video-based Point Cloud Compression (V-PCC) projects points into 2D patches to compress video frames, leveraging the high compression efficiency of legacy video codecs and exploiting temporal correlations in the two-dimensional (2D) images. However, clustering and projecting points into meaningful 2D patches is computationally intensive, leading to high encoding latency in V-PCC. Applying adaptive streaming techniques, originating from traditional video streaming, multiplies the computational effort as multiple encodings of the same content are required. In this light, transcoding a compressed representation into lower qualities for dynamic adaptation to user requirements is gaining popularity. To address the high latency when employing the full decoder-encoder stack of V-PCC during transcoding, we propose RABBIT, a novel technique that only re-encodes the underlying video sub-streams. This is in contrast to slow V-PCC transcoding that reconstructs and re-encodes the raw point cloud at a new quality setting. By eliminating expensive overhead resulting from calculations based on the 3D space representation, the latency of RABBIT is bounded by the latency of transcoding the underlying video streams, allowing optimized video codec implementations to be used to meet the real-time requirements of adaptive streaming systems. Our evaluations of RABBIT, using various optimized video codec implementations, show on-par quality with the baseline V-PCC transcoding given a high-quality representation. Given unicast or multicast distribution of a point cloud stream and in-network or edge transcoders, our evaluations show the tradeoff between rate-distortion performance and the required network bandwidth.

Keywords

    6DoF, Adaptive Streaming, Point Cloud, Transcoding, Virtual Reality

Cite this

Transcoding V-PCC Point Cloud Streams in Real-time. / Rudolph, Michael; Schneegass, Stefan; Rizk, Amr.
In: ACM Transactions on Multimedia Computing, Communications, and Applications, Vol. 21, No. 9, 250, 11.09.2025.

Rudolph, Michael ; Schneegass, Stefan ; Rizk, Amr. / Transcoding V-PCC Point Cloud Streams in Real-time. In: ACM Transactions on Multimedia Computing, Communications, and Applications. 2025 ; Vol. 21, No. 9.
@article{d3ea92bfe7f5470eb4353a5a671b0558,
title = "Transcoding V-PCC Point Cloud Streams in Real-time",
abstract = "Dynamic Point Clouds are a representation for three-dimensional (3D) immersive media that allows users to freely navigate a scene while consuming the content. However, this comes at the cost of substantial data size, requiring efficient compression techniques to make point cloud videos accessible. Addressing this, Video-based Point Cloud Compression (V-PCC) projects points into 2D patches to compress video frames, leveraging the high compression efficiency of legacy video codecs and exploiting temporal correlations in the two-dimensional (2D) images. However, clustering and projecting points into meaningful 2D patches is computationally intensive, leading to high encoding latency in V-PCC. Applying adaptive streaming techniques, originating from traditional video streaming, multiplies the computational effort as multiple encodings of the same content are required. In this light, transcoding a compressed representation into lower qualities for dynamic adaptation to user requirements is gaining popularity. To address the high latency when employing the full decoder-encoder stack of V-PCC during transcoding, we propose RABBIT, a novel technique that only re-encodes the underlying video sub-streams. This is in contrast to slow V-PCC transcoding that reconstructs and re-encodes the raw point cloud at a new quality setting. By eliminating expensive overhead resulting from calculations based on the 3D space representation, the latency of RABBIT is bounded by the latency of transcoding the underlying video streams, allowing optimized video codec implementations to be used to meet the real-time requirements of adaptive streaming systems. Our evaluations of RABBIT, using various optimized video codec implementations, show on-par quality with the baseline V-PCC transcoding given a high-quality representation. Given unicast or multicast distribution of a point cloud stream and in-network or edge transcoders, our evaluations show the tradeoff between rate-distortion performance and the required network bandwidth.",
keywords = "6DoF, Adaptive Streaming, Point Cloud, Transcoding, Virtual Reality",
author = "Michael Rudolph and Stefan Schneegass and Amr Rizk",
note = "Publisher Copyright: {\textcopyright} 2025 Copyright held by the owner/author(s). Publication rights licensed to ACM.",
year = "2025",
month = sep,
day = "11",
doi = "10.1145/3682062",
language = "English",
volume = "21",
journal = "ACM Transactions on Multimedia Computing, Communications, and Applications",
issn = "1551-6865",
publisher = "Association for Computing Machinery (ACM)",
number = "9",

}

TY - JOUR

T1 - Transcoding V-PCC Point Cloud Streams in Real-time

AU - Rudolph, Michael

AU - Schneegass, Stefan

AU - Rizk, Amr

N1 - Publisher Copyright: © 2025 Copyright held by the owner/author(s). Publication rights licensed to ACM.

PY - 2025/9/11

Y1 - 2025/9/11

AB - Dynamic Point Clouds are a representation for three-dimensional (3D) immersive media that allows users to freely navigate a scene while consuming the content. However, this comes at the cost of substantial data size, requiring efficient compression techniques to make point cloud videos accessible. Addressing this, Video-based Point Cloud Compression (V-PCC) projects points into 2D patches to compress video frames, leveraging the high compression efficiency of legacy video codecs and exploiting temporal correlations in the two-dimensional (2D) images. However, clustering and projecting points into meaningful 2D patches is computationally intensive, leading to high encoding latency in V-PCC. Applying adaptive streaming techniques, originating from traditional video streaming, multiplies the computational effort as multiple encodings of the same content are required. In this light, transcoding a compressed representation into lower qualities for dynamic adaptation to user requirements is gaining popularity. To address the high latency when employing the full decoder-encoder stack of V-PCC during transcoding, we propose RABBIT, a novel technique that only re-encodes the underlying video sub-streams. This is in contrast to slow V-PCC transcoding that reconstructs and re-encodes the raw point cloud at a new quality setting. By eliminating expensive overhead resulting from calculations based on the 3D space representation, the latency of RABBIT is bounded by the latency of transcoding the underlying video streams, allowing optimized video codec implementations to be used to meet the real-time requirements of adaptive streaming systems. Our evaluations of RABBIT, using various optimized video codec implementations, show on-par quality with the baseline V-PCC transcoding given a high-quality representation. Given unicast or multicast distribution of a point cloud stream and in-network or edge transcoders, our evaluations show the tradeoff between rate-distortion performance and the required network bandwidth.

KW - 6DoF

KW - Adaptive Streaming

KW - Point Cloud

KW - Transcoding

KW - Virtual Reality

UR - http://www.scopus.com/inward/record.url?scp=105019089282&partnerID=8YFLogxK

U2 - 10.1145/3682062

DO - 10.1145/3682062

M3 - Article

VL - 21

JO - ACM Transactions on Multimedia Computing, Communications, and Applications

JF - ACM Transactions on Multimedia Computing, Communications, and Applications

SN - 1551-6865

IS - 9

M1 - 250

ER -