Deciphering Perceptual Quality in Colored Point Cloud

Prioritizing Geometry or Texture Distortion?

Conference Paper (2024)

Author(s)

X. Zhou (TU Delft - Multimedia Computing, Centrum Wiskunde & Informatica (CWI))

Irene Viola (Centrum Wiskunde & Informatica (CWI))

Yunlu Chen (Carnegie Mellon University)

Jiahuan Pei (Centrum Wiskunde & Informatica (CWI))

P.S. Cesar (TU Delft - Multimedia Computing, Centrum Wiskunde & Informatica (CWI))

Multimedia Computing

DOI related publication

https://doi.org/10.1145/3664647.3680566

Point cloud, Multi-modal, Multi-task, Geometry and texture, Objective quality assessment

To reference this document use:

https://resolver.tudelft.nl/uuid:d554ca94-bc8a-4dfc-9ced-46b2637e25a4

Publication Year

2024

Language

English

Pages (from-to)

7813-7822

ISBN (electronic)

979-8-4007-0686-8

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Point clouds are one of the prevalent formats for 3D content. Distortions introduced at various stages of the point cloud processing pipeline affect visual quality by altering geometric composition, texture information, or both. Understanding and quantifying the impact of the distortion domain on visual quality is vital for driving rate optimization and for guiding post-processing steps that improve the quality of experience. In this paper, we propose a multi-task guided multi-modality no-reference metric (M3-Unity), which represents point clouds using 4 types of modalities across attributes and dimensionalities. An attention mechanism establishes inter- and intra-associations among 3D and 2D patches, which can complement each other, yielding local and global features that fit the highly nonlinear properties of the human visual system. A multi-task decoder involving distortion type classification selects the best association among the 4 modalities, aiding the regression task and enabling in-depth analysis of the interplay between geometrical and textural distortions. Furthermore, our framework design and attention strategy enable us to measure the impact of individual attributes and their combinations, providing insights into how these associations contribute, particularly in relation to distortion type. Extensive experiments on 4 datasets show that M3-Unity consistently outperforms state-of-the-art metrics by a large margin.

Files

3664647.3680566.pdf

(pdf | 2.8 Mb)