SAGS: Structure-Aware 3D Gaussian Splatting
European Conference on Computer Vision (ECCV) 2024
- Evangelos Ververas1,2*
- Rolandos Alexandros Potamias1,2*
- Jifei Song2
- Jiankang Deng1,2
- Stefanos Zafeiriou1
- Imperial College London1
- Huawei Noah’s Ark Lab2
- *Equal Contribution
Abstract
Following the advent of NeRFs, 3D Gaussian Splatting (3D-GS) has paved the way to real-time neural rendering, overcoming the computational burden of volumetric methods. Building on the pioneering work of 3D-GS, several methods have attempted to achieve compressible and high-fidelity performance. However, by employing a geometry-agnostic optimization scheme, these methods neglect the inherent 3D structure of the scene, thereby restricting the expressivity and quality of the representation and producing floaters and other artifacts. In this work, we propose a structure-aware Gaussian Splatting method (SAGS) that implicitly encodes the geometry of the scene, which translates into state-of-the-art rendering performance and reduced storage requirements on benchmark novel-view synthesis datasets. SAGS is founded on a local-global graph representation that facilitates the learning of complex scenes and enforces meaningful point displacements that preserve the scene's geometry. Additionally, we introduce a lightweight version of SAGS, based on a simple yet effective mid-point interpolation scheme, which achieves a compact representation of the scene with up to a 24× size reduction without relying on any compression strategies. Extensive experiments across multiple benchmark datasets demonstrate the superiority of SAGS over state-of-the-art 3D-GS methods in both rendering quality and model size. Furthermore, we show that our structure-aware method effectively mitigates the floating artifacts and irregular distortions of previous methods while producing precise depth maps.
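To make the mid-point interpolation idea concrete, the following is a minimal sketch in PyTorch. The function name `midpoint_upsample` and the choice of k are illustrative assumptions rather than the paper's implementation; the point is that only a sparse set of anchor positions needs to be stored, while the remaining points can be re-derived at the midpoints of k-NN edges.

```python
# A minimal sketch of mid-point interpolation over a k-NN graph
# (an illustration, not the authors' exact implementation).
import torch

def midpoint_upsample(points: torch.Tensor, k: int = 4) -> torch.Tensor:
    """points: (N, 3) anchor positions -> (N + N*k, 3) densified positions."""
    # Pairwise distances to find each point's k nearest neighbours.
    dists = torch.cdist(points, points)                    # (N, N)
    knn = dists.topk(k + 1, largest=False).indices[:, 1:]  # drop self, (N, k)
    neighbours = points[knn]                               # (N, k, 3)
    # New points sit halfway along each k-NN edge.
    midpoints = 0.5 * (points.unsqueeze(1) + neighbours)   # (N, k, 3)
    return torch.cat([points, midpoints.reshape(-1, 3)], dim=0)

# Example: 1,000 stored anchors expand to 5,000 points at load time.
pts = torch.rand(1000, 3)
dense = midpoint_upsample(pts, k=4)
```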
Method
Given a point cloud obtained from COLMAP, we initially apply a curvature-based densification step to populate under-represented areas. We then apply k-NN search to link points within local regions and create a point-set graph. Leveraging the inductive biases of graph neural networks, we learn a local-global structural feature for each point. Using a set of small MLPs, we decode the structural features into 3D Gaussian attributes, i.e., color, opacity, covariance, and a displacement from the initial point position. Finally, we render the 3D Gaussians using the 3D-GS rasterizer.
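A hedged sketch of this pipeline, assuming PyTorch: the class name `StructureAwareDecoder`, the single max-aggregation message-passing layer, the mean-pooled global context, and all layer widths are illustrative assumptions standing in for the paper's actual local-global graph network and MLP heads.

```python
# Illustrative sketch of the k-NN graph + feature decoding pipeline;
# not the paper's exact architecture.
import torch
import torch.nn as nn

class StructureAwareDecoder(nn.Module):
    def __init__(self, k: int = 8, dim: int = 64):
        super().__init__()
        self.k = k
        self.embed = nn.Linear(3, dim)
        # One max-aggregation message-passing layer over the k-NN graph.
        self.msg = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU())
        # Small MLP heads decoding structural features to Gaussian attributes.
        self.color = nn.Linear(dim, 3)     # RGB
        self.opacity = nn.Linear(dim, 1)
        self.scale = nn.Linear(dim, 3)     # anisotropic scales (covariance)
        self.rotation = nn.Linear(dim, 4)  # quaternion (covariance)
        self.offset = nn.Linear(dim, 3)    # displacement from initial position

    def forward(self, points: torch.Tensor):
        # Build the k-NN point-set graph linking local regions.
        knn = torch.cdist(points, points).topk(
            self.k + 1, largest=False).indices[:, 1:]      # (N, k)
        f = self.embed(points)                             # (N, dim)
        # Aggregate neighbour messages for the local part of the feature.
        nbr = f[knn]                                       # (N, k, dim)
        msgs = self.msg(torch.cat(
            [f.unsqueeze(1).expand_as(nbr), nbr], dim=-1))
        local = msgs.max(dim=1).values                     # (N, dim)
        # Mean-pooled context supplies the global part.
        glob = local.mean(dim=0, keepdim=True).expand_as(local)
        feat = local + glob
        return {
            "xyz": points + self.offset(feat),
            "rgb": torch.sigmoid(self.color(feat)),
            "opacity": torch.sigmoid(self.opacity(feat)),
            "scale": torch.exp(self.scale(feat)),
            "rot": nn.functional.normalize(self.rotation(feat), dim=-1),
        }
```

The head activations (sigmoid for color and opacity, exponential for scales, normalization for quaternions) mirror the standard 3D-GS parameterization of Gaussian attributes, so the decoded dictionary can be fed directly to a 3D-GS-style rasterizer.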
Results
Comparison between the proposed method and Scaffold-GS in preserving the scene's structure. The proposed method accurately captures sharp edges and suppresses the 'floater' artifacts visible in the Scaffold-GS depth maps.