EVPGS: Enhanced View Prior Guidance for Splatting-based Extrapolated View Synthesis

CVPR 2025

Jiahe Li, Feiyu Wang*, Xiaochao Qu, Chengjing Wu, Luoqi Liu*, Ting Liu*
MT Lab, Meitu Inc.
*Corresponding Author

EVPGS is a GS-based framework for Extrapolated View Synthesis (EVS) that synthesizes high-quality extrapolated novel views from training views with limited coverage.

Abstract

Gaussian Splatting (GS)-based methods rely on sufficient training view coverage and perform synthesis on interpolated views. In this work, we tackle the more challenging and underexplored Extrapolated View Synthesis (EVS) task: we enable GS-based models trained with limited view coverage to generalize well to extrapolated views. To achieve this, we propose a view augmentation framework that guides training through a coarse-to-fine process. At the coarse stage, we reduce rendering artifacts caused by insufficient view coverage with a regularization strategy at both the appearance and geometry levels. At the fine stage, we generate reliable view priors that provide further training guidance. To this end, we incorporate occlusion awareness into the view prior generation process and refine the view priors with the aid of the coarse-stage output. We call our framework Enhanced View Prior Guidance for Splatting (EVPGS). To comprehensively evaluate EVPGS on the EVS task, we collect a real-world dataset, Merchandise3D, dedicated to the EVS scenario. Experiments on three datasets, both real and synthetic, demonstrate that EVPGS achieves state-of-the-art performance and improves synthesis quality at extrapolated views for GS-based methods, both qualitatively and quantitatively.

Framework Pipeline

Overall Framework. EVPGS first pre-trains a GS model on the training set with limited view coverage (e.g., horizontal views) and then fine-tunes it on augmented views (e.g., elevated views) through a coarse-to-fine process. At the coarse stage, our Appearance and Geometry Regularization (AGR) strategy reduces artifacts in augmented views using a denoising diffusion model and the mesh reconstructed from the pre-trained model. At the fine stage, our Occlusion-Aware Reprojection and Refinement (OARR) strategy generates Enhanced View Priors as pseudo-labels, handling occlusions and incorporating view-dependent colors from the coarse stage. A sketch of this coarse-to-fine loop is shown below.
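To make the two-stage procedure concrete, here is a minimal PyTorch sketch of the fine-tuning loop. It is an illustration of the idea rather than the released implementation: `model.render`, `coarse_losses` (standing in for the AGR appearance and geometry terms), and `prior_fn` (standing in for OARR prior generation) are hypothetical interfaces we assume for exposition.

```python
import torch
import torch.nn.functional as F

def finetune_evpgs(model, augmented_views, coarse_losses, prior_fn,
                   coarse_steps=2000, fine_steps=5000, lr=1e-3):
    """Coarse-to-fine fine-tuning on augmented views (illustrative sketch).

    model          -- pre-trained GS model exposing .parameters() and .render(view)
    coarse_losses  -- callable implementing AGR-style appearance + geometry terms
    prior_fn       -- callable producing an Enhanced View Prior (OARR) for a view
    """
    opt = torch.optim.Adam(model.parameters(), lr=lr)

    # Coarse stage: regularize renders at augmented views to suppress the
    # artifacts that appear outside the training view coverage.
    for step in range(coarse_steps):
        view = augmented_views[step % len(augmented_views)]
        rendered = model.render(view)
        loss = coarse_losses(rendered, view)  # appearance + geometry terms
        opt.zero_grad()
        loss.backward()
        opt.step()

    # Fine stage: supervise augmented views with occlusion-aware,
    # coarse-stage-refined priors used as pseudo-labels.
    for step in range(fine_steps):
        view = augmented_views[step % len(augmented_views)]
        target = prior_fn(view, model).detach()  # Enhanced View Prior
        rendered = model.render(view)
        loss = F.l1_loss(rendered, target)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model
```

Detaching the prior treats it purely as a pseudo-label, so gradients flow only through the rendered image and never back into the prior generator.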

Visual Comparisons

Side-by-side image comparisons at extrapolated views, showing each GS backbone with and without EVPGS: 3DGS [Kerbl 2023], Mip-Splatting [Yu 2023], 2DGS [Huang 2024], GOF [Yu 2024], and RaDe-GS [Zhang 2024].

Application: Merchandise Exhibition

Traditional merchandise exhibitions typically require a professional photographer to capture objects along predefined camera paths. With EVPGS, users can simply record a short video around the object with a smartphone, and our method effortlessly generates high-quality extrapolated novel views from that casual capture.
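To illustrate the capture setting, the snippet below samples an elevated camera orbit around the object, i.e., the kind of extrapolated poses queried at test time when the training views come from a roughly horizontal smartphone sweep. This is a self-contained NumPy sketch under our own assumptions (z-up world, OpenGL-style camera-to-world matrices); `elevated_orbit` is a hypothetical helper, not part of the EVPGS release.

```python
import numpy as np

def elevated_orbit(center, radius, elevation_deg, n_views=60):
    """Sample look-at camera-to-world poses on a circle raised above `center`.

    Assumes a z-up world and elevation strictly below 90 degrees (at the
    zenith the look-at construction below degenerates). Returns an array of
    4x4 camera-to-world matrices with -z as the viewing direction.
    """
    elev = np.radians(elevation_deg)
    poses = []
    for theta in np.linspace(0.0, 2.0 * np.pi, n_views, endpoint=False):
        # Camera position on a ring lifted above the horizontal capture plane.
        eye = center + radius * np.array([
            np.cos(theta) * np.cos(elev),
            np.sin(theta) * np.cos(elev),
            np.sin(elev),
        ])
        forward = center - eye
        forward /= np.linalg.norm(forward)
        up = np.array([0.0, 0.0, 1.0])
        right = np.cross(forward, up)
        right /= np.linalg.norm(right)
        true_up = np.cross(right, forward)

        c2w = np.eye(4)
        c2w[:3, 0] = right
        c2w[:3, 1] = true_up
        c2w[:3, 2] = -forward  # camera looks down its -z axis
        c2w[:3, 3] = eye
        poses.append(c2w)
    return np.stack(poses)

# Example: 60 poses circling the origin at 30 degrees above the capture ring.
poses = elevated_orbit(center=np.zeros(3), radius=2.0, elevation_deg=30.0)
```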