Brain-inspired Lab Peking University

USP-Gaussian: Unifying Spike-based Image Reconstruction, Pose Correction and Gaussian Splatting

Keywords: spike camera, 3D reconstruction, Gaussian splatting

Kang Chen, Jiyuan Zhang, Zecheng Hao, Yajing Zheng, Tiejun Huang, Zhaofei Yu. USP-Gaussian: Unifying Spike-based Image Reconstruction, Pose Correction and Gaussian Splatting. Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR), 2025. (Highlight).

Research Background and Problem

Background

Spike cameras are neuromorphic sensors that record scenes as high-temporal-resolution binary streams, making them highly suitable for 3D reconstruction and novel view synthesis in high-speed dynamic scenarios.
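
To make "binary stream" concrete, the sketch below simulates a spike stream with a simplified integrate-and-fire pixel model and shows that the per-pixel firing rate tracks scene intensity. The function name `simulate_spikes`, the unit threshold, and the toy 2x2 intensity map are illustrative assumptions, not part of the paper or of any sensor SDK.

```python
import numpy as np

def simulate_spikes(intensity, num_steps, threshold=1.0):
    """Simplified integrate-and-fire model (an assumption for illustration).

    Each pixel accumulates its intensity at every time step, emits a 1 when
    the accumulator reaches the threshold, and then subtracts the threshold.
    """
    acc = np.zeros_like(intensity, dtype=np.float64)
    spikes = np.zeros((num_steps,) + intensity.shape, dtype=np.uint8)
    for t in range(num_steps):
        acc += intensity
        fired = acc >= threshold
        spikes[t] = fired          # binary output: 1 where the pixel fired
        acc[fired] -= threshold    # keep the residual charge
    return spikes

# Toy static scene with intensities in [0, 1].
intensity = np.array([[0.1, 0.5], [0.25, 0.9]])
spikes = simulate_spikes(intensity, num_steps=200)

# Averaging the binary stream over time approximates the intensity.
firing_rate = spikes.mean(axis=0)
```

Brighter pixels fire more often, so a dark or low-texture region yields few spikes per window, which is the low-firing-rate regime the method targets.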

Limitation of Existing Methods

Existing spike-based 3D reconstruction methods usually adopt a cascaded three-stage pipeline – spike-to-image reconstruction, pose estimation, and 3D reconstruction – which suffers from severe error accumulation because inaccuracies in early reconstruction directly degrade downstream pose estimation and rendering quality.

Our Perspective

To address this issue, USP-Gaussian unifies spike-based image reconstruction, pose correction, and Gaussian splatting in a single framework and optimizes them jointly, end to end. Because the components supervise and refine each other during training, errors from one stage are no longer passed unchecked to the next, as they are in conventional cascaded pipelines.

Contributions

  • Proposes USP-Gaussian, a unified framework that jointly integrates spike-based image reconstruction, camera pose correction, and 3D Gaussian Splatting, instead of treating them as independent sequential stages.
  • Introduces a joint optimization strategy that aligns the reconstructed image sequence from Recon-Net with the rendered sequence from 3DGS, enabling reconstruction and 3D representation learning to provide complementary supervision to each other.
  • Establishes a new self-supervised spike reconstruction paradigm by using multi-view consistency from 3DGS as an additional supervisory signal, rather than relying only on isolated spike-to-image reconstruction objectives.
  • Designs complementary long-short spike inputs and a multi-reblur loss, which improve reconstruction robustness under low spike firing rates, suppress noise, and prevent the model from collapsing to a trivial long-exposure prediction.
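
The last contribution can be illustrated with a small sketch. It assumes one plausible reading of a multi-reblur constraint: the predicted sharp sequence is split into sub-intervals, and the average of each group is compared with the spike firing rate over the same sub-interval. The paper's exact formulation may differ, and the helper names `long_short_windows` and `multi_reblur_loss` are hypothetical.

```python
import numpy as np

def long_short_windows(spikes, center, short_len):
    """Return the full (long) spike window and a short window around `center`."""
    half = short_len // 2
    lo = max(center - half, 0)
    hi = min(center + half + 1, spikes.shape[0])
    return spikes, spikes[lo:hi]

def multi_reblur_loss(sharp_seq, spikes, num_groups):
    """Mean L1 gap between averaged predictions and firing rates per sub-interval.

    Both sequences are assumed to span the same exposure interval, so splitting
    each into `num_groups` chunks keeps the chunks temporally aligned.
    """
    pred_groups = np.array_split(sharp_seq, num_groups, axis=0)
    spike_groups = np.array_split(spikes, num_groups, axis=0)
    losses = [
        np.abs(p.mean(axis=0) - s.mean(axis=0)).mean()
        for p, s in zip(pred_groups, spike_groups)
    ]
    return float(np.mean(losses))

# Toy stream: the scene is dark for the first half and bright for the second.
spikes = np.concatenate([np.zeros((4, 2, 2)), np.ones((4, 2, 2))])

# Trivial prediction: every frame equals the long-exposure average.
trivial = np.tile(spikes.mean(axis=0), (4, 1, 1))

single = multi_reblur_loss(trivial, spikes, num_groups=1)  # cannot see the collapse
multi = multi_reblur_loss(trivial, spikes, num_groups=2)   # penalizes the collapse

long_win, short_win = long_short_windows(spikes, center=4, short_len=3)
```

In this toy case a single reblur constraint over the whole interval assigns zero loss to the trivial long-exposure prediction, while the two-group variant does not, which is the collapse the bullet describes.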

Core Method

The core idea of USP-Gaussian is to jointly train two tightly coupled branches. The first branch is Recon-Net, which takes spike streams as input and reconstructs a sequence of sharp images. Instead of using only a short spike window, the method feeds long spikes, short spikes, and a time index into the network so that both global and local temporal information can be exploited, especially under low-texture or low-firing-rate conditions. The second branch is 3D Gaussian Splatting, which uses the current Gaussian primitives and camera poses to render a sharp image sequence over the exposure interval. These two branches are optimized both individually and jointly, so that they can progressively improve each other during training.
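
The interaction between the two branches can be sketched as a combined objective. This is a minimal illustration under stated assumptions, not the paper's loss: the per-branch terms are simple reblur-style placeholders against a long-exposure target, the joint term is a plain L1 distance between the two sequences, and the weights `w_recon`, `w_gs`, and `w_joint` are hypothetical.

```python
import numpy as np

def l1(a, b):
    """Mean absolute difference between two arrays."""
    return float(np.mean(np.abs(a - b)))

def total_loss(recon_seq, render_seq, long_exposure,
               w_recon=1.0, w_gs=1.0, w_joint=1.0):
    """Combine per-branch terms with a joint alignment term.

    recon_seq:     sharp frames predicted by Recon-Net, shape (T, H, W)
    render_seq:    frames rendered by 3DGS at the same timestamps, shape (T, H, W)
    long_exposure: long-exposure target for the interval, shape (H, W)
    """
    recon_term = l1(recon_seq.mean(axis=0), long_exposure)  # Recon-Net on its own
    gs_term = l1(render_seq.mean(axis=0), long_exposure)    # 3DGS on its own
    joint_term = l1(recon_seq, render_seq)                  # the branches supervise each other
    return w_recon * recon_term + w_gs * gs_term + w_joint * joint_term

# Toy sequences: three constant frames of increasing brightness.
recon_seq = np.stack([np.full((2, 2), v) for v in (0.2, 0.4, 0.6)])
long_exposure = np.full((2, 2), 0.4)

agree = total_loss(recon_seq, recon_seq.copy(), long_exposure)
disagree = total_loss(recon_seq, recon_seq + 0.1, long_exposure)
```

When the rendered sequence drifts from the reconstructed one, both the 3DGS term and the joint term grow. In a real training loop, gradients from the joint term would flow into both branches and into the pose parameters.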

Representative Results

Experiments on both synthetic and real-world datasets show that USP-Gaussian consistently outperforms prior spike-based reconstruction methods. On the synthetic benchmark, the paper reports average scores of 27.903 dB PSNR / 0.843 SSIM / 0.217 LPIPS (higher is better for PSNR and SSIM, lower is better for LPIPS), surpassing the main baseline SpikeGS, which achieves 27.196 dB / 0.832 / 0.244. In particular, on the Outdoorpool scene, USP-Gaussian reaches 30.142 dB PSNR, which is the best result in the paper's table. These results support the claim that unified optimization alleviates the error accumulation problem in cascaded pipelines.
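
For readers unfamiliar with the first metric, the sketch below computes PSNR from its standard definition, 10 * log10(MAX^2 / MSE). It assumes images scaled to [0, 1]; the paper's evaluation code may use a different range or library. SSIM and LPIPS need dedicated implementations and are not sketched here.

```python
import numpy as np

def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, max_value]."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))

# A uniform error of 0.1 on a [0, 1] image gives an MSE of 0.01.
reference = np.zeros((4, 4))
estimate = np.full((4, 4), 0.1)
score = psnr(reference, estimate)
```

Because the scale is logarithmic, each 10 dB step corresponds to a tenfold change in mean squared error.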
