Stan Szymanowicz

I am a PhD student at VGG group at University of Oxford, where I work on Computer Vision and Machine Learning with Andrea Vedaldi and Christian Rupprecht.

I spent the summer 2024 interning at Google San Francisco, continuing in London until April 2025.

Before my PhD, I worked at Microsoft Mixed Reality Lab (2021-2022) in Cambridge and received an MEng with Distinction from the University of Cambridge (2017-2021).

Email  /  Google Scholar  /  Twitter  /  Github

profile photo

Research

I am interested in Computer Vision (with an emphasis on 3D), Computer Graphics and Generative Models.

Bolt3D: Generating 3D Scenes in Seconds
Stanislaw Szymanowicz, Jason Y. Zhang, Pratul Srinivasan, Ruiqi Gao, Arthur Brussee, Aleksander Holynski, Ricardo Martin-Brualla, Jonathan T. Barron, Philipp Henzler
arXiv, 2025
project page, with an interactive viewer in Chrome / arXiv

Bolt3D allows feed-forward 3D scene generation in 6.25s on a single GPU. Our latent diffusion model directly outputs 3D Gaussians, and is trained on a large-scale dataset of reconstructed 3D scenes.

Flash3D: Feed-Forward Generalisable Scene Reconstruction from a Single Image
Stanislaw Szymanowicz*, Eldar Insafutdinov*, Chuanxia Zheng*, Dylan Campbell, Joao Henriques, Christian Rupprecht, Andrea Vedaldi
3DV, 2025
code / project page / arXiv

We show that leveraging a Monocular Depth Estimator and a Layered Representation allows building generalizable and efficient 3D scene reconstructors.

Splatter Image: Ultra-Fast Single-View 3D Reconstruction
Stanislaw Szymanowicz, Chrisitian Rupprecht, Andrea Vedaldi
CVPR, 2024
code / project page / demo / arXiv

We design a fast (38FPS), simple, 2D network for single-view 3D reconstruction that represents shapes with 3D Gaussians. As a result, it can leverage Gaussian Splatting for rendering (588FPS), achieves state-of-the-art quality in several cases and is trains on just a single GPU.

Viewset Diffusion: (0-)Image-Conditioned 3D Generation from 2D Data
Stanislaw Szymanowicz, Chrisitian Rupprecht, Andrea Vedaldi
ICCV, 2023
code / project page / arXiv / paper

Novel formulation of the denoising function in Diffusion Models lets us train 3D generative models from 2D data only. Our models can perform both few-view 3D reconstruction and 3D generation.

Photo-realistic 360◦ Head Avatars in the Wild
Stanislaw Szymanowicz, Virginia Estellers, Tadas Baltrusaitis, Matthew Johnson
ECCV Workshop on Computer Vision for Metaverse, 2022
arXiv

We unlock training NeRFs of 360◦ human heads from mobile phone captures. Synthetically-trained face keypoint detector allows to get a rough camera pose estimate which is then refined in the optimization process.

VolTeMorph: Realtime, Controllable and Generalisable Animation of Volumetric Representations
Stephan Garbin*, Marek Kowalski*, Virginia Estellers*, Stanislaw Szymanowicz*, Shideh Rezaeifar, Jingjing Shen, Matthew Johnson, Julien Valentin
arXiv, 2022
arXiv

We can controllably animate Neural Radiance Fields by deforming tetrahedral cages. Tetrahedral connectivity allows us to run the animations in real-time.

Discrete neural representations for explainable anomaly detection
Stanislaw Szymanowicz, James Charles, Roberto Cipolla
WACV, 2022
project page / video / arXiv

We introduce an architecture for improved explainability of anomaly detection in CCTV videos.

X-MAN: Explaining multiple sources of anomalies in video
Stanislaw Szymanowicz, James Charles, Roberto Cipolla
CVPR Workshop on Fair and Trusted Computer Vision, 2021
project page / demo video / paper

We introduce a dataset and a method for detecting and explaining anomalies in CCTV videos.


Thank you to Jon Barron for the source code for the website!