Songyou Peng (彭崧猷)

I am a PhD student at ETH Zurich and the Max Planck Institute for Intelligent Systems, as part of the Max Planck ETH Center for Learning Systems and ELLIS. I am co-supervised by Marc Pollefeys and Andreas Geiger.

During my PhD, I was a research intern at Google Research, working with Tom Funkhouser, and at Meta Reality Labs Research, hosted by Michael Zollhoefer. I completed an Erasmus Mundus Master's in Computer Vision and Robotics (VIBOT) with distinction. During my master's, I was fortunate to be supervised by Daniel Cremers at the Technical University of Munich for my thesis and to work with Peter Sturm at INRIA for two summers. Before this, I obtained a Bachelor's in Automation at Xi'an Jiaotong University. I have also spent some time at ADSC and A*STAR in Singapore.

Email  |  CV  |  GitHub  |  Google Scholar  |  LinkedIn  |  Twitter

Research

DiffDreamer: Towards Consistent Unsupervised Single-view Scene Extrapolation with Conditional Diffusion Models
Shengqu Cai, Eric R. Chan, Songyou Peng, Mohamad Shahbazi, Anton Obukhov, Luc Van Gool, Gordon Wetzstein
International Conference on Computer Vision (ICCV), 2023
paper | project page

A diffusion-model based unsupervised framework capable of synthesizing novel views depicting a long camera trajectory.

OpenScene: 3D Scene Understanding with Open Vocabularies
Songyou Peng, Kyle Genova, Chiyu "Max" Jiang, Andrea Tagliasacchi, Marc Pollefeys, Thomas Funkhouser
Conference on Computer Vision and Pattern Recognition (CVPR), 2023
paper | project page | video | code

Zero-shot approach for novel 3D scene understanding tasks with open-vocabulary queries.

SDFStudio: A Unified Framework for Surface Reconstruction
Zehao Yu, Anpei Chen, Bozidar Antic, Songyou Peng, Apratim Bhattacharyya, Michael Niemeyer, Siyu Tang, Torsten Sattler, Andreas Geiger
Open Source Project, 2023
project page | code

We provide a unified framework and benchmark for neural implicit surface reconstruction.

PatchShading: High-Quality Human Reconstruction by Patch Warping and Shading Refinement
Lixiang Lin, Songyou Peng, Qijun Gan, Jianke Zhu
arXiv, 2022
paper | project page

Shape As Points (SAP) for fast human body reconstruction.

MonoSDF: Exploring Monocular Geometric Cues for Neural Implicit Surface Reconstruction
Zehao Yu, Songyou Peng, Michael Niemeyer, Torsten Sattler, Andreas Geiger
Advances in Neural Information Processing Systems (NeurIPS), 2022
paper | project page

Monocular depth and normal cues significantly boost the performance of neural implicit surface reconstruction methods.

NICE-SLAM: Neural Implicit Scalable Encoding for SLAM
Zihan Zhu*, Songyou Peng*, Viktor Larsson, Weiwei Xu, Hujun Bao, Zhaopeng Cui, Martin R. Oswald, Marc Pollefeys
Conference on Computer Vision and Pattern Recognition (CVPR), 2022
(* equal contribution)
paper | project page | video | code

A neural implicit-based RGB-D SLAM that can be applied to large-scale scenes.

Shape As Points: A Differentiable Poisson Solver
Songyou Peng, Chiyu "Max" Jiang, Yiyi Liao, Michael Niemeyer, Marc Pollefeys, Andreas Geiger
Advances in Neural Information Processing Systems (NeurIPS), 2021 (Oral, top 0.6%)
paper | project page | video (6 min) | video (12 min) | podcast | code

An interpretable hybrid shape representation that yields high-quality watertight meshes at low inference times.

UNISURF: Unifying Neural Implicit Surfaces and Radiance Fields for Multi-View Reconstruction
Michael Oechsle, Songyou Peng, Andreas Geiger
International Conference on Computer Vision (ICCV), 2021 (Oral, top 3%)
paper | project page | video | teaser video | code

Our method reconstructs accurate surfaces without requiring input masks.

KiloNeRF: Speeding up Neural Radiance Fields with Thousands of Tiny MLPs
Christian Reiser, Songyou Peng, Yiyi Liao, Andreas Geiger
International Conference on Computer Vision (ICCV), 2021
paper | project page | blog | video | teaser video | code

Over 2000x speed-ups for NeRF are possible by utilizing thousands of tiny MLPs.

Dynamic Plane Convolutional Occupancy Networks
Stefan Lionar*, Daniil Emtsev*, Dusan Svilarkovic*, Songyou Peng
Winter Conference on Applications of Computer Vision (WACV), 2021
(* equal contribution)
paper | video | code

A student project from the 3D Vision course at ETH Zurich, for which I served as the advisor.

Convolutional Occupancy Networks
Songyou Peng, Michael Niemeyer, Lars Mescheder, Marc Pollefeys, Andreas Geiger
European Conference on Computer Vision (ECCV), 2020 (Spotlight, top 5%)
paper | project page | blog | video | teaser video | code

A flexible implicit representation for accurate large-scale 3D reconstruction.

DIST: Rendering Deep Implicit Signed Distance Function with Differentiable Sphere Tracing
Shaohui Liu, Yinda Zhang, Songyou Peng, Boxin Shi, Marc Pollefeys, Zhaopeng Cui
Conference on Computer Vision and Pattern Recognition (CVPR), 2020
paper | project page | teaser video | poster | code

A differentiable renderer for deep implicit signed distance functions.

Calibration Wizard: A Guidance System for Camera Calibration Based on Modelling Geometric and Corner Uncertainty
Songyou Peng and Peter Sturm
International Conference on Computer Vision (ICCV), 2019 (Oral, top 4.6%)
paper | video | poster | code

A novel system that interactively guides a user to take optimal calibration images.

Photometric Depth Super-Resolution
Bjoern Haefner*, Songyou Peng*, Alok Verma*, Yvain Queau, Daniel Cremers
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2019
(* equal contribution)
paper | project page

Recover high-resolution depth maps with fine geometric details using photometric techniques.

PersEmoN: A Deep Network for Joint Analysis of Apparent Personality, Emotion and Their Relationship
Le Zhang, Songyou Peng, Stefan Winkler
IEEE Transactions on Affective Computing (TAFFC), 2019. In press.
paper | code

A journal extension of our ACM MM 2018 paper.

Give Me One Portrait Image, I Will Tell You Your Emotion and Personality
Songyou Peng, Le Zhang, Stefan Winkler, Marianne Winslett
ACM International Conference on Multimedia (ACM MM), 2018
paper | slides | code

Technical Demo. A deep Siamese-like network is introduced to predict one's Big-Five personality and arousal-valence emotion from one portrait photo.

Depth Super-Resolution Meets Uncalibrated Photometric Stereo
Songyou Peng, Bjoern Haefner, Yvain Queau, Daniel Cremers
International Conference on Computer Vision (ICCV) Workshops, 2017
paper | slides | code & data

A novel depth super-resolution approach for RGB-D sensors is presented.

This paper is part of my master's thesis and is subsumed by our TPAMI paper.

High Quality Shape from a RGB-D Camera using Photometric Stereo
Songyou Peng
M.Sc. Thesis, Technical University of Munich
Supervisors: Yvain Queau and Daniel Cremers
thesis | bibtex | poster

Invited Talks


Dive into Neural Explicit-Implicit 3D Representations and Their Applications
Symposium on Geometry Processing (SGP) Graduate School, 2023 (Invited Lecture)
slides
Learning to Reconstruct and Understand the 3D World
Microsoft Mixed Reality & AI Labs - Zurich, 2023
slides
Learning Neural Scene Representations for 3D Reconstruction and Understanding
Shanghai AI Lab, 2023
slides
OpenScene: 3D Scene Understanding with Open Vocabularies
Peking University, 2023
Apple, 2023
Stability.ai, 2023
slides
How do NeRF and CLIP advance 3D Scene Reconstruction and Understanding
Chinese University of Hong Kong (CUHK) Shenzhen, 2023
Bosch Center for Artificial Intelligence (BCAI), 2023
slides
Large-Scale 3D Scene Reconstruction with NeRF
Stanford University, 2022
slides
Towards Practical Applications of NeRF
Adobe Research, 2022
slides
Neural Scene Representations for 3D Reconstruction
University of Basel, 2022
slides
Shape As Points: A Differentiable Poisson Solver
Talking Papers Podcast, 2022
video | podcast
Shape As Points: A Differentiable Poisson Solver
Graphics And Mixed Environment Seminar (GAMES), 2021
slides | talk (in Chinese)
Towards Practical Applications of NeRF
Graphics And Mixed Environment Seminar (GAMES), 2021
slides | talk (in Chinese)
Selected Projects

3D Textured Shape Recovery with Learned Geometric Priors
Lei Li, Zhizheng Liu, Weining Ren, Liudi Yang, F. Wang, Marc Pollefeys, Songyou Peng
Shape Recovery from Partial Textured 3D Scans (SHARP), 2022
leaderboard | arxiv | code

1st place in reconstructing partial textured objects and 2nd overall.

A Deep Network for Arousal-Valence Emotion Prediction with Acoustic-Visual Cues
Songyou Peng, Le Zhang, Yutong Ban, Meng Fang, Stefan Winkler
IJCNN One-Minute Gradual (OMG) Emotion Behavior Challenge, 2018
leaderboard | arxiv | code

1st for vision-only arousal/valence prediction and 2nd for overall valence prediction.

A Hybrid SLAM and Object Recognition System for Pepper Robot
Songyou Peng*, Kaisar Kushibar*, Paola Ardon*
VIBOT Robotics Project, 2016
arxiv | video | code

Visual SLAM combined with object recognition on the Pepper robot.

Teaching
Teaching Assistant (Lead), 3D Vision, Spring 2023
Teaching Assistant, Computer Vision, Fall 2022
Teaching Assistant (Lead), 3D Vision, Spring 2022
Teaching Assistant, Deep Learning for Computer Vision: Seminal Work, Spring 2022
Teaching Assistant, 3D Vision, Spring 2020
Teaching Assistant, Deep Learning for Computer Vision: Seminal Work, Spring 2020

Teaching Assistant, Deep Learning, Winter 2020/2021



Academic Services
  • Area Chair: 3DV'24
  • Workshop Organizer: OpenSUN3D at ICCV'23
  • Conference Reviewer: CVPR, ICCV, ECCV, ICLR, NeurIPS, SIGGRAPH, SIGGRAPH Asia
  • Journal Reviewer: TPAMI, CVIU
