Andy Zeng

I am a PhD student in the Computer Science department at Princeton University, where I work on artificial intelligence, robotics, and computer vision. I am part of the Princeton Vision and Robotics Group, advised by Professor Thomas Funkhouser. Before that, I graduated from UC Berkeley with a Bachelor's degree, double majoring in Computer Science and Applied Mathematics.

My research focuses on robotics for vision: how robots, in their role as active explorers of a 3D world, can improve perception through a positive feedback loop. Inspired by the way a child interacts with the environment and learns from experience, I am interested in developing algorithms that enable intelligent systems to learn from their interactions with the physical world and autonomously acquire the perception and manipulation skills necessary to execute complex tasks.

CV  |  Google Scholar  |  LinkedIn  |  Twitter  |  Github


Invited Talks


Robotic Pick-and-Place of Novel Objects in Clutter with Multi-Affordance Grasping and Cross-Domain Image Matching
★ 1st place winning solution of the stow task at the Amazon Robotics Challenge 2017! ★
Andy Zeng, Shuran Song, Kuan-Ting Yu, Elliott Donlon, Francois R. Hogan, Maria Bauza, Daolin Ma, Orion Taylor, Melody Liu, Eudald Romo, Nima Fazeli, Ferran Alet, Nikhil Chavan Dafle, Rachel Holladay, Isabella Morona, Prem Qu Nair, Druck Green, Ian Taylor, Weber Liu, Thomas Funkhouser, Alberto Rodriguez
IEEE International Conference on Robotics and Automation (ICRA) 2018
Paper  •   Project Webpage  •   Code (Github)  •   BibTeX
Summary: one step closer to robots that can clean up messy bedrooms ... and recognize where you bought your boxers.

Im2Pano3D: Extrapolating 360° Structure and Semantics Beyond the Field of View
Shuran Song, Andy Zeng, Angel X. Chang, Manolis Savva, Silvio Savarese, Thomas Funkhouser
Under review at IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2018
Paper  •   Project Webpage  •   BibTeX
Summary: can neural nets infer what's behind you? Kinda. We tried.

Matterport3D: Learning from RGB-D Data in Indoor Environments
Angel Chang, Angela Dai, Thomas Funkhouser, Maciej Halber, Matthias Nießner, Manolis Savva, Shuran Song, Andy Zeng, Yinda Zhang
IEEE International Conference on 3D Vision (3DV) 2017
Paper  •   Project Webpage  •   Code (Github)  •   BibTeX
Summary: a massive dataset with 3D scans of rich people's houses, semantically labeled by poor grad students.

3DMatch: Learning Local Geometric Descriptors from RGB-D Reconstructions
Andy Zeng, Shuran Song, Matthias Nießner, Matthew Fisher, Jianxiong Xiao, Thomas Funkhouser
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017
CVPR Oral Presentation  •   Paper  •   Project Webpage  •   Code (Github)  •   BibTeX
Summary: don't toss your leftover data from SLAM. Use it for something interesting, like training 3D shape descriptors.

Semantic Scene Completion from a Single Depth Image
Shuran Song, Fisher Yu, Andy Zeng, Angel X. Chang, Manolis Savva, Thomas Funkhouser
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017
CVPR Oral Presentation  •   Paper  •   Project Webpage  •   SUNCG Dataset  •   Code (Github)  •   BibTeX
Summary: 3D scans only capture partial surfaces, so let's make them fully 3D with deep learning.

Multi-view Self-supervised Deep Learning for 6D Pose Estimation in the Amazon Picking Challenge
★ 3rd place winning solution at the Amazon Picking Challenge 2016! ★
Andy Zeng, Kuan-Ting Yu, Shuran Song, Daniel Suo, Ed Walker Jr., Alberto Rodriguez, Jianxiong Xiao
IEEE International Conference on Robotics and Automation (ICRA) 2017
Paper  •   Project Webpage  •   Code (Github)  •   BibTeX
Summary: how can we teach robots to recognize the locations and orientations of objects in shelves and bins?


Press Coverage




A re-organized collection of open-source code I've written that you may find useful for your own research.

Undergraduate Projects

Optical Music Recognition

Create an optical music recognition (OMR) system that automatically reads images of sheet music, interprets the melody using computer vision techniques, and generates a corresponding .mp3 music file.

Smarter Baxter Robot

Program the Baxter robot to interact with a human via drawings/writings on a small whiteboard. It can do many things, from solving math equations to evaluating Python expressions!

Seam Carving

Shrink images horizontally and/or vertically while preserving as much detail as possible.


Experimenting with Gaussian and Laplacian stacks and multi-resolution blending.

Iris Recognition

Exploring and implementing various computer vision techniques to obtain reasonable accuracy for iris verification and identification.


Produce color images from the digitized Prokudin-Gorskii glass plate images.

Digit Recognition

Implement an algorithm to obtain reasonable digit identification accuracy (error rates on the order of 0.5-3%) on the original MNIST handwritten digit dataset.

Image Stitching Part 1

Experiments with homographies and morphing/warping/blending techniques to stitch images together into a wide-angle panorama.

Image Stitching Part 2

Fully automated point correspondences for image stitching using Harris corners, ANMS, and RANSAC.


Quantifying texture.

Lightfield Camera

Capture an evenly spaced grid of images over a scene and perform simple shift/averaging operations to synthetically simulate cool effects like depth refocusing and aperture adjustments.
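
The shift/averaging idea can be illustrated with a small sketch. This is not the project's code: the function name `refocus`, the dictionary layout of the image grid, and the `alpha` parameter are all assumptions made for illustration, and shifts are rounded to whole pixels to keep it short. Each image is shifted in proportion to its camera's offset from the grid center, then all images are averaged; points whose parallax matches the chosen shift line up and appear sharp.

```python
def refocus(grid, alpha):
    """Shift-and-average refocusing over a grid of images (illustrative sketch).

    grid  : dict mapping integer camera positions (u, v) to 2D lists of floats,
            all with the same height and width (assumed layout)
    alpha : pixel shift per unit of camera offset (hypothetical parameter);
            changing it moves the synthetic focal plane
    """
    positions = list(grid)
    # Center of the camera grid; shifts are measured relative to it.
    u_c = sum(u for u, _ in positions) / len(positions)
    v_c = sum(v for _, v in positions) / len(positions)
    height = len(grid[positions[0]])
    width = len(grid[positions[0]][0])
    out = [[0.0] * width for _ in range(height)]
    for (u, v), image in grid.items():
        # Whole-pixel shift proportional to this camera's offset.
        dy = round(alpha * (u - u_c))
        dx = round(alpha * (v - v_c))
        for y in range(height):
            for x in range(width):
                # Clamp to the image border instead of wrapping around.
                sy = min(max(y + dy, 0), height - 1)
                sx = min(max(x + dx, 0), width - 1)
                out[y][x] += image[sy][sx]
    n = len(positions)
    return [[value / n for value in row] for row in out]
```

With `alpha = 0` this reduces to a plain average of all views, which blurs anything that moves between cameras. Averaging only the cameras near the grid center would be one way to mimic a smaller aperture.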

Semi-Autonomous Vehicles

Design and implement computer vision algorithms for a new type of car-safety system that uses a driver model to predict future driver steering and braking.

Bionic Exoskeleton

Design and develop computer vision and machine learning components to support concurrent human mechatronics research on bionic exoskeletons.

Anthropomorphic Hand

Research Project with Bay Area Intellectual Property Group, Patent Firm. Research assistant responsible for computer vision algorithms to enable real-world perception/modeling and path/grasp planning for a robot hand.


General search algorithms applied to help Pacman collect food efficiently.

Face Morphing

Generate animations that morph from one face to another.

Disparity Mapping

Explore a variety of computer vision algorithms for the purpose of computing feature correspondences to create a disparity map post-stereopsis and calibration.

Phong Illumination

Using the generic Phong Illumination Model to perform shading computations from scratch.


Evaluation search design: applying minimax, expectimax, alpha-beta pruning etc. to Pacman and a few ghosts.

Ray Tracer

Everyone needs to write a ray tracer at some point in their life... here's mine!


MDPs, value iteration, Q-learning, and other reinforcement learning algorithms written in Gridworld, then applied to Pacman and a simulated robot controller named Crawler.

Canny Edge Detection

Image pixel intensity derivatives and edge detection. Implement the Canny edge detection algorithm from scratch and compare it to state-of-the-art performance.

Uniform/Adaptive Tessellation

Converts input from a Bézier surface representation to a polygonal representation, applies tessellation, and then displays it.

Bayes' Nets and SMCs (PF)

"Pacman spends his life running from ghosts, but things were not always so. Legend has it that many years ago, Pacman's great grandfather Grandpac learned to hunt ghosts for sport. However, he was blinded by his power and could only track ghosts by their banging and clanging." Particle Filtering!

Inverse Kinematics

A system that computes the minimal change in joint angles of an arm (or multiple branching arms) needed to produce a desired change in endpoint position.
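
One common way to get a minimal joint-angle change is the least-norm Jacobian step, sketched below for a single planar arm. This is an assumption about the approach, not the project's implementation: the function names and the planar, single-chain setup are hypothetical, and the step is only accurate for small endpoint displacements. For a Jacobian J and desired displacement d, the smallest-norm joint update solving J Δθ = d is Δθ = Jᵀ (J Jᵀ)⁻¹ d.

```python
import math


def forward(angles, lengths):
    """Endpoint (x, y) of a planar arm; each angle is relative to the previous link."""
    x = y = total = 0.0
    for angle, length in zip(angles, lengths):
        total += angle
        x += length * math.cos(total)
        y += length * math.sin(total)
    return x, y


def jacobian(angles, lengths):
    """Rows (d x / d theta_j) and (d y / d theta_j) of the 2 x n endpoint Jacobian."""
    n = len(angles)
    cumulative, total = [], 0.0
    for angle in angles:
        total += angle
        cumulative.append(total)
    # Joint j rotates every link from j onward.
    row_x = [-sum(lengths[k] * math.sin(cumulative[k]) for k in range(j, n))
             for j in range(n)]
    row_y = [sum(lengths[k] * math.cos(cumulative[k]) for k in range(j, n))
             for j in range(n)]
    return row_x, row_y


def min_norm_step(angles, lengths, dx, dy):
    """Smallest-norm joint change whose linearized endpoint motion is (dx, dy)."""
    jx, jy = jacobian(angles, lengths)
    # Entries of the symmetric 2 x 2 matrix J J^T.
    a = sum(v * v for v in jx)
    b = sum(p * q for p, q in zip(jx, jy))
    d = sum(v * v for v in jy)
    det = a * d - b * b
    if abs(det) < 1e-12:
        raise ValueError("arm is at or near a singular configuration")
    # Solve (J J^T) w = (dx, dy), then map back with J^T.
    wx = (d * dx - b * dy) / det
    wy = (a * dy - b * dx) / det
    return [jx[i] * wx + jy[i] * wy for i in range(len(angles))]
```

Larger endpoint moves would typically be handled by repeating small steps, and branching arms would need a stacked Jacobian with one block per endpoint.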


Thoroughly exploring the mathematical models behind the concept of triangulation and stereopsis.

Tour Into the Picture

Using camera homography to create a simple, planar, 3D scene from a single photograph/painting.