(* indicates equal first authorship; † indicates equal last authorship.)
VidBot: Learning Generalizable 3D Actions from In-the-Wild 2D Human Videos for Zero-Shot Robotic Manipulation
Hanzhi Chen, Boyang Sun, Anran Zhang, Marc Pollefeys, Stefan Leutenegger
CVPR, 2025
project page / paper / video / code
We present a framework to learn 3D affordance from in-the-wild human videos, enabling zero-shot robotic manipulation in real-world environments.
FuncGrasp: Learning Object-Centric Neural Grasp Functions from Single Annotated Example Object
Hanzhi Chen, Binbin Xu, Stefan Leutenegger
ICRA, 2024
project page / paper / video
We propose a framework to infer continuous grasp functions of unseen objects using only one annotated example.
Anthropomorphic Grasping with Neural Object Shape Completion
Diego Hidalgo Carvajal*, Hanzhi Chen*, Gemma Bettelani, Jaesug Jung, Melissa Zavaglia, Laura Busse, Abdeldjallil Naceri, Stefan Leutenegger†, Sami Haddadin†
RA-L, 2023
project page / paper / video
We design an anthropomorphic grasping system capable of manipulating previously unseen objects using single-view visual input.
TexPose: Neural Texture Learning for Self-Supervised 6D Object Pose Estimation
Hanzhi Chen, Fabian Manhardt, Nassir Navab, Benjamin Busam
CVPR, 2023
project page / paper / video / code
We re-formulate self-supervised 6D object pose estimation as two sub-optimization problems on texture learning and pose learning.
Attention meets Geometry: Geometry Guided Spatial-Temporal Attention for Consistent Self-Supervised Monocular Depth Estimation
Hanzhi Chen*, Daoyi Gao*, Patrick Ruhkamp, Nassir Navab, Benjamin Busam
3DV, 2021
project page / paper / video / code
We propose a self-supervised monocular depth estimator providing temporally coherent reconstruction.
1st Workshop on Urban Scene Modeling: Where Vision Meets Photogrammetry and Graphics
CVPR, 2024
workshop page