Qianxu Wang

I am a first-year PhD student at Cornell University, advised by Prof. Kuan Fang. Previously, I was fortunate to work with Prof. Yixin Zhu at Peking University, Prof. Jeannette Bohg at Stanford, and Prof. Leonidas J. Guibas at Stanford.

My long-term research goal is to achieve human-level robust sensorimotor coordination in robotics. I am also very interested in 3D Vision and Animation. My previous research has primarily focused on dexterous manipulation from a semantic perspective.

Currently, I am thinking about and exploring two key questions in manipulation:

What are the sources of knowledge for manipulation?

Shared information across datasets. Their cross-embodiment, cross-environment, and cross-quality nature makes robotic datasets unique compared to data in other fields such as vision and natural language. Defining a universal data format and unifying existing datasets, rather than solely collecting new ones, is a promising approach to fundamentally addressing data scarcity in robotics. I am eager to explore the structure of shared motion primitives and semantics in these datasets and to investigate how to integrate them to achieve semantic-aware and robust manipulation in the real world.
Shared foundations with scalable data sources. The vision domain offers rich semantic correspondences valuable for robotic perception, while natural language, as a natural carrier of reasoning and prompting, can enhance decision-making. I am excited to investigate the connections between robotics and scalable data sources by leveraging these shared foundations.

How can diverse sources of knowledge be effectively integrated?

Structured Policy Design. Current policies (e.g., in IL/RL) directly map perception to the actions of a specific end-effector, an approach that (i) processes complex information without prioritization and (ii) limits the available data sources. In contrast, humans first reason about interactions visually, then adapt during manipulation using closed-loop feedback from multiple modalities (e.g., tactile, acoustic). I am excited to explore the design of a structured manipulation policy, including how to integrate end-effector-agnostic action representations from diverse data sources and when to incorporate multi-modal perception and closed-loop control.

Feel free to reach out (and please do!) if you have any questions or comments about my research, or anything you’d like to discuss or share with me!

selected publications

arXiv

KITE: Decoupling Kinematics and Interaction for Zero-Shot Cross-Embodiment Manipulation

Qianxu Wang and Kuan Fang

arXiv preprint arXiv:2606.22113, 2026

CoRL 2024

Neural Attention Field: Emerging Point Relevance in 3D Scenes for One-Shot Dexterous Grasping

Qianxu Wang, Congyue Deng, Tyler Lum, and 5 more authors

Conference on Robot Learning (CoRL), 2024

ICLR 2024

SparseDFF: Sparse-View Feature Distillation for One-Shot Dexterous Manipulation

Qianxu Wang, Haotong Zhang, Congyue Deng, and 4 more authors

International Conference on Learning Representations (ICLR), 2024
