Fusheng-Ji · 浮生记

Tagged “#3d-vision”

On 3D, Video World Models, and the Approaching ImageNet Moment of Perception-Action Learning

How can perception-action learning reach its own ImageNet moment, and what role might 3D and video world models play in building causal, physically accurate representations of the world that are useful for action?

30 Mar 2026 · 7 min read · Wenbo Ji