Tour in Picture
using a spidery mesh interface to make animation from a single image

TUM CV Challenge SS24


Wenbo Ji, Xiang Ji, Hongru Li, Yuming Li, Shilin Zhang

Group 31
Technical University of Munich

Motivation


Building on the "Tour into the Picture" (TIP) approach [2], we aim to develop automatic algorithms that infer two key structures from a single 2D image: the regular, program-like textures or patterns on 2D planes, and the 3D position of those planes within the scene.

For example, from the single image of a metro station below, we can infer the camera pose, partition the image into distinct planes (walls, floor, ceiling, and far plane), and recognize repeated patterns.

This method enables flexible image editing, such as inpainting, moving the camera, and extending the image space. All of these operations require a deep understanding of the scene's 3D structure and real-time rendering of spatially consistent views.


Introduction


The project aims to develop a graphical user interface (GUI) that allows users to extract a simple scene model from a single 2D image, facilitating easy animation and scene manipulation.

With our GUI, users can intuitively distinguish between foreground and background objects. The background geometry is approximated using simple polygons, forming a polyhedral model with the vanishing point at its base.
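The background partition described above can be sketched in a few lines. The following is a minimal illustration, not the project's actual code: it assumes the user has supplied a vanishing point and an axis-aligned rear-wall rectangle that strictly contains it, with image coordinates whose y axis points down. The names `border_hit` and `classify` are hypothetical.

```python
from typing import Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (x0, y0, x1, y1), assumed layout


def cross(a: Point, b: Point) -> float:
    """z-component of the 2D cross product of vectors a and b."""
    return a[0] * b[1] - a[1] * b[0]


def border_hit(vp: Point, corner: Point, width: float, height: float) -> Point:
    """Extend the ray from the vanishing point through a rear-wall corner
    until it reaches the image border (one radial line of the mesh)."""
    dx, dy = corner[0] - vp[0], corner[1] - vp[1]
    ts = []
    if dx != 0:
        ts.append(((width if dx > 0 else 0.0) - vp[0]) / dx)
    if dy != 0:
        ts.append(((height if dy > 0 else 0.0) - vp[1]) / dy)
    t = min(ts)  # first border reached along the ray
    return (vp[0] + t * dx, vp[1] + t * dy)


def classify(p: Point, vp: Point, rect: Rect) -> str:
    """Assign an image point to one of the five background regions."""
    x0, y0, x1, y1 = rect
    if x0 <= p[0] <= x1 and y0 <= p[1] <= y1:
        return "rear"
    q = (p[0] - vp[0], p[1] - vp[1])
    tl = (x0 - vp[0], y0 - vp[1])
    tr = (x1 - vp[0], y0 - vp[1])
    br = (x1 - vp[0], y1 - vp[1])
    bl = (x0 - vp[0], y1 - vp[1])
    # Each region is the angular sector between two neighbouring rays.
    sectors = [("ceiling", tl, tr), ("right", tr, br),
               ("floor", br, bl), ("left", bl, tl)]
    for name, a, b in sectors:
        if cross(a, q) >= 0 and cross(q, b) >= 0:
            return name
    raise ValueError("vanishing point must lie strictly inside the rectangle")


# Example values, chosen only for illustration.
vp = (400.0, 300.0)
rect = (300.0, 225.0, 500.0, 375.0)
top_left_hit = border_hit(vp, (300.0, 225.0), 800.0, 600.0)
```

Points that fall exactly on a radial line are assigned to the first matching sector. A real interface would also need to handle a vanishing point on or outside the rectangle.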

Specifying the vanishing point is also user-driven, ensuring that the virtual vanishing point aligns with the user's perception.

Finally, users can determine the proximity of objects in the scene, effectively setting camera parameters to position foreground objects as desired.
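One way to turn an object's image position into a depth is the standard pinhole relation for points on the ground. The sketch below is illustrative only and rests on assumptions not stated in this page: the optical axis is horizontal, the floor is flat, the horizon passes through the vanishing point, and the focal length is given in pixels. The function `floor_depth` and its parameters are hypothetical names.

```python
def floor_depth(y_row: float, horizon_y: float,
                focal_px: float, camera_height: float) -> float:
    """Depth of a floor point seen at image row y_row (y axis pointing down).

    A floor point at depth Z, lying camera_height below the camera, projects
    to row horizon_y + focal_px * camera_height / Z. Solving for Z gives the
    expression returned here.
    """
    dy = y_row - horizon_y
    if dy <= 0:
        raise ValueError("a floor point must lie below the horizon row")
    return focal_px * camera_height / dy


# Example values, chosen only for illustration.
near = floor_depth(y_row=500.0, horizon_y=300.0, focal_px=500.0, camera_height=1.6)
far = floor_depth(y_row=400.0, horizon_y=300.0, focal_px=500.0, camera_height=1.6)
```

Under these assumptions, a foreground object whose foot point sits lower in the image is closer to the camera, so `near` is smaller than `far`.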


Method


Overview of Our Method: Step 1: Data Selection, Step 2: Image Decomposition, Step 3: Fitting Perspective Projection, and Step 4: Camera Positioning.
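Step 3 is not detailed on this page. One common way to fit a perspective projection to a single plane is to estimate a homography from four point correspondences, for instance mapping the floor trapezoid in the image onto a rectangular texture. The sketch below shows that general technique in plain Python as an assumption about what such a step could look like, not as the project's implementation. The names `fit_homography` and `apply_homography` are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def solve_linear(A: List[List[float]], b: List[float]) -> List[float]:
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                for c in range(col, n + 1):
                    M[r][c] -= factor * M[col][c]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit_homography(src: Sequence[Point], dst: Sequence[Point]) -> List[float]:
    """Return [h11, h12, h13, h21, h22, h23, h31, h32] with h33 fixed to 1."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        b.append(u)
        A.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        b.append(v)
    return solve_linear(A, b)


def apply_homography(H: Sequence[float], p: Point) -> Point:
    """Map a point through the homography, including the perspective divide."""
    x, y = p
    w = H[6] * x + H[7] * y + 1.0
    return ((H[0] * x + H[1] * y + H[2]) / w,
            (H[3] * x + H[4] * y + H[5]) / w)


# Illustrative floor trapezoid (rear-left, rear-right, front-right, front-left)
# mapped onto a 100 x 200 texture rectangle.
floor_src = [(300.0, 375.0), (500.0, 375.0), (800.0, 600.0), (0.0, 600.0)]
floor_dst = [(0.0, 0.0), (100.0, 0.0), (100.0, 200.0), (0.0, 200.0)]
H = fit_homography(floor_src, floor_dst)
mapped = [apply_homography(H, p) for p in floor_src]
```

Fixing h33 to 1 keeps the system at eight unknowns, which is enough for four correspondences as long as no three of the points are collinear.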





Experimental Results


Custom Data - Chilli room


Complex simple - Simple room


Complex middle - Museum


Complex middle - Shopping mall


References


  
    [1] Zhiqing Cao, Xin Sun, and Jiaoying Shi.
    Tour into the picture using relative depth calculation.
    In Proceedings of the 2004 ACM SIGGRAPH International Conference on Virtual Reality Continuum and its Applications in Industry, pages 38–44, 2004.
    [2] Youichi Horry, Ken‐Ichi Anjyo, and Kiyoshi Arai.
    Tour into the picture: using a spidery mesh interface to make animation from a single image.
    In Proceedings of the 24th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH ’97, pages 225–232, USA, 1997. ACM Press/Addison‐Wesley Publishing Co.
    [3] Jian Liu, Kuangrong Hao, Huan Liu, and Yongsheng Ding.
    An improved algorithm based on TIP using a vanishing line.
    In 2013 IEEE Third International Conference on Information Science and Technology (ICIST), pages 546–549. IEEE, 2013.
    [4] Guihang Wang, Xuejin Chen, and Si Chen.
    Cut‐and‐fold: Automatic 3D modeling from a single image.
    In 2014 IEEE International Conference on Multimedia and Expo Workshops (ICMEW), pages 1–6, 2014.