The 1st Video Scene Parsing in the Wild Challenge Workshop

ICCV 2021, Montreal, Canada


Scene parsing is one of the main problems in computer vision. It aims to recognize the semantics of each pixel in a given image. Several image-based datasets have recently been collected to evaluate scene parsing approaches. However, since the real world is dynamic rather than static, learning to parse scenes in video is more reasonable and practical for realistic applications. Although remarkable progress has been made in image-based scene parsing, few works address video scene parsing, mainly because suitable benchmarks are lacking. To advance the scene parsing task from images to videos, we present a new dataset (the VSPW dataset) and a competition in this workshop, targeting the challenging yet practical task of Video Scene Parsing in the Wild (VSPW).

Video scene parsing aims to assign pre-defined semantic labels to the pixels of every frame in a given video, which brings new challenges compared with image semantic segmentation. One main challenge of the task is how to leverage temporal information for high predictive accuracy. We expect participants to achieve higher accuracy than image-based semantic segmentation methods.
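To make "accuracy" over a video concrete, here is a minimal sketch of mean intersection-over-union (mIoU) accumulated across all frames of a clip. mIoU is a common semantic segmentation metric and is used here only as an illustrative assumption; the official challenge metrics are defined on the challenge page. The function name `video_miou`, the nested-list frame representation, and the ignore label value 255 are all hypothetical choices for this example.

```python
from typing import Sequence

# Assumption: a frame is an H x W grid of integer class ids.
Frame = Sequence[Sequence[int]]


def video_miou(
    pred_frames: Sequence[Frame],
    gt_frames: Sequence[Frame],
    num_classes: int,
    ignore_label: int = 255,  # hypothetical "unlabeled" id, skipped in scoring
) -> float:
    """Mean IoU over classes, with pixel counts pooled over all frames.

    Predictions are assumed to lie in [0, num_classes).
    """
    inter = [0] * num_classes  # per-class intersection pixel counts
    union = [0] * num_classes  # per-class union pixel counts

    for pred, gt in zip(pred_frames, gt_frames):
        for pred_row, gt_row in zip(pred, gt):
            for p, g in zip(pred_row, gt_row):
                if g == ignore_label:
                    continue  # unlabeled ground-truth pixels do not count
                if p == g:
                    inter[g] += 1
                    union[g] += 1
                else:
                    # A mismatch enlarges the union of both classes involved.
                    union[g] += 1
                    union[p] += 1

    # Average only over classes that appear in predictions or ground truth.
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0
```

Pooling counts over the whole clip, rather than averaging per-frame scores, keeps small or briefly visible classes from dominating the result. This sketch does not measure temporal consistency, which a video-specific metric would also need to capture.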

Invited Speakers

Liang-Chieh (Jay) Chen

Research Scientist at Google AI

Raquel Urtasun

Professor at University of Toronto

Hengshuang Zhao

Postdoctoral Researcher at University of Oxford

Federico Perazzi

Research Scientist at Facebook

Call for Papers

We are soliciting high-quality papers on the topics listed below. Papers should be 4-8 pages long (excluding references) and follow the standard ICCV formatting instructions and template. All submissions must be anonymous. Accepted papers will appear in the IEEE/CVF proceedings. The workshop accepts both challenge papers and regular papers. Any paper on the topics below is welcome, so you can submit a paper without taking part in our challenge.

Topics of Interest

The topics of interest include (but are not limited to):

  • Semantic segmentation for images/videos

  • Video object/instance segmentation

  • Efficient computation for video scene parsing

  • Object tracking

  • Semi-supervised recognition in videos

  • New metrics to evaluate the quality of video scene parsing results

  • Real-world video applications, including autonomous driving, indoor robotics, visual navigation, etc.

Submission Deadline: July 25 23:59 PST

Author Notification: August 6 23:59 PST

Camera ready due: August 13 23:59 PST

Submission via CMT:


Welcome to the 1st Video Scene Parsing in the Wild Challenge. The challenge encourages participants to develop temporal methods for better segmentation performance. The competition will run on the CodaLab platform. Click on this link to access our challenge. Participants need to register through the platform, where they can submit their predictions on the test data and get real-time feedback on the leaderboard. The development and test phases open and close automatically according to the defined schedule. Teams that place in the top 3 at the end of the challenge will submit a workshop paper introducing their method and will present it at the workshop.

For details about the challenge dataset and rules, please refer to this link.

Important Dates:

May 20, 2021 Phase 1: Development Phase starts.

Aug. 5, 2021 Phase 2: Final Phase starts. Phase 1 is automatically closed.

Aug. 8, 2021 Phase 2 ends. Deadline for submitting the final predictions over the test data.


8:30 AM Chairs’ opening remarks

8:45 AM Raquel Urtasun, University of Toronto

9:15 AM Liang-Chieh (Jay) Chen, Google AI

9:45 AM Hengshuang Zhao, University of Oxford

10:15 AM Break

10:30 AM Federico Perazzi, Facebook

11:00 AM Challenge 1st place Winners’ Oral Presentation

11:15 AM Challenge 2nd place Winners’ Oral Presentation

11:30 AM Challenge 3rd place Winners’ Oral Presentation

11:45 AM Award ceremony and concluding remarks


Yunchao Wei

University of Technology Sydney

Jiaxu Miao

University of Technology Sydney

Yu Wu

University of Technology Sydney

Yi Yang

University of Technology Sydney

Si Liu

Beihang University

Zhu Yi


Elisa Ricci

University of Trento

Cees Snoek

University of Amsterdam