The 2nd Pixel-level Video Understanding in the Wild Challenge Workshop

18 June, 2023

CVPR 2023, Vancouver

Introduction

Pixel-level scene understanding is one of the fundamental problems in computer vision, aiming to recognize the object classes, masks, and semantics of each pixel in a given image. Since the real world is dynamic rather than static, learning to perform video semantic/panoptic segmentation is more reasonable and practical for realistic applications. To advance the semantic/panoptic segmentation task from images to videos, we present two large-scale datasets (VSPW [1] and VIPSeg [2]) and a competition in this workshop, targeting the challenging yet practical task of Pixel-level Video Understanding in the Wild (PVUW). The workshop also includes a call for workshop papers.

This workshop will cover, but is not limited to, the following topics:

● Semantic/panoptic segmentation for images/videos 

● Video object/instance segmentation

● Efficient computation for video scene parsing 

● Object tracking 

● Semi-supervised recognition in videos 

● New metrics to evaluate the quality of video scene parsing results 

● Real-world video applications, including autonomous driving, indoor robotics, visual navigation, etc.


[1] VSPW: A Large-scale Dataset for Video Scene Parsing in the Wild. CVPR 2021

[2] Large-scale Video Panoptic Segmentation in the Wild: A Benchmark. CVPR 2022

Challenges

The Pixel-level Video Understanding in the Wild (PVUW) challenge includes two tracks: a video semantic segmentation track and a video panoptic segmentation track.

Track 1: Video Semantic Segmentation (VSS) Track 

The video semantic segmentation task aims to recognize the semantic class of every pixel in all frames of a given video. To participate in Track 1, please visit this link.

Track 2: Video Panoptic Segmentation (VPS) Track

The video panoptic segmentation task aims to jointly predict object classes, bounding boxes, masks, instance ID tracking, and semantic segmentation in video frames. To participate in Track 2, please visit this link.


Important Dates: 


Call for Papers

Submission: We invite authors to submit unpublished papers (8-page CVPR format) to our workshop, to be presented at a poster session upon acceptance. All submissions will go through a double-blind review process. All contributions must be submitted (along with supplementary materials, if any) at this CMT link.

Accepted papers will be published in the official CVPR Workshops proceedings and the Computer Vision Foundation (CVF) Open Access archive.

Important Dates: 


Invited Speakers

Hengshuang Zhao

Assistant Professor

The University of Hong Kong

Yuhui Yuan

Senior Researcher

Microsoft Research Asia

Workshop Schedule

18 June 1:30 PM (PT)  Chairs' opening remarks

18 June 1:45 PM (PT)  Invited talk: Yuhui Yuan, Microsoft Research Asia

18 June 2:20 PM (PT)  Invited talk: Hengshuang Zhao, The University of Hong Kong

18 June 2:55 PM (PT)  Challenge 1st-place winners' oral presentation (VSS Track)

18 June 3:10 PM (PT)  Break

18 June 3:25 PM (PT)  Challenge 2nd-place winners' oral presentation (VSS Track)

18 June 3:40 PM (PT)  Challenge 3rd-place winners' oral presentation (VSS Track)

18 June 3:55 PM (PT)  Challenge 1st-place winners' oral presentation (VPS Track)

18 June 4:10 PM (PT)  Challenge 2nd-place winners' oral presentation (VPS Track)

18 June 4:25 PM (PT)  Challenge 3rd-place winners' oral presentation (VPS Track)

18 June 4:40 PM (PT)  Workshop paper presentation: Hidetomo Sakaino

Organizers

Jiaxu Miao

Zhejiang University

Zongxin Yang

Zhejiang University

Yunchao Wei

University of Technology Sydney

Yi Yang

Zhejiang University

Si Liu

Beihang University

Yi Zhu

Amazon

Elisa Ricci

University of Trento

Cees Snoek

University of Amsterdam