The 2nd Pixel-level Video Understanding in the Wild Challenge Workshop

18 June, 2023

CVPR 2023, Vancouver

Introduction

Pixel-level scene understanding is one of the fundamental problems in computer vision, aiming to recognize the object classes, masks, and semantics of each pixel in a given image. Since the real world is dynamic rather than static, learning to perform video semantic/panoptic segmentation is more reasonable and practical for realistic applications. To advance the semantic/panoptic segmentation task from images to videos, we present two large-scale datasets (VSPW [1] and VIPSeg [2]) and a competition in this workshop, targeting the challenging yet practical task of Pixel-level Video Understanding in the Wild (PVUW). The workshop also includes a call for workshop papers.

This workshop will cover, but is not limited to, the following topics:

● Semantic/panoptic segmentation for images/videos 

● Video object/instance segmentation

● Efficient computation for video scene parsing 

● Object tracking 

● Semi-supervised recognition in videos 

● New metrics to evaluate the quality of video scene parsing results 

● Real-world video applications, including autonomous driving, indoor robotics, visual navigation, etc.


[1] VSPW: A Large-scale Dataset for Video Scene Parsing in the Wild. CVPR 2021

[2] Large-scale Video Panoptic Segmentation in the Wild: A Benchmark. CVPR 2022

Challenges

The Pixel-level Video Understanding in the Wild (PVUW) challenge includes two tracks: a video semantic segmentation track and a video panoptic segmentation track.

Track 1: Video Semantic Segmentation (VSS) Track 

The video semantic segmentation task aims to recognize the semantic class of every pixel in all frames of a given video. To participate in Track 1, please visit this link.

Track 2: Video Panoptic Segmentation (VPS) Track

The video panoptic segmentation task aims to jointly predict object classes, bounding boxes, masks, instance ID tracking, and semantic segmentation in video frames. To participate in Track 2, please visit this link.


Important Dates: 


Call for Papers

Submission: We invite authors to submit unpublished papers (8-page CVPR format) to our workshop, to be presented at a poster session upon acceptance. All submissions will go through a double-blind review process. All contributions must be submitted (along with supplementary materials, if any) at this CMT link.

Accepted papers will be published in the official CVPR Workshops proceedings and the Computer Vision Foundation (CVF) Open Access archive.

Important Dates: 


Invited Speakers

Hengshuang Zhao

Assistant Professor

The University of Hong Kong

Yuhui Yuan

Senior Researcher

Microsoft Research Asia

Workshop Schedule

18 June 1:30 PM (PT)  Chairs' opening remarks

18 June 1:45 PM (PT)  Invited talk: Yuhui Yuan, Microsoft Research Asia

18 June 2:20 PM (PT)  Invited talk: Hengshuang Zhao, The University of Hong Kong

18 June 2:55 PM (PT)  Challenge 1st-place winners' oral presentation (VSS Track)

18 June 3:10 PM (PT)  Break

18 June 3:25 PM (PT)  Challenge 2nd-place winners' oral presentation (VSS Track)

18 June 3:40 PM (PT)  Challenge 3rd-place winners' oral presentation (VSS Track)

18 June 3:55 PM (PT)  Challenge 1st-place winners' oral presentation (VPS Track)

18 June 4:10 PM (PT)  Challenge 2nd-place winners' oral presentation (VPS Track)

18 June 4:25 PM (PT)  Challenge 3rd-place winners' oral presentation (VPS Track)

18 June 4:40 PM (PT)  Workshop paper presentation: Hidetomo Sakaino

Organizers

Jiaxu Miao

Zhejiang University

Zongxin Yang

Zhejiang University

Yunchao Wei

University of Technology Sydney

Yi Yang

Zhejiang University

Si Liu

Beihang University

Yi Zhu

Amazon

Elisa Ricci

University of Trento

Cees Snoek

University of Amsterdam