VSPW: A Large-scale Dataset for Video Scene Parsing in the Wild
A new dataset for video scene parsing.
251,632 pixel-level labeled frames
Pixel-level annotations are provided at 15 fps
Over 96% of the videos are high resolution, from 720p to 4K
Each video is a complete shot lasting 5 seconds on average