1. DeepLabv2


2. PSPNet

Pyramid Scene Parsing Network

Motivation

Mismatched Relations
Confusion Categories
Inconspicuous Classes


In an FCN, the real (empirical) receptive field is smaller than the theoretical one → the network struggles to capture global contextual information

Architecture


Average pooling is applied to the feature map at several scales, dividing it into sub-regions of different sizes
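
To make the sub-region idea concrete, here is a minimal sketch of the pooling step only, in plain Python on a single-channel map. The bin sizes 1, 2, 3 and 6 follow the configuration reported in the PSPNet paper; the function names and the 6x6 toy input are illustrative assumptions, and the 1x1 convolution, upsampling and concatenation that follow in the real module are omitted.

```python
def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)


def adaptive_avg_pool(fmap, bins):
    """Average-pool a 2D feature map (list of rows) into a bins x bins grid.

    Each output cell is the mean of one sub-region of the input.
    """
    h, w = len(fmap), len(fmap[0])
    pooled = []
    for i in range(bins):
        # Rows covered by the i-th bin.
        r0, r1 = (i * h) // bins, ceil_div((i + 1) * h, bins)
        row = []
        for j in range(bins):
            # Columns covered by the j-th bin.
            c0, c1 = (j * w) // bins, ceil_div((j + 1) * w, bins)
            cells = [fmap[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(cells) / len(cells))
        pooled.append(row)
    return pooled


# Toy 6x6 single-channel feature map with values 0..35 (assumed input).
fmap = [[r * 6 + c for c in range(6)] for r in range(6)]

# One pooled map per pyramid level; bins=1 is global average pooling.
pyramid = {bins: adaptive_avg_pool(fmap, bins) for bins in (1, 2, 3, 6)}
```

The coarsest level (`bins=1`) collapses the whole map into a single value, while finer levels keep progressively more spatial detail.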

Global Average Pooling


Global Average Pooling vs. Convolution
- A convolution captures local context around each position of the feature map, while Global Average Pooling summarizes global context over the entire map
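
The local versus global distinction can be checked with a small experiment: change one far-away value and see which output reacts. This is a sketch under assumptions, using plain Python, a hypothetical 5x5 single-channel map and a 3x3 averaging kernel standing in for a learned convolution.

```python
def conv3x3_at(fmap, kernel, r, c):
    """Response of a 3x3 kernel centred at (r, c), with zero padding."""
    h, w = len(fmap), len(fmap[0])
    total = 0.0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            if 0 <= rr < h and 0 <= cc < w:
                total += kernel[dr + 1][dc + 1] * fmap[rr][cc]
    return total


def global_avg_pool(fmap):
    """Mean over every position of the map."""
    values = [v for row in fmap for v in row]
    return sum(values) / len(values)


# Assumed toy input: a 5x5 map of ones and a 3x3 averaging kernel.
fmap = [[1.0] * 5 for _ in range(5)]
kernel = [[1.0 / 9] * 3 for _ in range(3)]

local_before = conv3x3_at(fmap, kernel, 1, 1)
global_before = global_avg_pool(fmap)

# Change a value in the far corner, outside the 3x3 window around (1, 1).
fmap[4][4] = 26.0

local_after = conv3x3_at(fmap, kernel, 1, 1)
global_after = global_avg_pool(fmap)
```

The convolution response at (1, 1) is unchanged because the modified value lies outside its window, whereas the globally pooled value moves from 1.0 to 2.0.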

3. DeepLabv3

The overall architecture is similar to DeepLabv2 except for the Atrous Spatial Pyramid Pooling (ASPP) module


ASPP gains two new branches: a 1x1 convolution and image-level pooling
Instead of summing the branch outputs, DeepLabv3 concatenates them
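
The difference between summing and concatenating branch outputs can be sketched as follows. This is a simplified, single-channel illustration in plain Python, not the actual DeepLabv3 implementation: the rates (1, 2, 3), the all-ones kernel, the 7x7 input and all function names are assumptions chosen to fit a tiny map (the paper uses much larger rates), and batch normalization and the 1x1 convolution that fuses the concatenated features are omitted.

```python
def atrous_conv3x3(fmap, kernel, rate):
    """3x3 atrous (dilated) convolution with zero padding; taps are `rate` apart."""
    h, w = len(fmap), len(fmap[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            for i in (-1, 0, 1):
                for j in (-1, 0, 1):
                    rr, cc = r + i * rate, c + j * rate
                    if 0 <= rr < h and 0 <= cc < w:
                        out[r][c] += kernel[i + 1][j + 1] * fmap[rr][cc]
    return out


def conv1x1(fmap, weight):
    """Single-channel 1x1 convolution: a per-position scaling."""
    return [[weight * v for v in row] for row in fmap]


def image_pooling(fmap):
    """Global average of the map, broadcast back to the input resolution."""
    h, w = len(fmap), len(fmap[0])
    mean = sum(sum(row) for row in fmap) / (h * w)
    return [[mean] * w for _ in range(h)]


# Assumed toy input: 7x7 map with values 0..48 and an all-ones 3x3 kernel.
fmap = [[float(r * 7 + c) for c in range(7)] for r in range(7)]
ones = [[1.0] * 3 for _ in range(3)]

branches = [conv1x1(fmap, 1.0)]                                     # 1x1 branch
branches += [atrous_conv3x3(fmap, ones, rate) for rate in (1, 2, 3)]  # atrous branches
branches.append(image_pooling(fmap))                                # image-level pooling

# Concatenation keeps every branch as its own channel (5 x 7 x 7) ...
concatenated = branches
# ... whereas summation merges them into one map (7 x 7).
summed = [[sum(b[r][c] for b in branches) for c in range(7)] for r in range(7)]
```

With concatenation, a later layer can weight each branch separately; with summation, the per-branch information is already merged before any further layer sees it.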

4. DeepLabv3+