<aside> 🧑‍🏫 Using Global Average Pooling (GAP) instead of the FC layers of a CNN allows us to use Class Activation Maps (CAMs) for discriminative localization!

</aside>

Class Activation Mapping

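  Classification output:

  1. GAP outputs the spatial average of the feature map of each unit in the last conv layer
  2. a weighted sum of the averages from step 1 is used to generate the class output

  CAM:

  1. the last conv layer outputs feature maps
  2. a weighted sum of the feature maps from step 1, with the same weights, is used to generate the CAM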

$f_k(x,y)$ → activation of unit $k$ in the last conv layer at $(x,y)$

$F_k = \sum_{x,y} f_k(x,y)$ → GAP output for unit $k$ (written without the constant averaging factor)

$S_c = \sum_k w^c_kF_k$ → input to the softmax for class $c$

→ $w^c_k$ indicates the importance of $F_k$ for class $c$
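
The link between the class score and the CAM follows from swapping the order of the two sums in $S_c$. A short derivation, where the symbol $M_c$ for the class activation map is introduced here for illustration and is not defined elsewhere in these notes:

$$
S_c = \sum_k w^c_k \sum_{x,y} f_k(x,y) = \sum_{x,y} \sum_k w^c_k f_k(x,y) = \sum_{x,y} M_c(x,y), \qquad M_c(x,y) = \sum_k w^c_k f_k(x,y)
$$

→ $M_c(x,y)$ indicates the importance of the activation at $(x,y)$ for class $c$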

$P_c = \frac{\exp(S_c)}{\sum_{c'} \exp(S_{c'})}$ → softmax output for class $c$
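
The definitions above can be sketched in a few lines of plain Python. This is only a toy illustration under stated assumptions: the function names, the nested-list layout (`feature_maps[k][y][x]`, `weights[c][k]`), and the 2×2 example values are all made up for this sketch, the bias term is ignored, and $F_k$ is computed as a plain sum to match the formula in these notes. A real model would use an array library and take the feature maps from the last conv layer.

```python
import math

def spatial_sum(feature_map):
    """F_k: sum of f_k(x, y) over all locations (averaging factor omitted, as in the notes)."""
    return sum(sum(row) for row in feature_map)

def class_scores(feature_maps, weights):
    """S_c = sum_k w^c_k * F_k for every class c, where weights[c][k] is w^c_k."""
    pooled = [spatial_sum(f) for f in feature_maps]
    return [sum(w * F for w, F in zip(w_c, pooled)) for w_c in weights]

def softmax(scores):
    """P_c = exp(S_c) / sum_c' exp(S_c'), shifted by the max for numerical stability."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def class_activation_map(feature_maps, w_c):
    """Weighted sum of the feature maps at each (x, y), using the class weights w^c_k."""
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    return [
        [sum(w * f[y][x] for w, f in zip(w_c, feature_maps)) for x in range(width)]
        for y in range(height)
    ]

# Hypothetical toy input: 2 units (k), each a 2x2 feature map, and 2 classes (c).
feature_maps = [
    [[1.0, 0.0], [0.0, 0.0]],  # unit 0 is active in the top-left corner
    [[0.0, 0.0], [0.0, 2.0]],  # unit 1 is active in the bottom-right corner
]
weights = [
    [1.0, 0.0],  # class 0 relies only on unit 0
    [0.0, 1.0],  # class 1 relies only on unit 1
]

scores = class_scores(feature_maps, weights)
probs = softmax(scores)
cam_class1 = class_activation_map(feature_maps, weights[1])
```

In this toy case the map for class 1 is non-zero only in the bottom-right corner, where unit 1 fires, and summing the map over all locations gives back the class 1 score. In practice the map is usually upsampled to the input image size before being overlaid on the image.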
