<aside> 🧑🏫 Layer-wise Relevance Propagation brings explainability that can scale up to complex networks by propagating the prediction backward through the network using purposely designed propagation rules
</aside>
Spurious correlations in the data lead to bad learning
😕 How can we prevent this?
→ Propagate the prediction $f(x)$ backward through the neural network using purposely designed rules
$z_{jk}$ → models the extent to which neuron $j$ has contributed to making neuron $k$ relevant
✅ Because the relevance is normalized by $\sum_jz_{jk}$ while propagating from the back (close to the output) toward the front (close to the input), the rule satisfies a layer-wise conservation / global conservation property
$$ R_j = \sum_k \frac{z_{jk}}{\sum_jz_{jk}}R_k $$
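To see the conservation property, sum $R_j$ over the lower-layer neurons: for each $k$ the normalizing fractions sum to one, so the total relevance is passed on unchanged from layer to layer (a short check, ignoring any relevance absorbed by bias terms):

$$ \sum_j R_j = \sum_j\sum_k \frac{z_{jk}}{\sum_{j'}z_{j'k}}R_k = \sum_k\frac{\sum_j z_{jk}}{\sum_{j'}z_{j'k}}R_k = \sum_k R_k $$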

Deep rectifier network neurons
$$ a_k = \max\bigg(0,\sum_{0,j}a_jw_{jk}\bigg) $$
Substituting $z_{jk} = a_jw_{jk}$ gives the basic propagation rule (LRP-0)
$$ R_j = \sum_k \frac{a_jw_{jk}}{\sum_{0,j}a_jw_{jk}}R_k $$
Adding a small stabilizer $\epsilon$ to the denominator (LRP-$\epsilon$) keeps the rule numerically stable and dampens weak or contradictory contributions
$$ R_j = \sum_k \frac{a_jw_{jk}}{\epsilon + \sum_{0,j}a_jw_{jk}}R_k $$
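A minimal sketch of these two rules for a single fully connected ReLU layer, assuming NumPy; the array names `a`, `W`, `b`, `R_upper` and the `epsilon` parameter are illustrative, and setting `epsilon=0` recovers the first rule:

```python
import numpy as np

def lrp_dense_relu(a, W, b, R_upper, epsilon=1e-6):
    """Redistribute upper-layer relevance onto lower-layer activations
    for a layer computing a_k = max(0, sum_j a_j * w_jk + b_k).

    a       : (J,)   lower-layer activations a_j
    W       : (J, K) weights w_jk
    b       : (K,)   biases (the "0" index in the sums above)
    R_upper : (K,)   relevance R_k of the upper-layer neurons
    returns : (J,)   relevance R_j of the lower-layer neurons
    """
    # z_jk: how much neuron j contributed to neuron k
    z = a[:, None] * W                                  # shape (J, K)
    # denominator: total contribution flowing into each neuron k,
    # stabilized by epsilon with the sign of the denominator (LRP-epsilon)
    denom = z.sum(axis=0) + b                           # shape (K,)
    denom = denom + epsilon * np.where(denom >= 0, 1.0, -1.0)
    # neuron j receives the fraction z_jk / denom_k of R_k, summed over k
    return (z / denom) @ R_upper                        # shape (J,)
```

In a deep network this step would be applied layer by layer, starting from $R = f(x)$ at the output and ending with a relevance map over the input features.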