1. Peak signal-to-noise ratio (PSNR)
- Commonly used to quantify reconstruction quality for images and video subject to lossy compression.
$$
\text{PSNR} = 10 \cdot \log_{10}\left( \frac{MAX_I^{2}}{MSE} \right)
$$
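- The ratio between:
- $MAX_I^2$: the maximum possible power of the signal, where $MAX_I$ is the maximum possible pixel value of the image (e.g. 255 for 8-bit images)
- The power of the corrupting noise (easily defined via the MSE) that affects the fidelity of the image's representation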
$$
MSE = \frac{1}{mn}\sum_{i=0}^{m-1}\sum_{j=0}^{n-1}[I(i,j)-K(i,j)]^2
$$
- $I$: the original $m \times n$ image
- $K$: its noisy approximation (e.g. the compressed or reconstructed image)
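
The two formulas above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function names `mse` and `psnr` and the `max_value` parameter are hypothetical, and `max_value=255.0` assumes 8-bit images.

```python
import numpy as np

def mse(reference, approximation):
    """Mean squared error between image I (reference) and image K (approximation)."""
    reference = np.asarray(reference, dtype=np.float64)
    approximation = np.asarray(approximation, dtype=np.float64)
    return np.mean((reference - approximation) ** 2)

def psnr(reference, approximation, max_value=255.0):
    """PSNR in dB; max_value plays the role of MAX_I (assumed 255 for 8-bit images)."""
    error = mse(reference, approximation)
    if error == 0:
        # Identical images: the noise power is zero, so the ratio is unbounded.
        return float("inf")
    return 10.0 * np.log10(max_value ** 2 / error)
```

Because the MSE sits in the denominator, a smaller reconstruction error gives a higher PSNR, and two identical images give an infinite PSNR.
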
2. Structural similarity index (SSIM)
- SSIM papers
- Used to compute the perceptual distance between a translated image and its ground truth (gt)
- $\uparrow$ similarity of luminance, contrast and structure of images → $\uparrow$SSIM
$$
SSIM(x,y) = \Bigl( l(x,y)\Bigr)^{\alpha} \Bigl( c(x,y)\Bigr)^{\beta} \Bigl( s(x,y)\Bigr)^{\gamma}
$$
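
As a rough sketch of how the three terms combine, the snippet below computes SSIM from global image statistics with $\alpha = \beta = \gamma = 1$. Several things here are assumptions not stated in these notes: the function name `global_ssim` is hypothetical, the luminance, contrast and structure terms use the common mean / standard deviation / covariance definitions with stabilizing constants $C_1 = (k_1 L)^2$, $C_2 = (k_2 L)^2$, $C_3 = C_2 / 2$ ($k_1 = 0.01$, $k_2 = 0.03$, $L$ the dynamic range), and the statistics are taken over the whole image rather than over local sliding windows as practical implementations usually do.

```python
import numpy as np

def global_ssim(x, y, max_value=255.0, k1=0.01, k2=0.03):
    """Single-window SSIM sketch with alpha = beta = gamma = 1 (assumed)."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)

    # Stabilizing constants (assumed conventional values).
    c1 = (k1 * max_value) ** 2
    c2 = (k2 * max_value) ** 2
    c3 = c2 / 2.0

    mu_x, mu_y = x.mean(), y.mean()
    sigma_x, sigma_y = x.std(), y.std()
    sigma_xy = np.mean((x - mu_x) * (y - mu_y))  # covariance of x and y

    luminance = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    contrast = (2 * sigma_x * sigma_y + c2) / (sigma_x ** 2 + sigma_y ** 2 + c2)
    structure = (sigma_xy + c3) / (sigma_x * sigma_y + c3)
    return luminance * contrast * structure
```

Comparing an image with itself makes all three terms equal to 1, so the score is 1; any mismatch in luminance, contrast or structure pulls the product below 1.
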
3. Inception Score (IS)
- IS papers
- Used to assess the quality of images created by a generative model
- $\uparrow$IS → $p_g$ is a “sharp and distinct” collection of images → $\uparrow$diversity, fidelity
$$
\text{IS} = \exp\left(\mathbb{E}_{x \sim p_g} D_{KL}(p(y|x)||p(y)) \right)
$$
$$
D_{KL}(p(y|x)||p(y))=\sum_{y} p(y|x)\log \left( \frac{p(y|x)}{p(y)} \right)
$$
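
A minimal sketch of the score, assuming the class probabilities $p(y|x)$ have already been obtained from a classifier (in practice the Inception model) and stacked into an array with one row per generated image. The function name `inception_score` and the `eps` smoothing term are hypothetical; the marginal $p(y)$ is estimated here as the average of $p(y|x)$ over the generated samples, and the KL divergence is summed over classes.

```python
import numpy as np

def inception_score(probs, eps=1e-12):
    """probs: array of shape (num_images, num_classes); each row is p(y|x)."""
    probs = np.asarray(probs, dtype=np.float64)

    # Marginal class distribution p(y), estimated by averaging over samples.
    marginal = probs.mean(axis=0, keepdims=True)

    # KL(p(y|x) || p(y)) for every image, summed over classes.
    # eps (an assumption of this sketch) avoids log(0).
    kl = np.sum(probs * (np.log(probs + eps) - np.log(marginal + eps)), axis=1)

    # Exponential of the expected KL divergence.
    return float(np.exp(kl.mean()))
```

With this estimator the score ranges from 1 (every image receives the same prediction, so $p(y|x) = p(y)$) up to the number of classes (each image is classified with full confidence and the classes are covered evenly).
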
- Inception Model Classification

- Fidelity
- High probability for a few select classes, low probability for remaining classes
- $p(y|x)$ → conditional distribution
- Low entropy