→ Generalization problems in large state and state-action spaces

$$ v_{\pi}(s) \approx \hat{v}(s;w) $$

$$ q_{\pi}(s,a) \approx \hat{q}(s,a;w) $$
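
The generalization these approximations provide can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the 1000-state integer state space, the two-component feature map `features`, and the step size. The point is only that a parametric $\hat{v}(s;w)$ shares weights across states, so an update at one state also changes the estimate at states that were never visited.

```python
# Hypothetical setup: states are the integers 0..999, described by 2 features.
def features(s, n_states=1000):
    """Assumed feature map: a bias term and the normalized state index."""
    return [1.0, s / (n_states - 1)]

def v_hat(s, w):
    """Approximate value as the dot product of features and weights."""
    return sum(x_i * w_i for x_i, w_i in zip(features(s), w))

w = [0.0, 0.0]            # 2 parameters instead of a 1000-entry table
before = v_hat(501, w)    # estimate at state 501 before any update

# One gradient step toward a target value of 1.0, applied at state 500 only.
alpha, target, s = 0.5, 1.0, 500
error = target - v_hat(s, w)
w = [w_i + alpha * error * x_i for w_i, x_i in zip(w, features(s))]

after = v_hat(501, w)     # state 501 was never updated directly, yet it moved
```

A lookup table would have left state 501 unchanged. Here the shared weights carry the update over to it, which is the intended generalization, and also the reason a poor feature map can spread errors across states.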

<aside> 👩🏼‍🏫 Trade-off: representational capacity vs. memory, computation, and experience (data) required

</aside>

→ Function approximator's representational capacity $\uparrow$: the memory, computation, and data it requires $\uparrow$, accuracy $\uparrow$

1. Linear Feature representation

VFA for policy evaluation with an oracle

Model-free VFA policy evaluation

Feature vector
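
For reference, one common way to write the feature vector and the linear approximator built on it is given below. The dimension $d$ and the component names $x_j$ are notation assumed here, not taken from these notes.

$$ x(s) = \big(x_1(s), x_2(s), \dots, x_d(s)\big)^{\top}, \qquad \hat{v}(s;w) = x(s)^{\top} w = \sum_{j=1}^{d} x_j(s)\, w_j $$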

<aside> 👩🏼‍🏫 A state representation with partial aliasing is not Markov

</aside>
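
A small sketch of partial aliasing, under assumed details: the 5-cell corridor, the wall-detecting sensor in `features`, and the `step_right` dynamics are all hypothetical and chosen only to make the effect visible. Several distinct cells produce the same feature vector, and the same action taken from two aliased cells leads to different next features, so the features alone do not determine what happens next.

```python
# Hypothetical corridor with cells 0..4.
# Assumed sensor: reports only whether a wall is directly to the left or right.
def features(cell, n_cells=5):
    return (int(cell == 0), int(cell == n_cells - 1))

def step_right(cell, n_cells=5):
    """Assumed deterministic dynamics: move one cell right, stop at the wall."""
    return min(cell + 1, n_cells - 1)

# Cells 1, 2, and 3 are aliased: the sensor cannot tell them apart.
aliased = features(1) == features(2) == features(3)

# Same features, same action, different next features.
next_from_2 = features(step_right(2))   # still mid-corridor
next_from_3 = features(step_right(3))   # reached the right wall
markov_in_features = next_from_2 == next_from_3
```

Because `markov_in_features` is false, predicting the next observation requires knowing more than the current features, for example how many steps have already been taken.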

Linear VFA for prediction with an oracle
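
A minimal sketch of this setting, assuming an oracle that returns the true $v_{\pi}(s)$ for any queried state. With a linear $\hat{v}(s;w) = x(s)^{\top} w$, stochastic gradient descent on the squared error $(v_{\pi}(s) - \hat{v}(s;w))^2$ gives the update $w \leftarrow w + \alpha\,(v_{\pi}(s) - \hat{v}(s;w))\,x(s)$. The feature map, the linear form of `oracle_value`, the step size, and the uniform state sampling are all assumptions for illustration.

```python
import random

def features(s):
    """Assumed feature map x(s): a bias term plus the raw state index."""
    return [1.0, float(s)]

def v_hat(s, w):
    """Linear value estimate x(s)^T w."""
    return sum(x_i * w_i for x_i, w_i in zip(features(s), w))

def oracle_value(s):
    """Hypothetical oracle for v_pi(s); deliberately linear in the features."""
    return 2.0 + 0.5 * s

def fit_with_oracle(states, alpha=0.05, n_steps=5000, seed=0):
    """SGD on the squared error between the oracle value and v_hat."""
    rng = random.Random(seed)
    w = [0.0, 0.0]
    for _ in range(n_steps):
        s = rng.choice(states)                  # sample a state uniformly
        error = oracle_value(s) - v_hat(s, w)   # prediction error at s
        x = features(s)
        w = [w_i + alpha * error * x_i for w_i, x_i in zip(w, x)]
    return w

w = fit_with_oracle(states=[0, 1, 2, 3, 4])
```

Since the assumed oracle is exactly representable by these features, the weights approach `[2.0, 0.5]`. In practice no such oracle exists, which is why model-free methods replace `oracle_value(s)` with a sampled return or a bootstrapped target.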
