- Representing value functions by a lookup table (tabular representation)
- state → $V(s)$
- state-action pair → $Q(s,a)$
→ Generalization problems in large state and state-action spaces
- Prefer learning approximations
- $w$ : parameters (weights) of the function approximator
$$
v_{\pi}(s) \approx \hat{v}(s;w)
$$

$$
q_{\pi}(s,a) \approx \hat{q}(s,a;w)
$$

<aside>
👩🏼‍🏫 Trade-off: representation capacity vs. memory, computation, and experience
</aside>
→ As the function approximator's representational capacity increases, so do the memory, computation, and data required, and so does the attainable accuracy
1. Linear Feature representation
VFA for Policy estimation with oracle
- Assume we know $V^{\pi}(s)$ for all $s$ → want to fit a parameterized function that represents all the data accurately
- MSE as the loss between $V^{\pi}(s)$ and $\hat{V}(s;w)$
- Minimize with gradient descent (GD) or stochastic gradient descent (SGD)
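The steps above can be sketched as follows — a minimal illustration, assuming oracle values $V^{\pi}(s)$ are given and using a linear approximator $\hat{V}(s;w) = x(s)^{\top}w$ (the function name and toy data are hypothetical):

```python
import numpy as np

def sgd_vfa_with_oracle(features, oracle_values, alpha=0.05, epochs=500, seed=0):
    """Fit w so that x(s)^T w ~= V^pi(s), given oracle targets.

    features: (N, d) array, one feature vector x(s) per state
    oracle_values: (N,) array of true values V^pi(s)
    """
    rng = np.random.default_rng(seed)
    n, d = features.shape
    w = np.zeros(d)
    for _ in range(epochs):
        i = rng.integers(n)                   # sample a state
        x, target = features[i], oracle_values[i]
        v_hat = x @ w                         # current prediction
        # SGD step on 1/2 (V^pi(s) - v_hat)^2:  w += alpha * (target - v_hat) * x
        w += alpha * (target - v_hat) * x
    return w

# Toy check: 3 states with one-hot features, so w should converge to V^pi
X = np.eye(3)
V_pi = np.array([1.0, 2.0, 3.0])
w = sgd_vfa_with_oracle(X, V_pi)
```

With one-hot features the weights simply converge to the oracle values themselves; with richer features the same update finds the best linear fit.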
Model-free VFA policy estimation
- Don't have access to $V^{\pi}(s)$
- Estimate $V^{\pi}(s)$ and $Q^{\pi}(s,a)$ with Monte Carlo or TD methods, using data collected under a fixed policy
- In VFA, updating $V^{\pi}(s)$ or $Q^{\pi}(s,a)$ means updating the function approximator's parameters $w$
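A sketch of the model-free case: semi-gradient TD(0) policy evaluation with a linear VFA, run on a hypothetical 5-state random-walk chain (the environment, reward scheme, and function name are assumptions for illustration, not from the notes):

```python
import numpy as np

def td0_linear_vfa(num_episodes=5000, alpha=0.05, gamma=1.0, seed=0):
    """Semi-gradient TD(0) on a toy 5-state random walk: the agent moves
    left/right uniformly at random, gets +1 on the right terminal and 0
    on the left. True values are 1/6, 2/6, ..., 5/6."""
    rng = np.random.default_rng(seed)
    n_states = 5
    X = np.eye(n_states)              # one-hot feature vectors x(s)
    w = np.zeros(n_states)
    for _ in range(num_episodes):
        s = 2                         # start in the middle state
        while True:
            s_next = s + (1 if rng.random() < 0.5 else -1)
            if s_next < 0:            # left terminal: reward 0
                r, v_next, done = 0.0, 0.0, True
            elif s_next >= n_states:  # right terminal: reward +1
                r, v_next, done = 1.0, 0.0, True
            else:
                r, v_next, done = 0.0, X[s_next] @ w, False
            # TD target r + gamma * v(s') replaces the oracle value;
            # updating the estimate IS updating the parameters w
            td_error = r + gamma * v_next - X[s] @ w
            w += alpha * td_error * X[s]
            if done:
                break
            s = s_next
    return w

w = td0_linear_vfa()
```

The only change from the oracle version is the target: a bootstrapped TD target (or a Monte Carlo return) stands in for the unknown $V^{\pi}(s)$.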
Feature vector
- feature vector $x(s)$ to represent a state $s$
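For concreteness, a hypothetical hand-crafted feature vector for a two-dimensional state (the variable names and choice of features are illustrative, not from the notes):

```python
import numpy as np

def features(position, velocity):
    """Hand-crafted feature vector x(s) for a state s = (position, velocity):
    a bias term, the raw variables, and two simple nonlinear combinations."""
    return np.array([
        1.0,                  # bias
        position,
        velocity,
        position * velocity,  # interaction term
        position ** 2,        # quadratic term
    ])

x = features(0.5, -1.0)
```

The quality of the features determines how well a linear VFA can do: if two genuinely different states map to the same $x(s)$, they are aliased.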

<aside>
👩🏼‍🏫 With a state representation that has partial aliasing (distinct states sharing the same features), the problem is no longer Markov
</aside>
Linear VFA for prediction with oracle
- Represent the value function of a given policy as a linear combination of features: $\hat{V}(s;w) = x(s)^{\top}w$
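In the linear case the MSE objective and its gradient work out cleanly (a standard derivation, consistent with the loss above):

$$
J(w) = \mathbb{E}_{\pi}\left[\left(V^{\pi}(s) - x(s)^{\top}w\right)^2\right], \qquad \nabla_w \hat{V}(s;w) = x(s)
$$

$$
\Delta w = \alpha \left(V^{\pi}(s) - x(s)^{\top}w\right) x(s)
$$

So each SGD step moves $w$ along the feature vector, in proportion to the prediction error.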
