- Benefit of DNN approximators
- Use distributed representations
- Universal function approximator
- Can potentially need fewer parameters to represent the same function (compared to a shallow net)
- Learn by SGD (see the sketch below)
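A minimal sketch of the last point (not from the lecture; the network size and target function are arbitrary choices): a small two-layer net trained by plain SGD to fit a nonlinear function.

```python
import torch
import torch.nn as nn

# Toy example: a 2-layer net as a function approximator, trained by SGD
# to fit y = sin(x). Architecture and target are arbitrary illustrative choices.
torch.manual_seed(0)
x = torch.linspace(-3, 3, 256).unsqueeze(1)   # inputs, shape (256, 1)
y = torch.sin(x)                              # regression targets

net = nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.SGD(net.parameters(), lr=1e-2)

for step in range(2000):
    loss = nn.functional.mse_loss(net(x), y)  # squared error over the batch
    opt.zero_grad()
    loss.backward()                           # gradients via backprop
    opt.step()                                # SGD weight update
```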
1. Convolutional Neural Nets (CNNs)
1-1. Fully connected neural net
- Space and time complexity $\uparrow$ (every input unit is connected to every hidden unit)
- Structure information $\downarrow$ (the input is flattened, so spatial structure is lost)
- Locality of information $\downarrow$
1-2. Convolutional NN
Receptive field: the region of the input that influences a given unit's output; it grows as convolutional layers are stacked (see the sketch below)
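A minimal PyTorch sketch (layer sizes are arbitrary, not from the notes) contrasting a fully connected layer with stacked 3x3 convolutions: the conv layers keep the 2-D structure, share weights across positions (far fewer parameters), and the receptive field grows with depth.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 1, 28, 28)                 # one 28x28 grayscale image

# Fully connected: flattens the image (spatial structure and locality lost);
# a single 128-unit hidden layer needs 28*28*128 + 128 = 100,480 parameters.
fc = nn.Linear(28 * 28, 128)

# Convolutional: a 3x3 kernel shared across positions -> 16*(1*3*3) + 16 = 160
# parameters, and the output keeps the 2-D layout of the input.
conv1 = nn.Conv2d(1, 16, kernel_size=3, padding=1)
conv2 = nn.Conv2d(16, 16, kernel_size=3, padding=1)
h = conv2(conv1(x))                           # shape stays (1, 16, 28, 28)

# Receptive field: a unit after conv1 sees a 3x3 input patch; after conv2 it
# sees a 5x5 patch -- stacking conv layers grows the receptive field.
print(sum(p.numel() for p in fc.parameters()),     # 100480
      sum(p.numel() for p in conv1.parameters()))  # 160
```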
2. Deep Q Learning
- Use function approximation (FA) to help scale up to making decisions in large domains
- Use a DNN to represent the state-action value function
RECALL: Q-learning with value function approximation updates the weights toward the TD target:

$$\Delta w = \alpha \left( r + \gamma \max_{a'} \hat{Q}(s', a'; w) - \hat{Q}(s, a; w) \right) \nabla_w \hat{Q}(s, a; w)$$

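A rough sketch of one such update with a neural-network approximator (names and shapes are assumed, not the lecture's code). Note the target is computed from the same weights being updated, which is the "non-stationary targets" issue discussed below.

```python
import torch

def q_learning_step(q_net, opt, s, a, r, s_next, done, gamma=0.99):
    """One semi-gradient Q-learning update on a single transition
    (assumed: q_net maps a state tensor to a vector of action values)."""
    q_sa = q_net(s)[a]                                 # Q_hat(s, a; w)
    with torch.no_grad():                              # treat the TD target as a constant
        target = r + gamma * (1.0 - done) * q_net(s_next).max()
    loss = (target - q_sa) ** 2
    opt.zero_grad()
    loss.backward()                                    # supplies the grad of Q_hat(s,a;w) wrt w
    opt.step()
```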
2-1. DQNs in Atari
- End-to-end learning of Q(s,a) from pixels s
- s: stack of raw pixels from the last 4 frames
- Output is Q(s,a) for each of the 18 joystick/button actions
- Reward is the change in score for that step
- Same architecture and hyperparameters across all games (a network sketch follows the aside below)

<aside>
👩🏼🏫 A different Q function and a different policy are learned for each game, but the point is that they didn't use a totally different architecture and hyperparameters for every single game to get it to work. A general architecture and set of hyperparameters is sufficient!
</aside>
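A rough PyTorch sketch in this spirit; the layer sizes follow the published Nature DQN network, but treat them as illustrative rather than the lecture's exact specification. Input is a stack of 4 preprocessed 84x84 frames; output is one Q-value per action.

```python
import torch
import torch.nn as nn

class DQN(nn.Module):
    """Q-network: last 4 frames in, one Q-value per joystick/button action out."""
    def __init__(self, n_actions=18):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(4, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 512), nn.ReLU(),
            nn.Linear(512, n_actions),        # Q(s, a) for every action a
        )

    def forward(self, x):                     # x: (batch, 4, 84, 84)
        return self.net(x)

q = DQN()
s = torch.zeros(1, 4, 84, 84)                 # dummy preprocessed state
print(q(s).shape)                             # torch.Size([1, 18])
```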
- Problems with Q-Learning with VFA
- Correlations between samples
- Non-stationary targets → there is no oracle giving a fixed target; the target changes over time as $w$ is updated
- DQN solves these by (a rough sketch follows this list):
- Experience replay (sample past transitions at random, breaking correlations)
- Fixed Q-targets (compute targets with an older, frozen copy of the weights)
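A rough sketch of both fixes together (buffer size, batch size, and optimizer are arbitrary choices; `DQN` is the network class sketched above): transitions go into a replay memory and are sampled uniformly at random, and TD targets are computed with a separate, periodically synced copy of the weights.

```python
import random
from collections import deque
import torch
import torch.nn as nn

buffer = deque(maxlen=100_000)                  # stores (s, a, r, s_next, done) tensor tuples

q_net, target_net = DQN(), DQN()                # DQN class from the sketch above
target_net.load_state_dict(q_net.state_dict())  # fixed Q-target network
opt = torch.optim.RMSprop(q_net.parameters(), lr=2.5e-4)

def train_step(gamma=0.99, batch_size=32):
    # Random minibatch from the replay memory breaks sample correlations.
    s, a, r, s2, done = map(torch.stack, zip(*random.sample(buffer, batch_size)))
    q_sa = q_net(s).gather(1, a.long().unsqueeze(1)).squeeze(1)
    with torch.no_grad():                       # targets come from the frozen weights
        target = r + gamma * (1.0 - done) * target_net(s2).max(dim=1).values
    loss = nn.functional.smooth_l1_loss(q_sa, target)
    opt.zero_grad()
    loss.backward()
    opt.step()

# Every C steps: target_net.load_state_dict(q_net.state_dict())
```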