Part 1

Activation Functions

Data Preprocessing

Weight Initialization

Batch Normalization

Babysitting the Learning Process

Hyperparameter Optimization

Part 2

Fancier Optimization

Regularization

Transfer Learning