05

Convolution¶

We can compute the spatial size of the output volume as a function of the input volume size ($W$), the receptive field size of the Conv Layer neurons ($F$), the stride with which they are applied ($S$), and the amount of zero padding used ($P$) on the border. You can convince yourself that the correct formula for calculating how many neurons “fit” is given by $(W−F+2P)/S+1$.

Pooling¶

Accepts a volume of size $W_1 \times H_1 \times D_1$
Requires two hyperparameters:
their spatial extent $F$,
the stride $S$,
Produces a volume of size $W_2 \times H_2 \times D_2 $

where:

$W_2 = (W_1 - F)/S + 1$
$H_2 = (H_1 - F)/S + 1$
$D_2 = D_1$

Normalization¶

Fully-Connected¶

经典框架¶

LeNet：90 年代的卷积神经网络

AlexNet：更深的卷积神经网络

GoogLeNet：采用了 Inception Module

VGG：深度达到了 19 层，但训练的稳定性不是很好

ResNet：通过 skip connection 和残差模块学习残差，网络更深，分类效果更好

卷积神经网络可视化¶

权值可视化

遮挡

梯度上升

迁移学习¶

新数据集很小，但和原数据集相似度高¶

训练全连接分类器

新数据集很大，和原数据集相似度高¶

可以考虑训练整个网络

新数据集很小，并且和原数据集相似度低¶

还是单独训练一个 SVM 之类的吧

新数据集很大，但和原数据集相似度低¶

完全可以从头开始训练，但是从预训练模型开始训练是比较好的选择

最后更新: 2023-01-31