- Residual networks
- 152 layers
- Skips every two layers
	- Residual block
- Later layers learning the identity function
	- Skips help
	- A deep network should be at least as good as a shallower one, since skips allow some layers to do very little
- Vanishing gradient
	- Skips provide shortcut paths for gradients
- Accuracy saturation
	- Adding more layers to a suitably deep network increases training error
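The identity-learning point above can be sketched numerically. Below is a minimal single-channel residual block in NumPy (function names and the toy input are hypothetical, not from ResNet code): two 3x3 convolutions form `F(x)`, and the skip adds `x` back elementwise. With zero conv weights, `F(x) = 0` and the block passes its input straight through, so "doing very little" costs the layers nothing.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def conv3x3(x, w):
    """Naive 3x3 'same' convolution on a single-channel 2D map."""
    padded = np.pad(x, 1)
    out = np.zeros_like(x)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(padded[i:i + 3, j:j + 3] * w)
    return out

def residual_block(x, w1, w2):
    """y = relu(F(x) + x): two 3x3 convs plus an identity shortcut."""
    f = conv3x3(relu(conv3x3(x, w1)), w2)
    return relu(f + x)  # elementwise addition of the skip path

# Zero weights -> F(x) = 0, so a non-negative input comes out unchanged:
# the identity function is trivial for the block to represent.
x = np.arange(16, dtype=float).reshape(4, 4)
zero = np.zeros((3, 3))
print(np.allclose(residual_block(x, zero, zero), x))  # True
```

The same addition is also why gradients get a shortcut: the derivative of `f + x` with respect to `x` contains an identity term, so the gradient reaches earlier layers even if `F`'s path attenuates it.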
# Design
- Skips across pairs of conv layers
	- Elementwise addition
- All layers use 3x3 kernels
- Spatial size halves each stage
- Number of filters doubles each stage
- Fully convolutional
	- No FC layers
- No pooling
	- Except at the end
- No dropout
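The halve-spatial / double-filters pattern can be sketched as a quick calculation (the 56x56 starting size and 64 starting filters are illustrative assumptions, not fixed by the note):

```python
# Walk the stages: spatial size halves while filter count doubles.
size, filters = 56, 64
stages = []
for stage in range(4):
    stages.append((size, filters))
    print(f"stage {stage}: {size}x{size}, {filters} filters")
    size //= 2
    filters *= 2
# stages -> [(56, 64), (28, 128), (14, 256), (7, 512)]
```

A consequence worth noting: a 3x3 conv costs roughly H x W x C_in x C_out operations, so halving H and W (4x fewer positions) while doubling both channel counts (4x more channel work) keeps the per-layer cost roughly constant across stages.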