- Initialise weights randomly
- Gaussian noise with mean = 0
- Small network
  - Fixed sigma is fine
    - E.g. 0.01
  - E.g. 8 layers
    - AlexNet
- Sigma too large
  - Training won't converge
- Sigma too small
  - Gradient won't propagate back through many layers

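A rough sketch of the too-large / too-small effect, assuming a stack of purely linear layers with no nonlinearity (a simplification that is not in the notes). The function `backprop_gradient_norm`, the width of 64 and the depth of 20 are hypothetical choices for illustration only.

```python
import math
import random


def backprop_gradient_norm(sigma, n_layers=20, width=64, seed=0):
    """Norm of a unit-length gradient after it is passed back through
    n_layers linear layers whose weights are drawn from N(0, sigma^2)."""
    rng = random.Random(seed)
    grad = [rng.gauss(0.0, 1.0) for _ in range(width)]
    scale = math.sqrt(sum(g * g for g in grad))
    grad = [g / scale for g in grad]  # start from a gradient of norm 1
    for _ in range(n_layers):
        # Fresh weight matrix for this layer: mean 0, standard deviation sigma.
        weights = [[rng.gauss(0.0, sigma) for _ in range(width)]
                   for _ in range(width)]
        # Backward pass of a linear layer: grad_in = W^T @ grad_out.
        grad = [sum(weights[i][j] * grad[i] for i in range(width))
                for j in range(width)]
    return math.sqrt(sum(g * g for g in grad))


# Assumed comparison points: a small fixed sigma, a width-scaled sigma, a large sigma.
for sigma in (0.01, 1.0 / math.sqrt(64), 1.0):
    norm = backprop_gradient_norm(sigma)
    print(f"sigma={sigma:.3f}  gradient norm after 20 layers: {norm:.2e}")
```

Under these assumptions each layer rescales the gradient norm by roughly `sigma * sqrt(width)`, so a fixed sigma of 0.01 shrinks the gradient towards zero over 20 layers, a sigma of 1 blows it up, and a sigma scaled to the layer width keeps it near its starting size.
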
## Xavier System
$$\sigma=\sqrt{\frac{2}{n_{in}+n_{out}}}$$
or
$$\sigma=\sqrt{2/n}$$
* Where $n=\text{filter size}\times n_{out}$
* And $n_{in}$ and $n_{out}$ refer to the number of image channels into and out of the layer
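
A minimal sketch of how the two choices of sigma might be computed for a convolutional layer. It assumes the common Glorot form $\sigma=\sqrt{2/(n_{in}+n_{out})}$ for the first rule, and it assumes "filter size" means the number of spatial weights per filter (9 for a 3×3 filter). The names `xavier_sigma`, `he_sigma` and `init_filters` are hypothetical helpers, not from the notes or any library; the $\sqrt{2/n}$ rule is the one usually credited to He et al.

```python
import math
import random


def xavier_sigma(n_in, n_out):
    """Assumed Glorot form: sigma = sqrt(2 / (n_in + n_out))."""
    return math.sqrt(2.0 / (n_in + n_out))


def he_sigma(filter_size, n_out):
    """sigma = sqrt(2 / n) with n = filter size * n_out, as in the notes."""
    n = filter_size * n_out
    return math.sqrt(2.0 / n)


def init_filters(sigma, filter_size, n_in, n_out, seed=0):
    """Draw zero-mean Gaussian weights, indexed as [n_out][n_in][filter_size]."""
    rng = random.Random(seed)
    return [[[rng.gauss(0.0, sigma) for _ in range(filter_size)]
             for _ in range(n_in)]
            for _ in range(n_out)]


# Assumed example layer: 3x3 filters, 64 channels in, 128 channels out.
filter_size, n_in, n_out = 3 * 3, 64, 128
print("Xavier sigma:", xavier_sigma(n_in, n_out))
print("sqrt(2/n) sigma:", he_sigma(filter_size, n_out))  # equals 1/24, about 0.0417
weights = init_filters(he_sigma(filter_size, n_out), filter_size, n_in, n_out)
```

Unlike a fixed sigma such as 0.01, both rules shrink sigma as the layer gets wider, which is what keeps the signal scale roughly constant from layer to layer.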