- [[Architectures|Feedforward]]
- A single hidden layer can, in theory, approximate any function
    - Universal approximation theorem
- Each hidden layer can operate as a different feature extraction layer
- Lots of [[Weight Init|weights]] to learn
- [[Back-Propagation]] is supervised
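
To make "lots of weights" concrete, the sketch below counts the learnable parameters of a fully connected feedforward network, assuming each unit has one weight per input plus one bias. The function name `count_parameters` and the layer sizes are hypothetical and chosen only for illustration.

```python
def count_parameters(layer_sizes):
    """Count weights and biases in a fully connected feedforward net.

    layer_sizes lists the number of units per layer, input first.
    Each unit in a layer has one weight per unit in the previous
    layer plus one bias, hence (n_in + 1) * n_out per layer pair.
    """
    return sum(
        (n_in + 1) * n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# Hypothetical sizes: 784 inputs, one hidden layer of 128 units, 10 outputs.
# (784 + 1) * 128 + (128 + 1) * 10 = 100480 + 1290 = 101770
total = count_parameters([784, 128, 10])
print(total)
```

Even this small assumed network has over a hundred thousand parameters, and the count grows with the product of adjacent layer widths.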

![[mlp-arch.png]]

# Universal Approximation Theory

A finite [[Architectures|feedforward]] MLP with one hidden layer can, in theory, approximate any continuous function to arbitrary accuracy, given enough hidden units
- In practice, not trainable with [[Back-Propagation|BP]]
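
A minimal sketch of why one hidden layer is enough in principle, restricted to a 1-D input: the weights are set by hand rather than learned, so no back-propagation is involved. Each hidden unit is a steep sigmoid acting as a soft step, and its output weight is the change in the target function across one grid cell, giving a staircase that tracks the target. The names `build_one_hidden_layer` and `steepness`, the target `sin`, and the interval are assumptions for illustration, not part of the original notes.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def build_one_hidden_layer(f, a, b, n_hidden, steepness=10.0):
    """Hand-set a 1-hidden-layer net so its output tracks f on [a, b].

    Hidden unit i is a soft step centred on the midpoint of grid cell i.
    Its output weight is the change in f across that cell, so the sum of
    the active steps reproduces f at the grid points.
    """
    h = (b - a) / n_hidden
    k = steepness / h  # input weight, shared by all hidden units
    grid = [a + i * h for i in range(n_hidden + 1)]
    hidden_bias = [-k * (grid[i] + grid[i + 1]) / 2 for i in range(n_hidden)]
    out_weight = [f(grid[i + 1]) - f(grid[i]) for i in range(n_hidden)]
    out_bias = f(a)

    def net(x):
        hidden = [sigmoid(k * x + b_i) for b_i in hidden_bias]
        return out_bias + sum(w * o for w, o in zip(out_weight, hidden))

    return net


# Approximate sin on [-pi, pi] with 100 hidden units (assumed sizes).
net = build_one_hidden_layer(math.sin, -math.pi, math.pi, n_hidden=100)
xs = [-math.pi + i * 2 * math.pi / 1000 for i in range(1001)]
max_err = max(abs(net(x) - math.sin(x)) for x in xs)
print(f"max |net(x) - sin(x)| on the test grid: {max_err:.4f}")
```

The error shrinks as `n_hidden` grows, but the number of units needed for a given accuracy can become very large, which is consistent with the point that the result is theoretical rather than a practical training recipe.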

![[activation-function.png]]
![[mlp-arch-diagram.png]]

## Weight Matrix

- Layer output is computed by multiplying the weight matrix with the input vector
- A TLU (threshold logic unit) is a hard limiter

![[tlu.png]]

- $o_1$ to $o_4$ must all be 1 to overcome the -3.5 bias and force the output to 1

![[mlp-non-linear-decision.png]]

- Can generate a non-linear [[Decision Boundary|decision boundary]]
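
The points above can be sketched in plain Python: a layer is a matrix-vector product plus bias, passed through the hard limiter. The four-input TLU assumes unit weights (not stated in the notes, but consistent with needing all four inputs to beat the -3.5 bias) and fires only when the net input is strictly positive. The XOR network is an illustrative choice of weights showing a non-linear decision boundary; it is not taken from the figures.

```python
def hard_limiter(z):
    """TLU activation: output 1 when the net input is positive, else 0."""
    return 1 if z > 0 else 0


def layer_output(weights, bias, inputs):
    """One layer: (weights @ inputs + bias), then the hard limiter per unit.

    weights is a list of rows, one row of input weights per unit.
    """
    return [
        hard_limiter(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, bias)
    ]


# Single TLU, four inputs, assumed unit weights, bias -3.5:
# the sum of inputs must reach 4 to exceed the bias, so it acts as AND.
W_and = [[1, 1, 1, 1]]
b_and = [-3.5]
print(layer_output(W_and, b_and, [1, 1, 1, 1]))  # [1]
print(layer_output(W_and, b_and, [1, 1, 1, 0]))  # [0]

# Two layers of TLUs give a non-linear boundary, illustrated with XOR.
# Hidden units: OR (bias -0.5) and AND (bias -1.5); output: OR and not AND.
W_hidden = [[1, 1], [1, 1]]
b_hidden = [-0.5, -1.5]
W_out = [[1, -1]]
b_out = [-0.5]


def xor_net(x1, x2):
    hidden = layer_output(W_hidden, b_hidden, [x1, x2])
    return layer_output(W_out, b_out, hidden)[0]


for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, xor_net(x1, x2))
```

No single TLU can separate the XOR classes with one straight line, so the hidden layer is what makes the boundary non-linear here.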