- [[Architectures|Feedforward]]
- A single hidden layer can in theory approximate any continuous function
- Universal approximation theorem
- Each hidden layer can operate as a different feature extraction layer
- Lots of [[Weight Init|weights]] to learn (see the parameter-count sketch after this list)
- [[Back-Propagation]] is supervised
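
A rough sense of "lots of weights": a fully connected layer with $n$ inputs and $m$ units has $n \times m$ weights plus $m$ biases. A minimal sketch; the layer sizes are made up for illustration:

```python
# Parameter count for a fully connected MLP (illustrative layer sizes).
layer_sizes = [784, 128, 64, 10]  # input, two hidden layers, output

total = 0
for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
    total += n_in * n_out  # one weight per (input, unit) pair
    total += n_out         # one bias per unit

print(total)  # 109386 parameters for this small network
```
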
![[mlp-arch.png]]
# Universal Approximation Theorem
A finite [[Architectures|feedforward]] MLP with one hidden layer can in theory approximate any continuous function to arbitrary accuracy, given enough hidden units
- In practice such a network is not necessarily trainable with [[Back-Propagation|BP]]; the theorem guarantees the weights exist, not that BP will find them
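
A rough illustration of the theorem, not from the original note: scikit-learn's `MLPRegressor` with a single hidden layer fitting a simple 1-D function. The library choice, layer width, and target function are assumptions for this sketch.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# One hidden layer of 50 tanh units approximating sin(x) on [-pi, pi].
rng = np.random.default_rng(0)
X = rng.uniform(-np.pi, np.pi, size=(1000, 1))
y = np.sin(X).ravel()

mlp = MLPRegressor(hidden_layer_sizes=(50,),  # a single hidden layer
                   activation="tanh",
                   max_iter=5000,
                   random_state=0)
mlp.fit(X, y)

X_test = np.linspace(-np.pi, np.pi, 9).reshape(-1, 1)
print(np.round(mlp.predict(X_test), 2))      # should be close to sin(X_test)
print(np.round(np.sin(X_test).ravel(), 2))
```
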
![[activation-function.png]]
![[mlp-arch-diagram.png]]
## Weight Matrix
- Each layer's output is computed with a matrix multiplication: weight matrix times input vector, plus bias, passed through the activation function
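
A minimal sketch of one layer's output as a matrix multiplication (NumPy; the layer sizes and the sigmoid activation are assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One layer: 3 inputs -> 4 units.
x = np.array([0.5, -1.0, 2.0])    # input vector
W = np.random.randn(4, 3) * 0.1   # weight matrix, one row of weights per unit
b = np.zeros(4)                   # one bias per unit

o = sigmoid(W @ x + b)            # matrix multiply, add bias, apply activation
print(o.shape)                    # (4,)
```
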
- A TLU (threshold logic unit) is a hard limiter: output 1 if the weighted sum plus bias is positive, otherwise 0
![[tlu.png]]
- $o_1$ to $o_4$ must all be 1 to overcome the $-3.5$ bias and force the output to 1
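
A quick check of that bias arithmetic (the unit weights on $o_1 \dots o_4$ are an assumption; only the $-3.5$ bias is from the note): the weighted sum only exceeds the bias when all four inputs are 1, so the TLU behaves as a 4-input AND.

```python
import numpy as np
from itertools import product

w = np.ones(4)   # assumed weights of 1 on o1..o4
bias = -3.5      # bias from the figure

def tlu(o):
    """Hard limiter: 1 if the weighted sum plus bias is positive, else 0."""
    return int(w @ o + bias > 0)

for o in product([0, 1], repeat=4):
    if tlu(np.array(o)):
        print(o)  # only (1, 1, 1, 1) fires: 4 - 3.5 = 0.5 > 0
```
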
![[mlp-non-linear-decision.png]]
- Can generate a non-linear [[Decision Boundary|decision boundary]]
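
A small sketch of a non-linear decision boundary: XOR is not separable by any single line, but an MLP with one hidden layer can fit it (scikit-learn, the layer size, and the `lbfgs` solver are assumptions):

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# XOR: a perceptron cannot separate these classes, an MLP can.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])

mlp = MLPClassifier(hidden_layer_sizes=(8,), activation="tanh",
                    solver="lbfgs", max_iter=1000, random_state=1)
mlp.fit(X, y)
print(mlp.predict(X))  # expect [0 1 1 0] once training converges
```
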