- [Feedforward](../Architectures.md)
- A single hidden layer can, in theory, approximate any continuous function
- Universal approximation theorem
- Each hidden layer can operate as a different feature extraction layer
- Lots of [weights](../Weight%20Init.md) to learn (counted in the sketch below)
- [Back-Propagation](Back-Propagation.md) is supervised
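
A minimal sketch of a forward pass through such an MLP, in NumPy. The layer sizes and the sigmoid activation are assumptions for illustration, not taken from the note; the parameter count shows where the lots-of-weights problem comes from.

```python
import numpy as np

def sigmoid(z):
    # Smooth squashing activation; a TLU would hard-limit instead
    return 1 / (1 + np.exp(-z))

def forward(x, layers):
    # Each layer is a (W, b) pair; one layer's output feeds the next
    for W, b in layers:
        x = sigmoid(W @ x + b)
    return x

rng = np.random.default_rng(0)
sizes = [4, 8, 8, 1]  # assumed architecture: 4 inputs, two hidden layers, 1 output
layers = [(rng.standard_normal((n_out, n_in)), np.zeros(n_out))
          for n_in, n_out in zip(sizes, sizes[1:])]

# Every connection plus one bias per unit has to be learned
print(sum(W.size + b.size for W, b in layers))  # (4*8+8) + (8*8+8) + (8*1+1) = 121
print(forward(rng.standard_normal(4), layers))
```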
![mlp-arch](../../../img/mlp-arch.png)
# Universal Approximation Theory
A finite [feedforward](../Architectures.md) MLP with a single hidden layer can, in theory, approximate any continuous function to arbitrary accuracy, given enough hidden units
- In practice such a network is not necessarily trainable with [BP](Back-Propagation.md): the theorem guarantees suitable weights exist but says nothing about how to find them
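
A sketch of the theorem's flavour, with assumed details throughout: sigmoid hidden units, random (untrained) hidden weights, a least-squares fit of only the output layer, and sin as the target function. Widening the single hidden layer drives the approximation error down without using BP at all.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-np.pi, np.pi, 200)[:, None]
y = np.sin(x)  # assumed target function

for n_hidden in (2, 10, 100):
    # Random fixed hidden layer of sigmoid units
    W = rng.standard_normal((1, n_hidden)) * 3
    b = rng.standard_normal(n_hidden) * 3
    H = 1 / (1 + np.exp(-(x @ W + b)))
    # Fit only the linear output layer by least squares (no BP involved)
    w_out, *_ = np.linalg.lstsq(H, y, rcond=None)
    err = np.max(np.abs(H @ w_out - y))
    print(f"{n_hidden:4d} hidden units -> max error {err:.3f}")
```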
![activation-function](../../../img/activation-function.png)
![mlp-arch-diagram](../../../img/mlp-arch-diagram.png)
## Weight Matrix
- Use matrix multiplication to compute a whole layer's output at once (see the sketch after this list)
- TLU (threshold logic unit) is a hard limiter
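
A minimal sketch of both bullets together, with assumed shapes and weight values: one matrix multiplication produces a whole layer's outputs, and the TLU hard-limits each weighted sum to 0 or 1.

```python
import numpy as np

def tlu(z):
    # Hard limiter: 1 when the weighted sum exceeds 0, else 0
    return np.where(z > 0, 1.0, 0.0)

# One matrix multiplication gives every unit's output for a whole batch.
# Shapes and values are assumptions: X is (batch, n_in), W is (n_in, n_out).
X = np.array([[0., 1., 1.]])
W = np.array([[ 1., -1.],
              [ 1.,  1.],
              [-1.,  1.]])
b = np.array([-1.5, -0.5])
print(tlu(X @ W + b))  # [[0. 1.]]
```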
![tlu](../../../img/tlu.png)
- $o_1$ to $o_4$ must all be 1 to overcome the $-3.5$ bias and force the output to 1 (checked below)
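
A quick exhaustive check of that claim, assuming unit weights on $o_1$ to $o_4$ as the figure suggests:

```python
from itertools import product

# TLU with assumed unit weights and bias -3.5: the weighted sum
# o1+o2+o3+o4 - 3.5 is positive only when all four inputs are 1
# (4 - 3.5 = 0.5 > 0, while any three give at most 3 - 3.5 = -0.5)
for o in product((0, 1), repeat=4):
    assert (sum(o) - 3.5 > 0) == (sum(o) == 4)
print("output is 1 only when o1..o4 are all 1")
```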
![mlp-non-linear-decision](../../../img/mlp-non-linear-decision.png)
- Can generate a non-linear [decision boundary](Decision%20Boundary.md)
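
A hand-wired illustration of that non-linearity; the weights are assumptions following the classic XOR construction, not taken from the note. Two TLU hidden units compute OR and NAND, the output unit ANDs them, and the result is a decision boundary that no single straight line can draw.

```python
import numpy as np

def tlu(z):
    # Hard limiter: 1 when the weighted sum exceeds 0, else 0
    return np.where(z > 0, 1.0, 0.0)

W1 = np.array([[ 1.,  1.],    # hidden unit 1: OR
               [-1., -1.]])   # hidden unit 2: NAND
b1 = np.array([-0.5, 1.5])
W2 = np.array([1., 1.])       # output unit: AND of the two hidden units
b2 = -1.5

for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    h = tlu(W1 @ np.array(x, dtype=float) + b1)
    print(x, "->", int(tlu(W2 @ h + b2)))  # 0, 1, 1, 0: XOR
```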