vault backup: 2023-05-27 00:50:46
Affected files: .obsidian/graph.json .obsidian/workspace-mobile.json .obsidian/workspace.json STEM/AI/Neural Networks/Architectures.md STEM/AI/Neural Networks/CNN/CNN.md STEM/AI/Neural Networks/CNN/Examples.md STEM/AI/Neural Networks/CNN/FCN/FCN.md STEM/AI/Neural Networks/CNN/GAN/DC-GAN.md STEM/AI/Neural Networks/CNN/GAN/GAN.md STEM/AI/Neural Networks/CNN/Interpretation.md STEM/AI/Neural Networks/Deep Learning.md STEM/AI/Neural Networks/MLP/MLP.md STEM/AI/Neural Networks/SLP/Least Mean Square.md STEM/AI/Neural Networks/Transformers/Attention.md STEM/AI/Neural Networks/Transformers/Transformers.md STEM/img/feedforward.png STEM/img/multilayerfeedforward.png STEM/img/recurrent.png STEM/img/recurrentwithhn.png
Parent: 7052c8c915
Commit: acb7dc429e

STEM/AI/Neural Networks/Architectures.md (new file)
@@ -0,0 +1,23 @@
# Single-Layer Feedforward
- *Acyclic*
- Count the output layer only; no computation at the input layer

![[feedforward.png]]

# Multilayer Feedforward
- Hidden layers
- Extract higher-order statistics
- Global perspective
- Helpful with large input layer
- Fully connected
	- Every neuron is connected to every neuron in adjacent layers
- Below is a 10-4-2 network (a minimal code sketch follows the diagram)

![[multilayerfeedforward.png]]
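
A minimal NumPy sketch of the 10-4-2 fully connected multilayer feedforward network above; the sigmoid activation, random weights and variable names are illustrative assumptions, not part of the note.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

# 10-4-2 network: 10 inputs, one hidden layer of 4 neurons, 2 outputs.
# Fully connected: every neuron connects to every neuron in the adjacent layer.
W1 = rng.normal(size=(4, 10))   # hidden-layer weights
b1 = np.zeros(4)
W2 = rng.normal(size=(2, 4))    # output-layer weights
b2 = np.zeros(2)

def forward(x):
    """Single acyclic (feedforward) pass: input -> hidden -> output."""
    h = sigmoid(W1 @ x + b1)    # hidden layer extracts higher-order statistics
    y = sigmoid(W2 @ h + b2)    # output layer
    return y

x = rng.normal(size=10)         # one 10-dimensional input vector
print(forward(x))               # two output activations
```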

# Recurrent
- At least one feedback loop
- Below has no self-feedback

![[recurrent.png]]
![[recurrentwithhn.png]]

- Above has hidden neurons

STEM/AI/Neural Networks/CNN/CNN.md
@@ -14,14 +14,14 @@
- Double-digit % gain on ImageNet accuracy

# Fully Connected
[[MLP|Dense]]
- Move from convolutional operations towards vector output
- Stochastic drop-out
- Sub-sample channels and only connect some to [[MLP|dense]] layers

# As a Descriptor
- Most powerful as a deeply learned feature extractor
- [[MLP|Dense]] classifier at the end isn't fantastic
- Use an SVM to classify on features taken prior to the penultimate layer (sketched below)

![[cnn-descriptor.png]]
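
A hedged sketch of the "CNN as a descriptor" idea: assume the deep features have already been extracted from a pre-trained CNN (that step is omitted) and an SVM replaces the dense classifier. The feature size, labels and `SVC` settings are placeholder assumptions; requires scikit-learn.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
features = rng.normal(size=(200, 512))    # placeholder deep CNN descriptors
labels = rng.integers(0, 2, size=200)     # placeholder binary labels

clf = SVC(kernel="linear")                # linear SVM on deep features
clf.fit(features[:150], labels[:150])     # train the SVM instead of a dense head
print(clf.score(features[150:], labels[150:]))
```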

@@ -42,13 +42,13 @@ [[MLP|Dense]]
![[fine-tuning-freezing.png]]

# Training
- Validation & training [[Deep Learning#Loss Function|loss]]
	- Early
		- Under-fitting
		- Training not representative
	- Later
		- Overfitting
- Validation [[Deep Learning#Loss Function|loss]] can help adjust learning rate
	- Or indicate when to stop training (early-stopping sketch below)

![[under-over-fitting.png]]
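
A minimal early-stopping sketch driven by validation loss; the synthetic loss curves and the patience value are assumptions standing in for real per-epoch evaluation.

```python
import numpy as np

epochs = 50
t = np.arange(epochs)
train_loss = np.exp(-t / 10)                    # keeps falling
val_loss = np.exp(-t / 10) + 0.002 * t          # falls early, rises again later

best, bad_epochs, patience = float("inf"), 0, 5
for epoch in range(epochs):
    if val_loss[epoch] < best:                  # validation loss still improving
        best, bad_epochs = val_loss[epoch], 0
    else:                                       # validation loss worsening while
        bad_epochs += 1                         # training loss falls -> overfitting
        if bad_epochs >= patience:
            print(f"stop at epoch {epoch}")     # stop training (or lower the LR)
            break
```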

STEM/AI/Neural Networks/CNN/Examples.md
@@ -29,13 +29,13 @@
2015

- [[Inception Layer]]s
- Multiple [[Deep Learning#Loss Function|Loss]] Functions

![[googlenet.png]]

## [[Inception Layer]]
![[googlenet-inception.png]]

## Auxiliary [[Deep Learning#Loss Function|Loss]] Functions
- Two other SoftMax blocks (combined as sketched below)
- Help train a really deep network
	- Vanishing gradient problem
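
A sketch of how auxiliary loss functions can be combined: the total loss is the main SoftMax cross-entropy plus down-weighted cross-entropies from the two auxiliary heads. The 0.3 weight and the toy probability vectors are assumptions for illustration.

```python
import numpy as np

def cross_entropy(probs, label):
    return -np.log(probs[label])

label = 1
main_head = np.array([0.2, 0.7, 0.1])    # SoftMax output of the final classifier
aux_head_1 = np.array([0.3, 0.5, 0.2])   # auxiliary SoftMax partway through the network
aux_head_2 = np.array([0.25, 0.6, 0.15])

total_loss = (cross_entropy(main_head, label)
              + 0.3 * cross_entropy(aux_head_1, label)   # auxiliary heads inject gradient
              + 0.3 * cross_entropy(aux_head_2, label))  # deep in the network
print(total_loss)
```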

STEM/AI/Neural Networks/CNN/FCN/FCN.md
@@ -20,13 +20,13 @@ Contractive → [[UpConv]]
- Rarely from scratch
	- Pre-trained weights
- Replace final layers
	- [[MLP|FC]] layers
	- White-noise initialised
- Add [[upconv]] layer(s)
	- Fine-tune train
	- Freeze others
- Annotated GT images
- Can use summed per-pixel log [[Deep Learning#Loss Function|loss]] (sketched below)
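
A sketch of a summed per-pixel log loss against annotated ground-truth labels; the image size, class count and softmax values are toy assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
H, W, C = 4, 4, 3                                   # tiny image, 3 classes
logits = rng.normal(size=(H, W, C))
probs = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)  # per-pixel softmax
gt = rng.integers(0, C, size=(H, W))                # annotated ground-truth label map

# Sum of -log p(correct class) over every pixel
per_pixel_log_loss = -np.log(np.take_along_axis(probs, gt[..., None], axis=-1))
print(per_pixel_log_loss.sum())
```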

# Evaluation
![[fcn-eval.png]]

STEM/AI/Neural Networks/CNN/GAN/DC-GAN.md
@@ -12,11 +12,11 @@ Deep Convolutional [[GAN]]
- Train using Gaussian random noise for code
- Discriminator
	- Contractive
	- Cross-entropy [[Deep Learning#Loss Function|loss]]
	- Conv and leaky [[Activation Functions#ReLu|ReLu]] layers only
	- Normalised output via [[Activation Functions#Sigmoid|sigmoid]]

## [[Deep Learning#Loss Function|Loss]]
$$D(S,L)=-\sum_i L_i \log(S_i)$$
- $S$
	- e.g. $(0.1, 0.9)^T$ (worked example below)
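
A worked example of the loss above, $D(S,L)=-\sum_i L_i \log(S_i)$, using the note's example output $S=(0.1,0.9)^T$ and an assumed one-hot label $L=(0,1)^T$.

```python
import numpy as np

S = np.array([0.1, 0.9])   # network output (e.g. sigmoid/softmax scores)
L = np.array([0.0, 1.0])   # assumed one-hot target
D = -np.sum(L * np.log(S))
print(D)                   # -log(0.9) ≈ 0.105
```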

STEM/AI/Neural Networks/CNN/GAN/GAN.md
@@ -1,7 +1,7 @@
# Fully Convolutional
- Remove [[Max Pooling]]
	- Use strided [[upconv]]
- Remove [[MLP|FC]] layers
	- Hurts convergence in non-classification
- Normalisation tricks
	- Batch normalisation (sketched below)
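
A minimal sketch of batch normalisation: each channel is normalised to zero mean and unit variance over the batch, then rescaled. The batch shape, `gamma` and `beta` are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(8, 16))   # batch of 8, 16 channels
gamma, beta, eps = np.ones(16), np.zeros(16), 1e-5

mean = x.mean(axis=0)                              # per-channel batch statistics
var = x.var(axis=0)
x_hat = (x - mean) / np.sqrt(var + eps)            # normalised activations
y = gamma * x_hat + beta                           # learnable re-scaling
print(y.mean(axis=0).round(3), y.std(axis=0).round(3))
```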

STEM/AI/Neural Networks/CNN/Interpretation.md
@@ -6,8 +6,8 @@
![[am.png]]
- **Use trained network**
	- Don't update weights
- [[Architectures|Feedforward]] noise
- [[Back-Propagation|Back-propagate]] [[Deep Learning#Loss Function|loss]]
	- Don't update weights
	- Update image

@@ -17,4 +17,4 @@
- Prone to high frequency noise
- Minimise
	- Total variation
- $x^*$ is the best solution to minimise [[Deep Learning#Loss Function|loss]] (toy sketch below)
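
A toy sketch of activation maximisation with a total-variation penalty. The "network" here is a single fixed linear unit standing in for a trained CNN (an assumption); its weights are never updated, only the image, which starts as noise.

```python
import numpy as np

rng = np.random.default_rng(0)
H, W = 16, 16
w = rng.normal(size=(H, W))        # frozen "neuron" weights (stand-in for a trained net)
x = rng.normal(size=(H, W))        # start from noise and feed it forward
lr, lam = 0.1, 0.05                # step size and total-variation weight (assumed values)

for _ in range(200):
    # activation = np.sum(w * x); its gradient w.r.t. x is simply w
    dv = x[1:, :] - x[:-1, :]      # vertical neighbour differences
    dh = x[:, 1:] - x[:, :-1]      # horizontal neighbour differences
    grad_tv = np.zeros_like(x)     # gradient of TV(x) = sum(dv**2) + sum(dh**2)
    grad_tv[1:, :] += 2 * dv
    grad_tv[:-1, :] -= 2 * dv
    grad_tv[:, 1:] += 2 * dh
    grad_tv[:, :-1] -= 2 * dh
    x += lr * (w - lam * grad_tv)  # ascend the activation, damp high-frequency noise

print(np.sum(w * x))               # x* approaches the image maximising the activation
```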

STEM/AI/Neural Networks/Deep Learning.md
@@ -8,7 +8,7 @@ Objective Function
![[deep-loss-function.png]]

- Test accuracy worse than train accuracy = overfitting
- [[MLP|Dense]] = [[MLP|fully connected]]
- Automates feature engineering

![[ml-dl.png]]

STEM/AI/Neural Networks/MLP/MLP.md
@@ -1,4 +1,4 @@
- [[Architectures|Feedforward]]
- Single hidden layer can learn any function
	- Universal approximation theorem
- Each hidden layer can operate as a different feature extraction layer

@@ -8,7 +8,7 @@
![[mlp-arch.png]]

# Universal Approximation Theorem
A finite [[Architectures|feedforward]] MLP with 1 hidden layer can in theory approximate any continuous function on a bounded domain to arbitrary accuracy (illustrated below)
- In practice not trainable with [[Back-Propagation|BP]]

![[activation-function.png]]
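
A small empirical illustration of the theorem: a single-hidden-layer MLP fitted to $\sin(x)$ on a bounded interval. The hidden size, target function and scikit-learn `MLPRegressor` settings are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

X = np.linspace(-np.pi, np.pi, 400).reshape(-1, 1)
y = np.sin(X).ravel()

# One hidden layer of 50 tanh units, trained by gradient descent on squared error
mlp = MLPRegressor(hidden_layer_sizes=(50,), activation="tanh",
                   max_iter=5000, random_state=0)
mlp.fit(X, y)
print(np.max(np.abs(mlp.predict(X) - y)))   # worst-case error on the interval
```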

STEM/AI/Neural Networks/SLP/Least Mean Square.md
@@ -20,7 +20,7 @@ $$\frac{\partial \mathfrak{E}(w)}{\partial w(n)}=-x(n)\cdot e(n)$$
$$\hat{g}(n)=-x(n)\cdot e(n)$$
$$\hat{w}(n+1)=\hat{w}(n)+\eta \cdot x(n) \cdot e(n)$$

- Above is a feedback loop around the weight vector, $\hat{w}$ (toy implementation below)
- Behaves like low-pass filter
	- Pass low frequency components of error signal
- Average time constant of filtering action inversely proportional to learning-rate
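
A toy implementation of the update above, $\hat{w}(n+1)=\hat{w}(n)+\eta \, x(n) \, e(n)$; the unknown target weights, input statistics and learning rate are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
w_true = np.array([0.5, -0.3, 0.8])      # unknown system the filter should learn
w_hat = np.zeros(3)                      # adaptive weight vector w_hat(n)
eta = 0.05                               # learning rate

for n in range(2000):
    x = rng.normal(size=3)               # input vector x(n)
    d = w_true @ x                       # desired response d(n)
    e = d - w_hat @ x                    # error signal e(n)
    w_hat = w_hat + eta * x * e          # feedback loop around the weight vector

print(w_hat)                             # converges towards w_true (low-pass behaviour)
```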

STEM/AI/Neural Networks/Transformers/Attention.md
@@ -11,7 +11,7 @@
- Attention layer accesses all previous states and weighs them according to a learned measure of relevance (sketch below)
- Allows referring arbitrarily far back to relevant tokens
- Can be added to [[RNN]]s
- In 2016, a new type of highly parallelisable _decomposable attention_ was successfully combined with a [[Architectures|feedforward]] network
- Attention is useful in and of itself, not just with [[RNN]]s
- [[Transformers]] use attention without recurrent connections
- Process all tokens simultaneously
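
A minimal scaled dot-product attention sketch in NumPy: every token weighs all states by a relevance score and all tokens are processed simultaneously. Dimensions and the random $Q$, $K$, $V$ matrices are toy assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
tokens, d = 5, 8                          # 5 token states, dimension 8
Q = rng.normal(size=(tokens, d))          # queries
K = rng.normal(size=(tokens, d))          # keys
V = rng.normal(size=(tokens, d))          # values

scores = Q @ K.T / np.sqrt(d)             # relevance of every token to every other
weights = np.exp(scores)
weights /= weights.sum(axis=-1, keepdims=True)   # softmax over all positions
output = weights @ V                      # all tokens attended to in parallel
print(output.shape)                       # (5, 8)
```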

STEM/AI/Neural Networks/Transformers/Transformers.md
@@ -35,5 +35,5 @@
- Uses incorporated textual information to produce output
- Has attention to draw information from the output of previous decoders before drawing from encoders
- Both use [[attention]]
- Both use [[MLP|dense]] layers for additional processing of outputs
- Contain residual connections & layer norm steps (sketched below)
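
A small sketch of the residual connection and layer-norm step: a sub-layer output is added back to its input, then each token is normalised over its features. Shapes and the stand-in sub-layer output are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))                 # 5 tokens, model dimension 8
sublayer_out = rng.normal(size=(5, 8))      # stand-in for attention or dense output

y = x + sublayer_out                        # residual connection
mean = y.mean(axis=-1, keepdims=True)
std = y.std(axis=-1, keepdims=True)
y_norm = (y - mean) / (std + 1e-5)          # layer norm over each token's features
print(y_norm.mean(axis=-1).round(6))        # ≈ 0 per token
```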

Binary files added:
- img/feedforward.png (30 KiB)
- img/multilayerfeedforward.png (60 KiB)
- img/recurrent.png (30 KiB)
- img/recurrentwithhn.png (47 KiB)