vault backup: 2023-05-26 18:29:17
Affected files: .obsidian/graph.json .obsidian/workspace-mobile.json .obsidian/workspace.json STEM/AI/Neural Networks/Activation Functions.md STEM/AI/Neural Networks/CNN/CNN.md STEM/AI/Neural Networks/CNN/Convolutional Layer.md STEM/AI/Neural Networks/CNN/Examples.md STEM/AI/Neural Networks/CNN/GAN/CycleGAN.md STEM/AI/Neural Networks/CNN/GAN/DC-GAN.md STEM/AI/Neural Networks/CNN/GAN/GAN.md STEM/AI/Neural Networks/CNN/GAN/StackGAN.md STEM/AI/Neural Networks/CNN/GAN/cGAN.md STEM/AI/Neural Networks/CNN/Inception Layer.md STEM/AI/Neural Networks/CNN/Max Pooling.md STEM/AI/Neural Networks/CNN/Normalisation.md STEM/AI/Neural Networks/CV/Data Manipulations.md STEM/AI/Neural Networks/CV/Datasets.md STEM/AI/Neural Networks/CV/Filters.md STEM/AI/Neural Networks/CV/Layer Structure.md STEM/AI/Neural Networks/Weight Init.md STEM/img/alexnet.png STEM/img/cgan-example.png STEM/img/cgan.png STEM/img/cnn-cv-layer-arch.png STEM/img/cnn-descriptor.png STEM/img/cnn-normalisation.png STEM/img/code-vector-math-for-control-results.png STEM/img/cvmfc.png STEM/img/cyclegan-results.png STEM/img/cyclegan.png STEM/img/data-aug.png STEM/img/data-whitening.png STEM/img/dc-gan.png STEM/img/fine-tuning-freezing.png STEM/img/gabor.png STEM/img/gan-arch.png STEM/img/gan-arch2.png STEM/img/gan-results.png STEM/img/gan-training-discriminator.png STEM/img/gan-training-generator.png STEM/img/googlenet-auxilliary-loss.png STEM/img/googlenet-inception.png STEM/img/googlenet.png STEM/img/icv-pos-neg-examples.png STEM/img/icv-results.png STEM/img/inception-layer-arch.png STEM/img/inception-layer-effect.png STEM/img/lenet-1989.png STEM/img/lenet-1998.png STEM/img/max-pooling.png STEM/img/stackgan-results.png STEM/img/stackgan.png STEM/img/under-over-fitting.png STEM/img/vgg-arch.png STEM/img/vgg-spec.png STEM/img/word2vec.png

AI/Neural Networks/Activation Functions.md

# ReLu

Rectilinear

- For deep networks
- $y=\max(0,x)$
- CNNs
- Breaks associativity of successive convolutions
- Critical for learning complex functions
- Sometimes a small scalar for negative values
    - Leaky ReLu

![[relu.png]]

# SoftMax

- Output is a per-class vector of likelihoods
- Should be normalised into a probability vector

## AlexNet

$$f(x_i)=\frac{\exp(x_i)}{\sum_{j=1}^{1000}\exp(x_j)}$$
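A minimal NumPy sketch of these activations; function names are illustrative, not from the note:

```python
import numpy as np

def relu(x):
    # y = max(0, x), applied element-wise
    return np.maximum(0, x)

def leaky_relu(x, alpha=0.01):
    # Small scalar (alpha) for negative inputs instead of a hard zero
    return np.where(x > 0, x, alpha * x)

def softmax(x):
    # Subtract the max for numerical stability, then normalise the
    # exponentials into a probability vector that sums to 1
    e = np.exp(x - np.max(x))
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.1])
print(softmax(scores))  # per-class likelihoods, normalised
```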

AI/Neural Networks/CNN/CNN.md

## Before 2010s

- Data hungry
    - Need lots of training data
- Processing power
- Niche
    - No-one cared/knew about CNNs

## After

- ImageNet
    - ~14m images; 1,000 classes in the ILSVRC subset
- GPUs
    - General-purpose GPU computing
    - CUDA
- NIPS/ECCV 2012
    - Double-digit % gain on ImageNet accuracy

# Fully Connected

Dense

- Move from convolutional operations towards a vector output
- Stochastic drop-out
    - Sub-sample channels and only connect some to dense layers (see the sketch below)
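A sketch of stochastic drop-out at training time, in the common "inverted dropout" formulation; the keep probability is an assumption:

```python
import numpy as np

def dropout(x, keep_prob=0.5, training=True):
    # Randomly zero a subset of activations so only some units feed
    # the dense layer; scale survivors so expectations stay equal
    if not training:
        return x
    mask = np.random.rand(*x.shape) < keep_prob
    return x * mask / keep_prob
```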

# As a Descriptor

- Most powerful as a deeply learned feature extractor
- The dense classifier at the end isn't fantastic
    - Use an SVM instead, classifying on the penultimate layer's features (see the sketch below)

![[cnn-descriptor.png]]
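A sketch of the descriptor idea; the model choice (AlexNet) and the 4096-D layer are illustrative assumptions, not from the note:

```python
import torch
import torchvision.models as models

# Pre-trained AlexNet as a fixed feature extractor
net = models.alexnet(weights=models.AlexNet_Weights.DEFAULT)
net.eval()
# Keep everything up to the penultimate activation: drop the final
# 4096 -> 1000 classification layer
net.classifier = torch.nn.Sequential(*list(net.classifier.children())[:-1])

images = torch.randn(4, 3, 224, 224)  # stand-in batch
with torch.no_grad():
    feats = net(images)               # (4, 4096) descriptors
# Fit e.g. sklearn.svm.SVC on feats.numpy() instead of a dense head
```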

# Finetuning

- Observations
    - Most CNNs have similar weights in conv1
    - Most useful CNNs have several conv layers
        - Many weights
        - Lots of training data
    - Training data is hard to get
        - Labelling
- Reuse weights from another network (see the sketch below)
    - Freeze weights in the first 3-5 conv layers
        - Learning rate = 0
    - Randomly initialise remaining layers
    - Continue training from the existing weights

![[fine-tuning-freezing.png]]
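A minimal PyTorch sketch of the freezing recipe; the network (VGG-16), layer counts, and class count are illustrative assumptions:

```python
import torch
import torchvision.models as models

# Assumed starting point: an ImageNet-pre-trained VGG-16
net = models.vgg16(weights=models.VGG16_Weights.DEFAULT)

# Freeze the early conv layers: no gradients are computed, so their
# effective learning rate is 0 (first ~4 convs here; 3-5 is typical)
for layer in list(net.features.children())[:10]:
    for p in layer.parameters():
        p.requires_grad = False

# Randomly re-initialise the task-specific head (10 classes assumed)
net.classifier[-1] = torch.nn.Linear(4096, 10)

# Continue training from the existing weights elsewhere
optimiser = torch.optim.SGD(
    (p for p in net.parameters() if p.requires_grad), lr=1e-3)
```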

# Training

- Validation & training loss
    - Early
        - Under-fitting
        - Training not representative
    - Later
        - Over-fitting
    - Validation loss can help adjust the learning rate
        - Or indicate when to stop training, as sketched below

![[under-over-fitting.png]]
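A sketch of using validation loss as a stopping signal; the patience value and the two helpers are assumptions:

```python
best_val, patience, bad_epochs = float("inf"), 5, 0
max_epochs = 100

for epoch in range(max_epochs):
    train_one_epoch(model)      # assumed helper: one pass over training set
    val_loss = evaluate(model)  # assumed helper: loss on validation set

    if val_loss < best_val:
        best_val, bad_epochs = val_loss, 0
    else:
        bad_epochs += 1         # validation loss rising: over-fitting
        if bad_epochs >= patience:
            break               # stop before over-fitting worsens
```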

AI/Neural Networks/CNN/Convolutional Layer.md

## Design Parameters

- Size of input image
    - 256 x 256 x 1
    - Towards the top end of what's supportable
- Padding
    - Thickness of the border of 0s
- Kernel size
    - 7 x 7 x 1 x n
        - n allows multiple filters per layer
    - Main design decision
    - 12 x 12 or 15 x 15 in early layers
    - Lower in later filters
    - Dataset-dependent
- Stride
    - Interval at which to sample
    - 1
        - Every subsequent pixel
        - Output same size as input
    - 2
        - Every other pixel
        - Output image is half the input size
- Size of computable output
    - 252 x 252 x 1 x n
    - Depends on padding and striding (formula below)
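The computable output size follows from these parameters. For input width $W$, kernel width $K$, padding $P$, and stride $S$:

$$W_{out}=\left\lfloor\frac{W-K+2P}{S}\right\rfloor+1$$

e.g. $W=256$, $K=7$, $S=1$ with $P=1$ gives $\lfloor(256-7+2)/1\rfloor+1=252$, matching the sizes above (which therefore assume one pixel of padding).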

AI/Neural Networks/CNN/Examples.md

# LeNet

- 1990s

![[lenet-1989.png]]

- 1989

![[lenet-1998.png]]

- 1998

# AlexNet

2012

- [[Activation Functions#ReLu|ReLu]]
- Normalisation

![[alexnet.png]]

# VGG

2015

- 16 layers over AlexNet's 8
- Addressing the vanishing gradient problem
    - Xavier
- Similar kernel size throughout
- Gradual filter increase

![[vgg-spec.png]]
![[vgg-arch.png]]

# GoogLeNet

2015

- [[Inception Layer]]s
- Multiple loss functions

![[googlenet.png]]

## [[Inception Layer]]

![[googlenet-inception.png]]

## Auxiliary Loss Functions

- Two other SoftMax blocks
- Help train a really deep network
    - Vanishing gradient problem

![[googlenet-auxilliary-loss.png]]
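A sketch of how the auxiliary losses combine with the main one during training; `main_loss`, `aux1_loss`, and `aux2_loss` are assumed cross-entropy values from the three SoftMax heads, with the 0.3 weighting reported for GoogLeNet:

```python
# Auxiliary heads inject gradient part-way up the network,
# countering vanishing gradients in a very deep model
loss = main_loss + 0.3 * aux1_loss + 0.3 * aux2_loss
loss.backward()  # gradients flow back from all three SoftMax blocks
```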

AI/Neural Networks/CNN/GAN/CycleGAN.md

Cycle-Consistent GAN

- G
    - $x \rightarrow y$
- F
    - $y \rightarrow x$
- Aims to bridge the gap across domains
    - Zebras-horses
    - Audi-BMW
- Learns a bidirectional mapping function
- Transitivity regularises training
    - $x \rightarrow y'$
    - $y' \rightarrow x''$
    - $x == x''$
    - Cycle consistency (see the loss sketch below)
- Requires two datasets
    - One for each domain
    - Not directly paired
        - Unlike edge map $\rightarrow$ bag

![[cyclegan.png]]
![[cyclegan-results.png]]
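A sketch of the cycle-consistency term, assuming generators `G` ($x \rightarrow y$) and `F` ($y \rightarrow x$) and the L1 penalty with $\lambda=10$ used in the CycleGAN paper:

```python
import torch

def cycle_loss(G, F, x, y, lam=10.0):
    # Forward cycle: x -> y' -> x'' should return to x
    x_cycled = F(G(x))
    # Backward cycle: y -> x' -> y'' should return to y
    y_cycled = G(F(y))
    return lam * (torch.mean(torch.abs(x_cycled - x)) +
                  torch.mean(torch.abs(y_cycled - y)))
```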

AI/Neural Networks/CNN/GAN/DC-GAN.md

Deep Convolutional GAN

![[dc-gan.png]]

- Generator
    - FCN
    - Decoder
        - Generates an image from a code
        - Low-dimensional
            - ~100-D
        - Reshape to a tensor
        - Upconv to an image
    - Train using Gaussian random noise for the code
- Discriminator
    - Contractive
    - Cross-entropy loss
    - Conv and leaky [[Activation Functions#ReLu|ReLu]] layers only
    - Normalised output via sigmoid

## Loss

$$D(S,L)=-\sum_iL_i\log(S_i)$$

- $S$
    - $(0.1, 0.9)^T$
    - Score generated by the discriminator
- $L$
    - $(1, 0)^T$
    - One-hot label vector
    - Step 1
        - Depends on choice of real/fake
    - Step 2
        - One-hot fake vector
- $\sum_i$
    - Sum over all images in the mini-batch

| Noise | Image |
| ----- | ----- |
| $z$   | $x$   |

- Generator wants
    - $D(G(z))=1$
        - To fool the discriminator
- Discriminator wants
    - $D(G(z))=0$
        - To correctly catch the generator
- Real data wants
    - $D(x)=1$

$$J^{(D)}=-\frac 1 2 \mathbb E_{x\sim p_{data}}\log D(x)-\frac 1 2 \mathbb E_z\log (1-D(G(z)))$$
$$J^{(G)}=-J^{(D)}$$

- First term for real images
- Second term for fake images
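A PyTorch sketch of these objectives via binary cross-entropy; it assumes `D` ends in a sigmoid returning shape `(N, 1)`, and the generator uses the common non-saturating heuristic rather than the literal $-J^{(D)}$:

```python
import torch
import torch.nn.functional as F

def d_loss(D, G, x, z):
    # First term: real images, discriminator wants D(x) = 1
    real = F.binary_cross_entropy(D(x), torch.ones(x.size(0), 1))
    # Second term: fakes, discriminator wants D(G(z)) = 0;
    # detach so generator weights get no gradient on this step
    fake = F.binary_cross_entropy(D(G(z).detach()),
                                  torch.zeros(z.size(0), 1))
    return 0.5 * (real + fake)

def g_loss(D, G, z):
    # Generator wants D(G(z)) = 1, i.e. to fool the discriminator
    return F.binary_cross_entropy(D(G(z)), torch.ones(z.size(0), 1))
```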

# Mode Collapse

- Generator finds an easy solution
    - Learns one image for most noise values that will fool the discriminator
- Mitigate with a minibatch discriminator
    - Match the G(z) distribution to x

# What is Learnt?

- Encoding texture/patch detail from the training set
    - Similar to FCN
- Reproducing texture at a high level
    - Cues triggered by the code vector
- Input random noise
    - Iteratively improves visual feasibility
- Different to FCN
    - Discriminator is a task-specific classifier
- Difficult to train over diverse footage
    - Mixing concepts doesn't work
    - Single category/class

AI/Neural Networks/CNN/GAN/GAN.md

# Fully Convolutional

- Remove max-pooling
    - Use strided upconv
- Remove FC layers
    - Hurts convergence in non-classification
- Normalisation tricks
    - Batch normalisation (formula below)
        - Batches of 0 mean and variance 1
    - Leaky ReLu
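For reference, batch normalisation standardises each mini-batch and then rescales with learned parameters:

$$\hat{x}=\frac{x-\mu_B}{\sqrt{\sigma_B^2+\epsilon}},\qquad y=\gamma\hat{x}+\beta$$

where $\mu_B$ and $\sigma_B^2$ are the mini-batch mean and variance, and $\gamma$, $\beta$ are learned.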

# Stages

## Generator, G

- Synthesises 'fake' images
    - From noise

## Discriminator, D

- A classifier
- Is the image fake or real?

![[gan-arch.png]]
![[gan-arch2.png]]

![[gan-results.png]]

# Training

![[gan-training-discriminator.png]]
![[gan-training-generator.png]]

# Code Vector Math for Control

![[cvmfc.png]]

- Do activation maximisation (AM) to derive the code for an image

![[code-vector-math-for-control-results.png]]

AI/Neural Networks/CNN/GAN/StackGAN.md

- Feed output from synthesis into an up-res network
- Generate a standard low-res image
    - Feed into a [[cGAN]]

![[stackgan.png]]
![[stackgan-results.png]]

AI/Neural Networks/CNN/GAN/cGAN.md

Conditional GAN

- An unconditional GAN is hard to control with AM
- Condition synthesis on a class label instead
    - Concatenate the unconditional code with a conditioning vector (see the sketch below)
        - Label
- No longer unsupervised
    - Everything labelled
        - Fake images and the dataset
    - **Requires pairing**
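A sketch of the conditioning step: concatenate the noise code with a one-hot label before the generator. The batch size, ~100-D code, and 10 classes are illustrative:

```python
import torch

z = torch.randn(16, 100)                 # unconditional code, ~100-D
labels = torch.randint(0, 10, (16,))     # class labels for the batch
one_hot = torch.nn.functional.one_hot(labels, num_classes=10).float()

g_input = torch.cat([z, one_hot], dim=1) # 110-D conditioned code
# fake = G(g_input)  # generator now synthesises the requested class
```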

![[cgan.png]]
![[cgan-example.png]]

# Image Conditioning Vector

![[icv-pos-neg-examples.png]]
![[icv-results.png]]

# Text Encoding

- word2vec

![[word2vec.png]]

AI/Neural Networks/CNN/Inception Layer.md

- Similar to a band-pass pyramid
- Replaces a fixed-scale window with a couple of different scales (sketch below)
    - Concatenate the results

![[inception-layer-effect.png]]
![[inception-layer-arch.png]]

- 1 x 1
    - Averages over channels
    - Bottleneck layer
        - Reduces computation
            - x 10
    - Shrinks the number of filters
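A minimal sketch of the parallel-scales idea; the channel counts are illustrative, not GoogLeNet's:

```python
import torch
import torch.nn as nn

class MiniInception(nn.Module):
    def __init__(self, c_in):
        super().__init__()
        # 1x1 bottleneck: mixes/shrinks channels, cutting computation
        self.b1 = nn.Conv2d(c_in, 16, kernel_size=1)
        # Two different window scales, padded to keep spatial size
        self.b3 = nn.Conv2d(c_in, 16, kernel_size=3, padding=1)
        self.b5 = nn.Conv2d(c_in, 16, kernel_size=5, padding=2)

    def forward(self, x):
        # Concatenate the per-scale results along the channel axis
        return torch.cat([self.b1(x), self.b3(x), self.b5(x)], dim=1)
```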

AI/Neural Networks/CNN/Max Pooling.md

- Takes the maximum within a window and writes it to the output (sketch below)
- Downsamples the image
- More non-linearity
- Doesn't remove important information
    - The max value is the good bit
- No parameters
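A NumPy sketch of 2 x 2 max pooling with stride 2, the parameter-free case described above:

```python
import numpy as np

def max_pool(img, k=2):
    # Take the maximum within each k x k window (stride = k),
    # halving each spatial dimension for k = 2
    h, w = img.shape[0] // k * k, img.shape[1] // k * k
    blocks = img[:h, :w].reshape(h // k, k, w // k, k)
    return blocks.max(axis=(1, 3))
```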

![[max-pooling.png]]

## Design Parameters

- Size of input image
    - 252 x 252 x 1 x n
- Padding
- Kernel size
    - 3 x 3 x 1
    - Doesn't need to be odd
        - 2 x 2
- Stride
    - Typically n
        - For an n x n kernel size
    - Sometimes 4 x 4 in early layers
        - 16 times less data
        - Rapid downsample
- Size of computable output
    - 250 x 250 x 1 x n
    - Depends on padding and striding

AI/Neural Networks/CNN/Normalisation.md

- Keeps responses sensible layer by layer
- Apply a kernel to the same location of all channels
- Pixels in the window are divided by the sum of pixels within the volume across channels

![[cnn-normalisation.png]]
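This is local response normalisation; AlexNet's form, for reference, divides each response by a sum over neighbouring channels at the same spatial location:

$$b_{x,y}^i=a_{x,y}^i\bigg/\left(k+\alpha\sum_{j=\max(0,\,i-n/2)}^{\min(N-1,\,i+n/2)}\left(a_{x,y}^j\right)^2\right)^\beta$$

where $N$ is the number of channels and $k$, $\alpha$, $\beta$, $n$ are constants.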

AI/Neural Networks/CV/Data Manipulations.md

# Augmentation

- Mimics larger datasets
- Helps with over-fitting

![[data-aug.png]]

# Data Whitening

- Remove the average image of the dataset
    - Or the average RGB pixel from all images

![[data-whitening.png]]
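A NumPy sketch of both manipulations, using a random flip as one example augmentation:

```python
import numpy as np

def augment(img):
    # Mimic a larger dataset: random horizontal flip
    return img[:, ::-1] if np.random.rand() < 0.5 else img

def whiten(images):
    # Remove the average image of the dataset (images: (N, H, W, C))
    mean_image = images.mean(axis=0)
    return images - mean_image
    # ...or subtract the average RGB pixel instead:
    # images - images.mean(axis=(0, 1, 2))
```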

AI/Neural Networks/CV/Datasets.md

# MNIST

- 70,000 handwritten digits
- 28 x 28 images
- 10 classes (0 through 9)
- 99.83% achieved
    - Ciresan et al. 2011

# CIFAR-10

- 60,000 colour images
- 32 x 32 images
- 10 classes
    - Airplane
    - Automobile
    - Bird
    - Cat
    - Deer
    - Dog
    - Frog
    - Horse
    - Ship
    - Truck
- 90.7% achieved
    - Wan et al. 2013

AI/Neural Networks/CV/Filters.md

# Gabor

![[gabor.png]]
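For reference, a 2-D Gabor filter is a Gaussian envelope modulated by a sinusoid:

$$g(x,y)=\exp\left(-\frac{x'^2+\gamma^2y'^2}{2\sigma^2}\right)\cos\left(2\pi\frac{x'}{\lambda}+\psi\right)$$

where $x'=x\cos\theta+y\sin\theta$ and $y'=-x\sin\theta+y\cos\theta$; $\theta$ sets orientation, $\lambda$ the wavelength, $\psi$ the phase, $\sigma$ the envelope width, and $\gamma$ the aspect ratio.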

AI/Neural Networks/CV/Layer Structure.md

![[cnn-cv-layer-arch.png]]

AI/Neural Networks/Weight Init.md

- Randomly
    - Gaussian noise with mean = 0
- Small network
    - Fixed sigma is fine
        - 0.01
    - E.g. 8 layers
        - AlexNet
- Too large
    - Won't converge
- Too small
    - Gradient won't propagate back through many layers

## Xavier System

$$\sigma=\frac 1 {n_{in}+n_{out}}$$

or

$$\sigma=\sqrt{2/n}$$

- Where $n=\text{filter size}\times n_{out}$
- And $n_{in}$ and $n_{out}$ refer to the number of image channels in and out of the layer
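A NumPy sketch of the two recipes above; shapes and function names are illustrative:

```python
import numpy as np

def init_fixed(shape, sigma=0.01):
    # Small-network recipe: zero-mean Gaussian with a fixed sigma
    return np.random.normal(0.0, sigma, shape)

def init_xavier(n_in, n_out, shape):
    # Sigma scaled by channels in/out, per the note's first form
    sigma = 1.0 / (n_in + n_out)
    return np.random.normal(0.0, sigma, shape)

W = init_xavier(64, 128, (3, 3, 64, 128))  # e.g. a 3x3 conv kernel
```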