vault backup: 2023-05-26 18:29:17

Affected files:
.obsidian/graph.json
.obsidian/workspace-mobile.json
.obsidian/workspace.json
STEM/AI/Neural Networks/Activation Functions.md
STEM/AI/Neural Networks/CNN/CNN.md
STEM/AI/Neural Networks/CNN/Convolutional Layer.md
STEM/AI/Neural Networks/CNN/Examples.md
STEM/AI/Neural Networks/CNN/GAN/CycleGAN.md
STEM/AI/Neural Networks/CNN/GAN/DC-GAN.md
STEM/AI/Neural Networks/CNN/GAN/GAN.md
STEM/AI/Neural Networks/CNN/GAN/StackGAN.md
STEM/AI/Neural Networks/CNN/GAN/cGAN.md
STEM/AI/Neural Networks/CNN/Inception Layer.md
STEM/AI/Neural Networks/CNN/Max Pooling.md
STEM/AI/Neural Networks/CNN/Normalisation.md
STEM/AI/Neural Networks/CV/Data Manipulations.md
STEM/AI/Neural Networks/CV/Datasets.md
STEM/AI/Neural Networks/CV/Filters.md
STEM/AI/Neural Networks/CV/Layer Structure.md
STEM/AI/Neural Networks/Weight Init.md
STEM/img/alexnet.png
STEM/img/cgan-example.png
STEM/img/cgan.png
STEM/img/cnn-cv-layer-arch.png
STEM/img/cnn-descriptor.png
STEM/img/cnn-normalisation.png
STEM/img/code-vector-math-for-control-results.png
STEM/img/cvmfc.png
STEM/img/cyclegan-results.png
STEM/img/cyclegan.png
STEM/img/data-aug.png
STEM/img/data-whitening.png
STEM/img/dc-gan.png
STEM/img/fine-tuning-freezing.png
STEM/img/gabor.png
STEM/img/gan-arch.png
STEM/img/gan-arch2.png
STEM/img/gan-results.png
STEM/img/gan-training-discriminator.png
STEM/img/gan-training-generator.png
STEM/img/googlenet-auxilliary-loss.png
STEM/img/googlenet-inception.png
STEM/img/googlenet.png
STEM/img/icv-pos-neg-examples.png
STEM/img/icv-results.png
STEM/img/inception-layer-arch.png
STEM/img/inception-layer-effect.png
STEM/img/lenet-1989.png
STEM/img/lenet-1998.png
STEM/img/max-pooling.png
STEM/img/stackgan-results.png
STEM/img/stackgan.png
STEM/img/under-over-fitting.png
STEM/img/vgg-arch.png
STEM/img/vgg-spec.png
STEM/img/word2vec.png
andy 2023-05-26 18:29:17 +01:00
parent 5a592c8c7c
commit 8f0b604256
53 changed files with 385 additions and 0 deletions

View File: STEM/AI/Neural Networks/Activation Functions.md

@ -52,5 +52,17 @@ $$\frac{dy}{dx}=
Rectified linear
- For deep networks
- $y=\max(0,x)$
- CNNs
- Breaks the associativity of successive convolutions, so they can't collapse into a single linear operation
- Critical for learning complex functions
- Sometimes a small scalar slope for negative inputs
- Leaky ReLU
![[relu.png]]
# SoftMax
- Output is a per-class vector of likelihoods
- Should be normalised into a probability vector
## AlexNet
$$f(x_i)=\frac{\exp(x_i)}{\sum_{j=1}^{1000}\exp(x_j)}$$
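A minimal NumPy sketch of this normalisation (the max-subtraction is a standard numerical-stability trick, not part of the formula above):

```python
import numpy as np

def softmax(x):
    # Shift by the max for numerical stability; the result is unchanged
    e = np.exp(x - np.max(x))
    return e / e.sum()

logits = np.random.randn(1000)        # per-class scores, e.g. AlexNet's 1000 classes
probs = softmax(logits)
assert np.isclose(probs.sum(), 1.0)   # a valid probability vector
```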

View File: STEM/AI/Neural Networks/CNN/CNN.md

@ -0,0 +1,54 @@
## Before 2010s
- Data hungry
- Need lots of training data
- Processing power
- Niche
- No-one cared/knew about CNNs
## After
- ImageNet
- ~14m images; 1,000 classes in the ILSVRC benchmark
- GPUs
- General-purpose GPU computing (GPGPU)
- CUDA
- NIPS/ECCV 2012
- Double digit % gain on ImageNet accuracy
# Fully Connected
Dense
- Move from convolutional operations towards vector output
- Stochastic dropout
- Sub-samples channels, connecting only some to the dense layers
# As a Descriptor
- Most powerful as a deeply learned feature extractor
- The dense classifier at the end isn't fantastic
- Use an SVM to classify the features from the penultimate layer instead
![[cnn-descriptor.png]]
# Finetuning
- Observations
- Most CNNs have similar weights in conv1
- Most useful CNNs have several conv layers
- Many weights
- Lots of training data
- Training data is hard to get
- Labelling
- Reuse weights from another network
- Freeze weights in first 3-5 conv layers
- Learning rate = 0
- Randomly initialise remaining layers
- Continue with existing weights
![[fine-tuning-freezing.png]]
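A hedged PyTorch-style sketch of the recipe above; the model, the number of frozen layers, and the new class count are illustrative assumptions:

```python
import torch.nn as nn
from torchvision import models

# Reuse weights from another network (here: ImageNet-pretrained AlexNet, assumed)
model = models.alexnet(weights="IMAGENET1K_V1")

# Freeze the early conv layers (equivalent to setting their learning rate to 0)
for layer in list(model.features.children())[:8]:   # the first 3 conv layers
    for p in layer.parameters():
        p.requires_grad = False

# Randomly re-initialise the final layer for the new task (10 classes assumed)
model.classifier[6] = nn.Linear(4096, 10)
```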
# Training
- Validation & training loss
- Early on
- Under-fitting
- Training data not representative
- Later on
- Over-fitting
- Validation loss can help adjust the learning rate
- Or indicate when to stop training
![[under-over-fitting.png]]

View File: STEM/AI/Neural Networks/CNN/Convolutional Layer.md

@ -0,0 +1,25 @@
## Design Parameters
- Size of input image
- 256 x 256 x 1
- Towards the top end of what's supportable
- Padding
- Thickness of border 0s
- Kernel size
- 7 x 7 x 1 x n
- n is the number of filters per layer
- Main design decision
- 12 x 12 or 15 x 15 in early layers
- Smaller in later layers
- Dataset-dependent
- Stride
- Interval at which to sample
- 1
- Every pixel
- Output same size as input
- 2
- Every other pixel
- Output is half the input size
- Size of computable output
- 252 x 252 x 1 x n
- Depends on padding and striding
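The standard output-size relation ties these together: for input width $W$, kernel size $K$, padding $P$ and stride $S$,
$$W_{out}=\left\lfloor\frac{W-K+2P}{S}\right\rfloor+1$$
e.g. $W=256$, $K=7$, $S=1$ gives $250$ with no padding, or $252$ with $P=1$.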

View File: STEM/AI/Neural Networks/CNN/Examples.md

@ -0,0 +1,43 @@
# LeNet
- 1990s
![[lenet-1989.png]]
- 1989
![[lenet-1998.png]]
- 1998
# AlexNet
2012
- [[Activation Functions#ReLu|ReLu]]
- Normalisation
![[alexnet.png]]
# VGG
2015
- 16 layers over AlexNet's 8
- Addresses the vanishing gradient problem
- Xavier initialisation
- Similar kernel size throughout
- Gradual filter increase
![[vgg-spec.png]]
![[vgg-arch.png]]
# GoogLeNet
2015
- [[Inception Layer]]s
- Multiple Loss Functions
![[googlenet.png]]
## [[Inception Layer]]
![[googlenet-inception.png]]
## Auxiliary Loss Functions
- Two other SoftMax blocks
- Help train really deep network
- Vanishing gradient problem
![[googlenet-auxilliary-loss.png]]

View File: STEM/AI/Neural Networks/CNN/GAN/CycleGAN.md

@ -0,0 +1,22 @@
Cycle Consistent GAN
- G
- $x \rightarrow y$
- F
- $y \rightarrow x$
- Aims to bridge gap across domains
- Zebras-horses
- Audi-BMW
- Learn bidirectional mapping function
- Transitivity regularises training
- $x \rightarrow y'$
- $y' \rightarrow x''$
- Require $x'' \approx x$
- Cycle consistency
- Requires two datasets
- One for each domain
- Not directly paired
- Unlike edge map $\rightarrow$ bag
![[cyclegan.png]]
![[cyclegan-results.png]]
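A minimal PyTorch sketch of the cycle-consistency term, assuming `G` and `F` are the two generators and an L1 penalty (the usual choice, assumed here):

```python
import torch

def cycle_consistency_loss(G, F, x, y):
    # x -> y' -> x'' and y -> x' -> y'' should both return to their start
    x_cycled = F(G(x))
    y_cycled = G(F(y))
    return (x_cycled - x).abs().mean() + (y_cycled - y).abs().mean()
```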

View File: STEM/AI/Neural Networks/CNN/GAN/DC-GAN.md

@ -0,0 +1,69 @@
Deep Convolutional GAN
![[dc-gan.png]]
- Generator
- FCN
- Decoder
- Generate image from code
- Low-dimensional
- ~100-D
- Reshape to tensor
- Upconv to image
- Train using Gaussian random noise for code
- Discriminator
- Contractive
- Cross-entropy loss
- Conv and leaky [[Activation Functions#ReLu|ReLu]] layers only
- Normalised output via sigmoid
## Loss
$$D(S,L)=-\sum_i L_i \log(S_i)$$
- $S$
- $(0.1, 0.9)^T$
- Score generated by discriminator
- $L$
- $(1, 0)^T$
- One-hot label vector
- Step 1
- Depends on choice of real/fake
- Step 2
- One-hot fake vector
- $\sum_i$
- Sum over the label vector's classes (in practice averaged over the mini-batch)
| Noise | Image |
| ----- | ----- |
| $z$ | $x$ |
- Generator wants
- $D(G(z))=1$
- Wants to fool discriminator
- Discriminator wants
- $D(G(z))=0$
- Wants to correctly catch generator
- Real data wants
- $D(x)=1$
$$J^{(D)}=-\frac{1}{2}\mathbb{E}_{x\sim p_{\text{data}}}\log D(x)-\frac{1}{2}\mathbb{E}_z\log(1-D(G(z)))$$
$$J^{(G)}=-J^{(D)}$$
- First term for real images
- Second term for fake images
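A numeric NumPy sketch of $J^{(D)}$, assuming sigmoid discriminator outputs `d_real` $=D(x)$ and `d_fake` $=D(G(z))$ over a mini-batch (values are illustrative):

```python
import numpy as np

def discriminator_loss(d_real, d_fake):
    # First term: real images, pushed towards D(x) = 1
    # Second term: fake images, pushed towards D(G(z)) = 0
    return -0.5 * np.mean(np.log(d_real)) - 0.5 * np.mean(np.log(1.0 - d_fake))

d_real = np.array([0.9, 0.8])   # discriminator scores on real images
d_fake = np.array([0.2, 0.1])   # discriminator scores on generated images
j_d = discriminator_loss(d_real, d_fake)
j_g = -j_d                      # J^(G) = -J^(D), as above
```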
# Mode Collapse
- Generator finds an easy solution
- Learns one image, for most noise inputs, that fools the discriminator
- Mitigate with a minibatch discriminator
- Match the G(z) distribution to that of x
# What is Learnt?
- Encoding texture/patch detail from training set
- Similar to FCN
- Reproducing texture at high level
- Cues triggered by code vector
- Input random noise
- Iteratively improves visual plausibility
- Different to FCN
- Discriminator is a task-specific classifier
- Difficult to train over diverse footage
- Mixing concepts doesn't work
- Single category/class

View File: STEM/AI/Neural Networks/CNN/GAN/GAN.md

@ -0,0 +1,31 @@
# Fully Convolutional
- Remove max-pooling
- Use strided upconv
- Remove FC layers
- They hurt convergence in non-classification tasks
- Normalisation tricks
- Batch normalisation
- Normalise batches to zero mean, unit variance
- Leaky ReLU
# Stages
## Generator, G
- Synthesise 'fake' images
- From noise
## Discriminator, D
- A classifier
- Is the image fake or real?
![[gan-arch.png]]
![[gan-arch2.png]]
![[gan-results.png]]
# Training
![[gan-training-discriminator.png]]
![[gan-training-generator.png]]
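A hedged sketch of the alternating scheme the two figures describe; every name here (`G`, `D`, the optimisers, the output shapes) is an assumption for illustration:

```python
import torch
import torch.nn.functional as F

def train_step(G, D, opt_D, opt_G, real, z):
    ones = torch.ones(real.size(0), 1)    # assumes D outputs shape (N, 1)
    zeros = torch.zeros(z.size(0), 1)

    # 1) Train discriminator: real -> 1, fake -> 0 (generator held fixed)
    opt_D.zero_grad()
    loss_D = (F.binary_cross_entropy(D(real), ones)
              + F.binary_cross_entropy(D(G(z).detach()), zeros))
    loss_D.backward()
    opt_D.step()

    # 2) Train generator: wants D(G(z)) = 1, i.e. to fool the discriminator
    opt_G.zero_grad()
    loss_G = F.binary_cross_entropy(D(G(z)), torch.ones(z.size(0), 1))
    loss_G.backward()
    opt_G.step()
```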
# Code Vector Math for Control
![[cvmfc.png]]
- Use activation maximisation (AM) to derive the code for an image
![[code-vector-math-for-control-results.png]]

View File: STEM/AI/Neural Networks/CNN/GAN/StackGAN.md

@ -0,0 +1,6 @@
- Generate a standard low-res image first
- Feed the synthesised output into an up-res network
- A [[cGAN]]
![[stackgan.png]]
![[stackgan-results.png]]

View File: STEM/AI/Neural Networks/CNN/GAN/cGAN.md

@ -0,0 +1,23 @@
Conditional GAN
- Unconditional GANs are hard to control with activation maximisation (AM)
- Condition synthesis on a class label
- Concatenate the unconditional code with a conditioning vector (see the sketch below)
- Label
- No longer unsupervised
- Everything labelled
- Fake images and dataset
- **Requires pairing**
![[cgan.png]]
![[cgan-example.png]]
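A minimal sketch of the concatenation step (noise and label sizes are assumptions):

```python
import torch

z = torch.randn(16, 100)                              # unconditional noise codes
labels = torch.eye(10)[torch.randint(0, 10, (16,))]   # one-hot conditioning vectors
z_conditioned = torch.cat([z, labels], dim=1)         # 110-D conditioned generator input
```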
# Image Conditioning Vector
![[icv-pos-neg-examples.png]]
![[icv-results.png]]
# Text Encoding
- word2vec
![[word2vec.png]]

View File: STEM/AI/Neural Networks/CNN/Inception Layer.md

@ -0,0 +1,14 @@
- Similar to a band-pass pyramid
- Uses several fixed-scale window sizes instead of one
- A couple of different scales
- Concatenates the results
![[inception-layer-effect.png]]
![[inception-layer-arch.png]]
- 1 x 1
- Averages over channels
- Bottleneck layer
- Reduces computation
- ~10x
- By shrinking the number of filters
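As a hedged worked example (channel counts assumed, not from the notes): a 5 x 5 convolution mapping 256 channels to 64 costs $5\times5\times256\times64\approx410\text{k}$ multiplies per output pixel; bottlenecking 256 to 32 channels with a 1 x 1 first, then applying the 5 x 5, costs $1\times1\times256\times32+5\times5\times32\times64\approx59\text{k}$, roughly a 7x saving.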

View File: STEM/AI/Neural Networks/CNN/Max Pooling.md

@ -0,0 +1,26 @@
- Takes the maximum within a window and writes the result to the output
- Downsamples image
- More non-linearity
- Doesn't remove important information
- Max value is the good bit
- No parameters
![[max-pooling.png]]
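A minimal NumPy sketch of 2 x 2 max pooling with stride 2 (a single-channel input is assumed):

```python
import numpy as np

def max_pool_2x2(img):
    # Trim to even dimensions, then take the max over each 2x2 window
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

img = np.arange(16.0).reshape(4, 4)
print(max_pool_2x2(img))   # [[ 5.  7.] [13. 15.]]: each entry is its window's max
```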
## Design Parameters
- Size of input image
- 252 x 252 x 1 x n
- Padding
- Kernel size
- 3 x 3 x 1
- Doesn't need to be odd
- 2 x 2
- Stride
- Typically n
- For n x n kernel size
- Sometimes 4 x 4 in early layers
- 16 times less data
- Rapid downsample
- Size of computable output
- 250 x 250 x 1 x n
- Depends on padding and striding

View File: STEM/AI/Neural Networks/CNN/Normalisation.md

@ -0,0 +1,5 @@
- Keeps responses sensible layer by layer
- Apply the kernel to the same location in all channels
- Each pixel in the window is divided by the sum of pixels within the volume across channels
![[cnn-normalisation.png]]
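A minimal NumPy sketch of the cross-channel division described above; the `(H, W, C)` layout and the epsilon guard are assumptions, as the note doesn't pin down an exact scheme:

```python
import numpy as np

def channel_normalise(volume, eps=1e-8):
    # Divide each pixel by the sum over channels at the same location
    return volume / (volume.sum(axis=-1, keepdims=True) + eps)

vol = np.random.rand(8, 8, 3)   # assumed (H, W, C) activation volume
out = channel_normalise(vol)
```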

View File: STEM/AI/Neural Networks/CV/Data Manipulations.md

@ -0,0 +1,11 @@
# Augmentation
- Mimic larger datasets
- Help with over-fitting
![[data-aug.png]]
# Data Whitening
- Subtract the average image of the dataset
- Or the average RGB pixel over all images
![[data-whitening.png]]
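A minimal NumPy sketch of both options, assuming a dataset tensor of shape `(N, H, W, 3)`:

```python
import numpy as np

data = np.random.rand(100, 32, 32, 3)   # stand-in image dataset

mean_image = data.mean(axis=0)          # per-pixel average image of the dataset
centred_a = data - mean_image           # option 1: remove the average image

mean_rgb = data.mean(axis=(0, 1, 2))    # average RGB pixel over all images
centred_b = data - mean_rgb             # option 2: remove the average RGB pixel
```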

View File: STEM/AI/Neural Networks/CV/Datasets.md

@ -0,0 +1,23 @@
# MNIST
- 70,000 handwritten digits (derived from NIST data)
- 28x28 images
- 10 classes (0 through 9)
- Achieved 99.83%
- Ciresan et al. 2011
# CIFAR-10
- 60,000 colour images
- 32x32 images
- 10 classes
- Airplane
- Automobile
- Bird
- Cat
- Deer
- Dog
- Frog
- Horse
- Ship
- Truck
- Achieved 90.7%
- Wan et al. 2013

View File: STEM/AI/Neural Networks/CV/Filters.md

@ -0,0 +1,2 @@
# Gabor
![[gabor.png]]

View File: STEM/AI/Neural Networks/CV/Layer Structure.md

@ -0,0 +1 @@
![[cnn-cv-layer-arch.png]]

View File: STEM/AI/Neural Networks/Weight Init.md

@ -0,0 +1,18 @@
- Randomly
- Gaussian noise with mean = 0
- For a small network a fixed sigma is fine
- 0.01
- E.g. 8 layers
- AlexNet
- Sigma too large
- Won't converge
- Sigma too small
- Gradients won't propagate back through many layers
## Xavier System
$$\sigma=\frac{1}{n_{in}+n_{out}}$$
or
$$\sigma=\sqrt{\frac{2}{n}}$$
- Where $n=\text{filter size}\times n_{out}$
- And $n_{in}$ and $n_{out}$ refer to the number of image channels into and out of the layer
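A NumPy sketch of the schemes above for one conv layer; the kernel size and channel counts are assumed, and "filter size" is read as $k\times k$:

```python
import numpy as np

k, n_in, n_out = 3, 64, 128   # assumed kernel size and channel counts

# Fixed small sigma: fine for a small network (e.g. 8 layers, AlexNet)
w_fixed = np.random.normal(0.0, 0.01, size=(k, k, n_in, n_out))

# Xavier-style sigma from the channel counts, per the first formula
sigma = 1.0 / (n_in + n_out)
w_xavier = np.random.normal(0.0, sigma, size=(k, k, n_in, n_out))

# Alternative form: sigma = sqrt(2/n) with n = filter size * n_out
n = k * k * n_out
w_alt = np.random.normal(0.0, np.sqrt(2.0 / n), size=(k, k, n_in, n_out))
```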
