vault backup: 2023-06-06 11:48:49
Affected files: STEM/AI/Neural Networks/CNN/Examples.md STEM/AI/Neural Networks/CNN/FCN/FCN.md STEM/AI/Neural Networks/CNN/FCN/ResNet.md STEM/AI/Neural Networks/CNN/FCN/Skip Connections.md STEM/AI/Neural Networks/CNN/GAN/DC-GAN.md STEM/AI/Neural Networks/CNN/GAN/GAN.md STEM/AI/Neural Networks/CNN/Interpretation.md STEM/AI/Neural Networks/CNN/UpConv.md STEM/AI/Neural Networks/Deep Learning.md STEM/AI/Neural Networks/MLP/MLP.md STEM/AI/Neural Networks/Properties+Capabilities.md STEM/AI/Neural Networks/SLP/Least Mean Square.md STEM/AI/Neural Networks/SLP/SLP.md STEM/AI/Neural Networks/Transformers/Transformers.md STEM/AI/Properties.md STEM/CS/Language Binding.md STEM/CS/Languages/dotNet.md STEM/Signal Proc/Image/Image Processing.md
parent d7ab8f329a
commit 7bc4dffd8b
STEM/AI/Neural Networks/CNN/Examples.md

@@ -8,7 +8,7 @@
# AlexNet
2012
- [[Activation Functions#ReLu|ReLu]]
- [ReLu](../Activation%20Functions.md#ReLu)
- Normalisation

![alexnet](../../../img/alexnet.png)

@@ -29,13 +29,13 @@
2015
- [Inception Layer](Inception%20Layer.md)s
- Multiple [[Deep Learning#Loss Function|Loss]] Functions
- Multiple [Loss](../Deep%20Learning.md#Loss%20Function) Functions

![googlenet](../../../img/googlenet.png)

## [Inception Layer](Inception%20Layer.md)
![googlenet-inception](../../../img/googlenet-inception.png)

## Auxiliary [[Deep Learning#Loss Function|Loss]] Functions
## Auxiliary [Loss](../Deep%20Learning.md#Loss%20Function) Functions
- Two other SoftMax blocks
- Help train really deep network
- Vanishing gradient problem
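
A minimal sketch of how the auxiliary losses are combined during training (assuming PyTorch; the 0.3 weighting follows the original GoogLeNet setup, and the tensors are illustrative stand-ins for the three SoftMax heads):

```python
import torch
import torch.nn.functional as F

# logits from the final classifier and the two auxiliary SoftMax heads
# (illustrative shapes: batch of 8, 1000 classes)
logits_main = torch.randn(8, 1000, requires_grad=True)
logits_aux1 = torch.randn(8, 1000, requires_grad=True)
logits_aux2 = torch.randn(8, 1000, requires_grad=True)
labels = torch.randint(0, 1000, (8,))

# each head gets its own cross-entropy loss; the auxiliary heads are
# down-weighted (0.3 in the original paper) and used only during training,
# injecting gradient into earlier layers to fight vanishing gradients
loss = (F.cross_entropy(logits_main, labels)
        + 0.3 * F.cross_entropy(logits_aux1, labels)
        + 0.3 * F.cross_entropy(logits_aux2, labels))
loss.backward()
```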

STEM/AI/Neural Networks/CNN/FCN/FCN.md

@@ -1,6 +1,6 @@
Fully [Convolution](../../../../Signal%20Proc/Convolution.md)al Network

[[Convolutional Layer|Convolutional]] and [[UpConv|up-convolutional layers]] with [[Activation Functions#ReLu|ReLu]] but no others (pooling)
[Convolutional](../Convolutional%20Layer.md) and [up-convolutional layers](../UpConv.md) with [ReLu](../../Activation%20Functions.md#ReLu) but no others (pooling)
- All some sort of Encoder-Decoder

Contractive → [UpConv](../UpConv.md)

@@ -19,13 +19,13 @@ Contractive → [UpConv](../UpConv.md)
- Rarely from scratch
- Pre-trained weights
- Replace final layers
- [[MLP|FC]] layers
- [FC](../../MLP/MLP.md) layers
- White-noise initialised
- Add [UpConv](../UpConv.md) layer(s)
- Fine-tune train
- Freeze others
- Annotated GT images
- Can use summed per-pixel log [[Deep Learning#Loss Function|loss]]
- Can use summed per-pixel log [loss](../../Deep%20Learning.md#Loss%20Function)
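
A minimal sketch of that summed per-pixel log loss (assuming PyTorch; the shapes and class count are illustrative):

```python
import torch
import torch.nn.functional as F

# illustrative FCN output: batch of 2, 21 classes, 64x64 per-pixel scores
logits = torch.randn(2, 21, 64, 64, requires_grad=True)
# annotated ground-truth labels: one class index per pixel
gt = torch.randint(0, 21, (2, 64, 64))

# cross_entropy over a 4-D input treats dim 1 as the class dim, so this
# is a per-pixel log loss; reduction='sum' gives the summed form
loss = F.cross_entropy(logits, gt, reduction='sum')
loss.backward()
```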

# Evaluation
![fcn-eval](../../../../img/fcn-eval.png)

STEM/AI/Neural Networks/CNN/FCN/ResNet.md

@@ -12,18 +12,18 @@
# Design

- Skips across pairs of [[Convolutional Layer|conv layers]]
- Skips across pairs of [conv layers](../Convolutional%20Layer.md)
- Elementwise addition
- All layers 3x3 kernels
- Spatial size halves each layer
- Filters double each layer
- [[FCN|Fully convolutional]]
- [Fully convolutional](FCN.md)
- No fc layer
- No [[Max Pooling|pooling]]
- No [pooling](../Max%20Pooling.md)
- Except at end
- No dropout
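
A minimal sketch of one such block (assuming PyTorch; batch norm and the stride-2 stages where spatial size halves and filters double are omitted for brevity):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    """Skip across a pair of 3x3 conv layers, joined by elementwise addition."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x):
        y = F.relu(self.conv1(x))
        y = self.conv2(y)
        return F.relu(x + y)   # identity skip added elementwise, then ReLu

x = torch.randn(1, 64, 56, 56)
print(ResidualBlock(64)(x).shape)   # torch.Size([1, 64, 56, 56])
```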

[[Datasets#ImageNet|ImageNet]] Error:
[ImageNet](../../CV/Datasets.md#ImageNet) Error:
![imagenet-error](../../../../img/imagenet-error.png)

![resnet-arch](../../../../img/resnet-arch.png)

STEM/AI/Neural Networks/CNN/FCN/Skip Connections.md

@@ -1,4 +1,4 @@
- Output of [[Convolutional Layer|conv]], c, layers are added to inputs of [UpConv](../UpConv.md), d, layers
- Output of [conv](../Convolutional%20Layer.md), c, layers are added to inputs of [UpConv](../UpConv.md), d, layers
- Element-wise, not channel appending
- Propagate high frequency information to later layers
- Two types
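
A quick NumPy illustration of the two joining styles (shapes are made up): elementwise addition keeps the channel count, channel appending doubles it.

```python
import numpy as np

# illustrative feature maps: (channels, height, width)
conv_out  = np.random.rand(64, 32, 32)   # output of a conv (c) layer
upconv_in = np.random.rand(64, 32, 32)   # input to the matching UpConv (d) layer

skip_add    = conv_out + upconv_in                      # elementwise addition (shapes must match)
skip_concat = np.concatenate([conv_out, upconv_in], 0)  # channel appending, for contrast
print(skip_add.shape, skip_concat.shape)                # (64, 32, 32) (128, 32, 32)
```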

STEM/AI/Neural Networks/CNN/GAN/DC-GAN.md

@@ -13,10 +13,10 @@ Deep [Convolutional](../../../../Signal%20Proc/Convolution.md) [GAN](GAN.md)
- Discriminator
- Contractive
- Cross-entropy [loss](../../Deep%20Learning.md#Loss%20Function)
- [Conv](../Convolutional%20Layer.md) and leaky [[Activation Functions#ReLu|ReLu]] layers only
- Normalised output via [[Activation Functions#Sigmoid|sigmoid]]
- [Conv](../Convolutional%20Layer.md) and leaky [ReLu](../../Activation%20Functions.md#ReLu) layers only
- Normalised output via [sigmoid](../../Activation%20Functions.md#Sigmoid)

## [[Deep Learning#Loss Function|Loss]]
## [Loss](../../Deep%20Learning.md#Loss%20Function)
$$D(S,L)=-\sum_iL_i\log(S_i)$$
- $S$
- $(0.1, 0.9)^T$
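
A worked instance of that loss with the $(0.1, 0.9)^T$ output above (the one-hot label is an assumption for illustration):

```python
import numpy as np

S = np.array([0.1, 0.9])   # SoftMax output S
L = np.array([0.0, 1.0])   # one-hot label L (illustrative: second class is correct)

D = -np.sum(L * np.log(S))
print(D)                   # ~0.105: small loss, since S puts 0.9 on the labelled class
```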

STEM/AI/Neural Networks/CNN/GAN/GAN.md

@@ -1,12 +1,12 @@
# Fully [Convolution](../../../../Signal%20Proc/Convolution.md)al
- Remove [Max Pooling](../Max%20Pooling.md)
- Use strided [UpConv](../UpConv.md)
- Remove [[MLP|FC]] layers
- Remove [FC](../../MLP/MLP.md) layers
- Hurts convergence in non-classification
- Normalisation tricks
- Batch normalisation
- Batches of 0 mean and variance 1
- Leaky [[Activation Functions#ReLu|ReLu]]
- Leaky [ReLu](../../Activation%20Functions.md#ReLu)
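
A minimal sketch of one up-sampling stage built from those pieces (assuming PyTorch; channel sizes are illustrative): strided UpConv, batch normalisation, leaky ReLu, no pooling or FC layers.

```python
import torch
import torch.nn as nn

# one stage: strided UpConv (transposed conv) -> batch norm -> leaky ReLu
# doubles the spatial size without any pooling or fully connected layers
block = nn.Sequential(
    nn.ConvTranspose2d(128, 64, kernel_size=4, stride=2, padding=1),
    nn.BatchNorm2d(64),      # keep activations near 0 mean, unit variance
    nn.LeakyReLU(0.2),
)

x = torch.randn(1, 128, 16, 16)
print(block(x).shape)        # torch.Size([1, 64, 32, 32])
```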

# Stages
## Generator, G

@@ -27,5 +27,5 @@
# Code Vector Math for Control
![cvmfc](../../../../img/cvmfc.png)
- Do [[Interpretation#Activation Maximisation|AM]] to derive code for an image
- Do [AM](../Interpretation.md#Activation%20Maximisation) to derive code for an image
![code-vector-math-for-control-results](../../../../img/code-vector-math-for-control-results.png)

STEM/AI/Neural Networks/CNN/Interpretation.md

@@ -1,13 +1,13 @@
# Activation Maximisation
- Synthesise an ideal image for a class
- Maximise 1-hot output
- Maximise [[Activation Functions#SoftMax|SoftMax]]
- Maximise [SoftMax](../Activation%20Functions.md#SoftMax)

![am](../../../img/am.png)
- **Use trained network**
- Don't update weights
- [[Architectures|Feedforward]] noise
- [[Back-Propagation|Back-propagate]] [[Deep Learning#Loss Function|loss]]
- [Feedforward](../Architectures.md) noise
- [Back-propagate](../MLP/Back-Propagation.md) [loss](../Deep%20Learning.md#Loss%20Function)
- Don't update weights
- Update image
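
A minimal sketch of that loop (assuming PyTorch; the tiny model is only a stand-in for a trained, frozen classifier, and the class index is arbitrary):

```python
import torch
import torch.nn as nn

# stand-in for a trained classifier (in practice: a frozen, pre-trained network)
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                      nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10))
model.eval()
for p in model.parameters():
    p.requires_grad_(False)                           # don't update weights

x = torch.randn(1, 3, 64, 64, requires_grad=True)     # feedforward noise image
opt = torch.optim.Adam([x], lr=0.05)                   # optimiser holds the image only

target_class = 3                                       # illustrative class index
for _ in range(100):
    opt.zero_grad()
    score = model(x)[0, target_class]                  # score for the target class
    (-score).backward()                                # back-propagate the loss...
    opt.step()                                         # ...and update the image
```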

@@ -17,7 +17,7 @@
- Prone to high frequency noise
- Minimise
- Total variation
- $x^*$ is the best solution to minimise [[Deep Learning#Loss Function|loss]]
- $x^*$ is the best solution to minimise [loss](../Deep%20Learning.md#Loss%20Function)

$$x^*=\text{argmin}_{x\in \mathbb R^{H\times W\times C}}\ell(\phi(x),\phi_0)$$
- Won't work

STEM/AI/Neural Networks/CNN/UpConv.md

@@ -9,7 +9,7 @@
- Could specify kernel
- Or learn
- Can have multiple upconv layers
- Separated by [[Activation Functions#ReLu|ReLu]]
- Separated by [ReLu](../Activation%20Functions.md#ReLu)
- For non-linear up-sampling conv
- Interpolation is linear
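
A minimal sketch (assuming PyTorch; channel counts are illustrative) of a learned UpConv, and of stacking two with ReLu between them for non-linear up-sampling:

```python
import torch
import torch.nn as nn

# learned up-convolution: the kernel weights are trainable parameters
up = nn.ConvTranspose2d(16, 16, kernel_size=2, stride=2)

x = torch.randn(1, 16, 8, 8)
print(up(x).shape)          # torch.Size([1, 16, 16, 16]) - spatial size doubled

# several upconv layers separated by ReLu give a non-linear up-sampler
up2 = nn.Sequential(
    nn.ConvTranspose2d(16, 16, kernel_size=2, stride=2), nn.ReLU(),
    nn.ConvTranspose2d(16, 8, kernel_size=2, stride=2),  nn.ReLU(),
)
print(up2(x).shape)         # torch.Size([1, 8, 32, 32])
```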

STEM/AI/Neural Networks/Deep Learning.md

@@ -8,7 +8,7 @@ Objective Function
![deep-loss-function](../../img/deep-loss-function.png)

- Test accuracy worse than train accuracy = overfitting
- [[MLP|Dense]] = [[MLP|fully connected]]
- [Dense](MLP/MLP.md) = [fully connected](MLP/MLP.md)
- Automates feature engineering

![ml-dl](../../img/ml-dl.png)

STEM/AI/Neural Networks/MLP/MLP.md

@@ -1,15 +1,15 @@
- [[Architectures|Feedforward]]
- [Feedforward](../Architectures.md)
- Single hidden layer can learn any function
- Universal approximation theorem
- Each hidden layer can operate as a different feature extraction layer
- Lots of [[Weight Init|weights]] to learn
- Lots of [weights](../Weight%20Init.md) to learn
- [Back-Propagation](Back-Propagation.md) is supervised

![mlp-arch](../../../img/mlp-arch.png)

# Universal Approximation Theory
A finite [[Architectures|feedforward]] MLP with 1 hidden layer can in theory approximate any mathematical function
- In practice not trainable with [[Back-Propagation|BP]]
A finite [feedforward](../Architectures.md) MLP with 1 hidden layer can in theory approximate any mathematical function
- In practice not trainable with [BP](Back-Propagation.md)

![activation-function](../../../img/activation-function.png)
![mlp-arch-diagram](../../../img/mlp-arch-diagram.png)

@@ -19,4 +19,4 @@ A finite [[Architectures|feedforward]] MLP with 1 hidden layer can in theory app
![tlu](../../../img/tlu.png)
- $o_1$ to $o_4$ must all be one to overcome -3.5 bias and force output to 1
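
A worked check of that threshold, assuming unit weights on each $o_i$ (which the $-3.5$ bias implies):

$$1+1+1+1-3.5=0.5>0 \Rightarrow \text{output } 1$$
$$1+1+1+0-3.5=-0.5<0 \Rightarrow \text{output } 0$$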

![mlp-non-linear-decision](../../../img/mlp-non-linear-decision.png)
- Can generate a non-linear [[Decision Boundary|decision boundary]]
- Can generate a non-linear [decision boundary](Decision%20Boundary.md)

STEM/AI/Neural Networks/Properties+Capabilities.md

@@ -45,7 +45,7 @@
- Confidence value

# Contextual Information
- [[Neural Networks#Knowledge|Knowledge]] represented by structure and activation weight
- [Knowledge](Neural%20Networks.md#Knowledge) represented by structure and activation weight
- Any neuron can be affected by global activity
- Contextual information handled naturally

STEM/AI/Neural Networks/SLP/Least Mean Square.md

@@ -20,7 +20,7 @@ $$\frac{\partial \mathfrak{E}(w)}{\partial w(n)}=-x(n)\cdot e(n)$$
$$\hat{g}(n)=-x(n)\cdot e(n)$$
$$\hat{w}(n+1)=\hat{w}(n)+\eta \cdot x(n) \cdot e(n)$$

- Above is a [[Architectures|feedforward]] loop around weight vector, $\hat{w}$
- Above is a [feedforward](../Architectures.md) loop around weight vector, $\hat{w}$
- Behaves like low-pass filter
- Pass low frequency components of error signal
- Average time constant of filtering action inversely proportional to learning-rate
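
A minimal sketch of that update rule (assuming NumPy; the desired response is an arbitrary linear target):

```python
import numpy as np

eta = 0.01                      # learning rate
w = np.zeros(3)                 # weight estimate w_hat(n)
rng = np.random.default_rng(0)

for n in range(2000):
    x = rng.normal(size=3)      # input vector x(n)
    d = 2.0 * x[0] - x[1]       # desired response (illustrative linear target)
    e = d - w @ x               # error signal e(n)
    w = w + eta * x * e         # w_hat(n+1) = w_hat(n) + eta * x(n) * e(n)

print(w)                        # approaches [2, -1, 0]; a smaller eta averages more
                                # of the error noise out but reacts more slowly
                                # (longer time constant of the low-pass behaviour)
```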

STEM/AI/Neural Networks/SLP/SLP.md

@@ -4,4 +4,4 @@ $$=w^T(n)x(n)$$
![slp-hyperplane](../../../img/slp-hyperplane.png)
Perceptron learning is performed for a finite number of iterations and then stops

[[Least Mean Square|LMS]] is continuous learning that doesn't stop
[LMS](Least%20Mean%20Square.md) is continuous learning that doesn't stop

STEM/AI/Neural Networks/Transformers/Transformers.md

@@ -1,4 +1,4 @@
- [[Attention|Self-attention]]
- [Self-attention](Attention.md)
- Weighting significance of parts of the input
- Including recursive output
- Similar to [RNN](../RNN/RNN.md)s
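
A minimal sketch of that weighting (assuming NumPy; a single attention head with random matrices standing in for the learned projections):

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# 4 input tokens, model width 8; Q, K, V would normally come from learned projections
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
Q, K, V = X @ Wq, X @ Wk, X @ Wv

# each output position is a weighted sum over every input position:
# the softmax row is the "significance" weighting of each part of the input
weights = softmax(Q @ K.T / np.sqrt(K.shape[-1]))
out = weights @ V
print(weights.shape, out.shape)   # (4, 4) (4, 8)
```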

@@ -35,5 +35,5 @@
- Uses incorporated textual information to produce output
- Has attention to draw information from output of previous decoders before drawing from encoders
- Both use [Attention](Attention.md)
- Both use [[MLP|dense]] layers for additional processing of outputs
- Both use [dense](../MLP/MLP.md) layers for additional processing of outputs
- Contain residual connections & layer norm steps

STEM/AI/Properties.md

@@ -1,7 +1,7 @@
# Three Key Components

1. Representation
- Declarative & Procedural [[Neural Networks#Knowledge|knowledge]]
- Declarative & Procedural [knowledge](Neural%20Networks/Neural%20Networks.md#Knowledge)
- Typically human-readable symbols
2. Reasoning
- Ability to solve problems

@@ -36,13 +36,13 @@ Explanation-based learning uses both
## Level of Explanation
- Classical has emphasis on building symbolic representations
- Models cognition as sequential processing of symbolic representations
- [[Properties+Capabilities|Neural nets]] emphasis on parallel distributed processing models
- [Neural nets](Neural%20Networks/Properties+Capabilities.md) emphasis on parallel distributed processing models
- Models assume information processing takes place through interactions of large numbers of neurons

## Processing style
- Classical processing is sequential
- Von Neumann Machine
- [[Properties+Capabilities|Neural nets]] use parallelism everywhere
- [Neural nets](Neural%20Networks/Properties+Capabilities.md) use parallelism everywhere
- Source of flexibility
- Robust

@@ -50,7 +50,7 @@ Explanation-based learning uses both
- Classical emphasises language of thought
- Symbolic representation has quasi-linguistic structure
- New symbols created from compositionality
- [[Properties+Capabilities|Neural nets]] have problem describing nature and structure of representation
- [Neural nets](Neural%20Networks/Properties+Capabilities.md) have problem describing nature and structure of representation

Symbolic AI is the formal manipulation of a language of algorithms and data representations in a top-down fashion

STEM/CS/Language Binding.md

@@ -24,5 +24,5 @@
- Adobe Flash Player
- Tamarin
- JVM
- [[Compilers#LLVM|LLVM]]
- [LLVM](Compilers.md#LLVM)
- Silverlight

STEM/CS/Languages/dotNet.md

@@ -10,7 +10,7 @@
- JIT managed code into machine instructions
- Execution engine
- VM
- [[Language Binding#Virtual Machines]]
- [Language Binding](../Language%20Binding.md#Virtual%20Machines)
- Services
- Memory management
- Type safety

STEM/Signal Proc/Image/Image Processing.md

@@ -1 +1 @@
[[Convolution#Discrete]]
[Convolution](../Convolution.md#Discrete)