vault backup: 2023-06-06 11:48:49

Affected files:
- STEM/AI/Neural Networks/CNN/Examples.md
- STEM/AI/Neural Networks/CNN/FCN/FCN.md
- STEM/AI/Neural Networks/CNN/FCN/ResNet.md
- STEM/AI/Neural Networks/CNN/FCN/Skip Connections.md
- STEM/AI/Neural Networks/CNN/GAN/DC-GAN.md
- STEM/AI/Neural Networks/CNN/GAN/GAN.md
- STEM/AI/Neural Networks/CNN/Interpretation.md
- STEM/AI/Neural Networks/CNN/UpConv.md
- STEM/AI/Neural Networks/Deep Learning.md
- STEM/AI/Neural Networks/MLP/MLP.md
- STEM/AI/Neural Networks/Properties+Capabilities.md
- STEM/AI/Neural Networks/SLP/Least Mean Square.md
- STEM/AI/Neural Networks/SLP/SLP.md
- STEM/AI/Neural Networks/Transformers/Transformers.md
- STEM/AI/Properties.md
- STEM/CS/Language Binding.md
- STEM/CS/Languages/dotNet.md
- STEM/Signal Proc/Image/Image Processing.md

parent d7ab8f329a
commit 7bc4dffd8b
@@ -8,7 +8,7 @@
 # AlexNet
 2012

-- [[Activation Functions#ReLu|ReLu]]
+- [ReLu](../Activation%20Functions.md#ReLu)
 - Normalisation

 ![alexnet](../../../img/alexnet.png)
@@ -29,13 +29,13 @@
 2015

 - [Inception Layer](Inception%20Layer.md)s
-- Multiple [[Deep Learning#Loss Function|Loss]] Functions
+- Multiple [Loss](../Deep%20Learning.md#Loss%20Function) Functions

 ![googlenet](../../../img/googlenet.png)

 ## [Inception Layer](Inception%20Layer.md)
 ![googlenet-inception](../../../img/googlenet-inception.png)
-## Auxiliary [[Deep Learning#Loss Function|Loss]] Functions
+## Auxiliary [Loss](../Deep%20Learning.md#Loss%20Function) Functions
 - Two other SoftMax blocks
 - Help train really deep network
 - Vanishing gradient problem
@@ -1,6 +1,6 @@
 Fully [Convolution](../../../../Signal%20Proc/Convolution.md)al Network

-[[Convolutional Layer|Convolutional]] and [[UpConv|up-convolutional layers]] with [[Activation Functions#ReLu|ReLu]] but no others (pooling)
+[Convolutional](../Convolutional%20Layer.md) and [up-convolutional layers](../UpConv.md) with [ReLu](../../Activation%20Functions.md#ReLu) but no others (pooling)
 - All some sort of Encoder-Decoder

 Contractive → [UpConv](../UpConv.md)
@@ -19,13 +19,13 @@ Contractive → [UpConv](../UpConv.md)
 - Rarely from scratch
 - Pre-trained weights
 - Replace final layers
-- [[MLP|FC]] layers
+- [FC](../../MLP/MLP.md) layers
 - White-noise initialised
 - Add [UpConv](../UpConv.md) layer(s)
 - Fine-tune train
 - Freeze others
 - Annotated GT images
-- Can use summed per-pixel log [[Deep Learning#Loss Function|loss]]
+- Can use summed per-pixel log [loss](../../Deep%20Learning.md#Loss%20Function)

 # Evaluation
 ![fcn-eval](../../../../img/fcn-eval.png)
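A rough PyTorch sketch of the fine-tuning recipe in the hunk above; the VGG-16 backbone, the 21-class head, and the layer sizes are assumptions for illustration, not taken from the note:

```python
import torch.nn as nn
from torchvision import models

# Assumed backbone: VGG-16 with pre-trained ImageNet weights
backbone = models.vgg16(weights="IMAGENET1K_V1")
for p in backbone.features.parameters():
    p.requires_grad = False  # freeze the pre-trained conv layers

# Replace the final FC layers with a freshly initialised conv head
# (PyTorch's default random init standing in for white-noise init),
# then add an UpConv layer to restore spatial resolution
n_classes = 21  # hypothetical class count
head = nn.Sequential(
    nn.Conv2d(512, n_classes, kernel_size=1),
    nn.ConvTranspose2d(n_classes, n_classes, kernel_size=64, stride=32),
)
```

Fine-tune training then updates only `head` against the annotated GT images, with the frozen backbone left untouched.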
@@ -12,18 +12,18 @@

 # Design

-- Skips across pairs of [[Convolutional Layer|conv layers]]
+- Skips across pairs of [conv layers](../Convolutional%20Layer.md)
 - Elementwise addition
 - All layers use 3x3 kernels
 - Spatial size halves each layer
 - Filter count doubles each layer
-- [[FCN|Fully convolutional]]
+- [Fully convolutional](FCN.md)
 - No fc layer
-- No [[Max Pooling|pooling]]
+- No [pooling](../Max%20Pooling.md)
 - Except at end
 - No dropout

-[[Datasets#ImageNet|ImageNet]] Error:
+[ImageNet](../../CV/Datasets.md#ImageNet) Error:
 ![imagenet-error](../../../../img/imagenet-error.png)

 ![resnet-arch](../../../../img/resnet-arch.png)
@@ -1,4 +1,4 @@
-- Outputs of [[Convolutional Layer|conv]] layers, c, are added to inputs of [UpConv](../UpConv.md) layers, d
+- Outputs of [conv](../Convolutional%20Layer.md) layers, c, are added to inputs of [UpConv](../UpConv.md) layers, d
 - Element-wise, not channel appending
 - Propagate high frequency information to later layers
 - Two types
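A minimal sketch of the two fusion types contrasted above, assuming matching tensor shapes:

```python
import torch

c_out = torch.randn(1, 64, 56, 56)  # output of a conv (c) layer
d_in = torch.randn(1, 64, 56, 56)   # input to an up-conv (d) layer

fused_add = d_in + c_out                     # element-wise addition (shape unchanged)
fused_cat = torch.cat([d_in, c_out], dim=1)  # channel appending, shown for contrast
```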
@@ -13,10 +13,10 @@ Deep [Convolutional](../../../../Signal%20Proc/Convolution.md) [GAN](GAN.md)
 - Discriminator
 - Contractive
 - Cross-entropy [loss](../../Deep%20Learning.md#Loss%20Function)
-- [Conv](../Convolutional%20Layer.md) and leaky [[Activation Functions#ReLu|ReLu]] layers only
-- Normalised output via [[Activation Functions#Sigmoid|sigmoid]]
+- [Conv](../Convolutional%20Layer.md) and leaky [ReLu](../../Activation%20Functions.md#ReLu) layers only
+- Normalised output via [sigmoid](../../Activation%20Functions.md#Sigmoid)

-## [[Deep Learning#Loss Function|Loss]]
+## [Loss](../../Deep%20Learning.md#Loss%20Function)
 $$D(S,L)=-\sum_i L_i \log(S_i)$$
 - $S$
 - $(0.1, 0.9)^T$
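A quick worked example of this loss, assuming the one-hot label $L=(0,1)^T$ to pair with the $S=(0.1,0.9)^T$ above:

$$D(S,L)=-\big(0\cdot\log 0.1+1\cdot\log 0.9\big)=-\log 0.9\approx 0.105$$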
@@ -1,12 +1,12 @@
 # Fully [Convolution](../../../../Signal%20Proc/Convolution.md)al
 - Remove [Max Pooling](../Max%20Pooling.md)
 - Use strided [UpConv](../UpConv.md)
-- Remove [[MLP|FC]] layers
+- Remove [FC](../../MLP/MLP.md) layers
 - Hurts convergence in non-classification
 - Normalisation tricks
 - Batch normalisation
 - Batches of 0 mean and variance 1
-- Leaky [[Activation Functions#ReLu|ReLu]]
+- Leaky [ReLu](../../Activation%20Functions.md#ReLu)

 # Stages
 ## Generator, G
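A minimal NumPy sketch of the batch-normalisation idea above (full batch norm also learns a per-feature scale and shift, omitted here):

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    """Normalise each feature over a batch (rows = samples) to mean 0, variance 1."""
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    return (x - mu) / np.sqrt(var + eps)  # eps guards against division by zero
```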
@@ -27,5 +27,5 @@

 # Code Vector Math for Control
 ![cvmfc](../../../../img/cvmfc.png)
-- Do [[Interpretation#Activation Maximisation|AM]] to derive code for an image
+- Do [AM](../Interpretation.md#Activation%20Maximisation) to derive code for an image
 ![code-vector-math-for-control-results](../../../../img/code-vector-math-for-control-results.png)
@@ -1,13 +1,13 @@
 # Activation Maximisation
 - Synthesise an ideal image for a class
 - Maximise 1-hot output
-- Maximise [[Activation Functions#SoftMax|SoftMax]]
+- Maximise [SoftMax](../Activation%20Functions.md#SoftMax)

 ![am](../../../img/am.png)
 - **Use trained network**
 - Don't update weights
-- [[Architectures|Feedforward]] noise
-- [[Back-Propagation|Back-propagate]] [[Deep Learning#Loss Function|loss]]
+- [Feedforward](../Architectures.md) noise
+- [Back-propagate](../MLP/Back-Propagation.md) [loss](../Deep%20Learning.md#Loss%20Function)
 - Don't update weights
 - Update image

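A minimal PyTorch sketch of this loop; the classifier `model` and the 224×224 RGB input size are assumptions for illustration:

```python
import torch

def activation_maximisation(model, class_idx, steps=200, lr=0.1):
    model.eval()
    for p in model.parameters():
        p.requires_grad_(False)  # use the trained network, don't update weights
    x = torch.randn(1, 3, 224, 224, requires_grad=True)  # feedforward noise
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        score = model(x)[0, class_idx]  # class output to maximise
        (-score).backward()             # back-propagate loss to the input
        opt.step()                      # update the image, not the weights
    return x.detach()
```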
@@ -17,7 +17,7 @@
 - Prone to high frequency noise
 - Minimise
 - Total variation
-- $x^*$ is the best solution to minimise [[Deep Learning#Loss Function|loss]]
+- $x^*$ is the best solution to minimise [loss](../Deep%20Learning.md#Loss%20Function)

 $$x^*=\text{argmin}_{x\in \mathbb R^{H\times W\times C}}\ell(\phi(x),\phi_0)$$
 - Won't work
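With a total-variation regulariser added to tame the high-frequency noise (the weighting $\lambda$ and the exact form are assumptions beyond the note), the objective becomes:

$$x^*=\text{argmin}_{x\in \mathbb R^{H\times W\times C}}\ \ell(\phi(x),\phi_0)+\lambda\,\mathcal R_{TV}(x)$$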
@@ -9,7 +9,7 @@
 - Could specify kernel
 - Or learn
 - Can have multiple upconv layers
-- Separated by [[Activation Functions#ReLu|ReLu]]
+- Separated by [ReLu](../Activation%20Functions.md#ReLu)
 - For non-linear up-sampling conv
 - Interpolation is linear

@@ -8,7 +8,7 @@ Objective Function
 ![deep-loss-function](../../img/deep-loss-function.png)

 - Test accuracy worse than train accuracy = overfitting
-- [[MLP|Dense]] = [[MLP|fully connected]]
+- [Dense](MLP/MLP.md) = [fully connected](MLP/MLP.md)
 - Automates feature engineering

 ![ml-dl](../../img/ml-dl.png)
@@ -1,15 +1,15 @@
-- [[Architectures|Feedforward]]
+- [Feedforward](../Architectures.md)
 - Single hidden layer can learn any function
 - Universal approximation theorem
 - Each hidden layer can operate as a different feature extraction layer
-- Lots of [[Weight Init|weights]] to learn
+- Lots of [weights](../Weight%20Init.md) to learn
 - [Back-Propagation](Back-Propagation.md) is supervised

 ![mlp-arch](../../../img/mlp-arch.png)

 # Universal Approximation Theory
-A finite [[Architectures|feedforward]] MLP with 1 hidden layer can in theory approximate any mathematical function
-- In practice not trainable with [[Back-Propagation|BP]]
+A finite [feedforward](../Architectures.md) MLP with 1 hidden layer can in theory approximate any mathematical function
+- In practice not trainable with [BP](Back-Propagation.md)

 ![activation-function](../../../img/activation-function.png)
 ![mlp-arch-diagram](../../../img/mlp-arch-diagram.png)
@@ -19,4 +19,4 @@ A finite [[Architectures|feedforward]] MLP with 1 hidden layer can in theory app
 ![tlu](../../../img/tlu.png)
 - $o_1$ to $o_4$ must all be one to overcome -3.5 bias and force output to 1
 ![mlp-non-linear-decision](../../../img/mlp-non-linear-decision.png)
-- Can generate a non-linear [[Decision Boundary|decision boundary]]
+- Can generate a non-linear [decision boundary](Decision%20Boundary.md)
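As a quick check, assuming unit weights on the four inputs (an assumption the $-3.5$ bias suggests):

$$\sum_{i=1}^4 o_i-3.5>0 \iff o_1=o_2=o_3=o_4=1,\quad\text{since } 4-3.5=0.5>0 \text{ but } 3-3.5=-0.5<0$$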
@@ -45,7 +45,7 @@
 - Confidence value

 # Contextual Information
-- [[Neural Networks#Knowledge|Knowledge]] represented by structure and activation weight
+- [Knowledge](Neural%20Networks.md#Knowledge) represented by structure and activation weight
 - Any neuron can be affected by global activity
 - Contextual information handled naturally

@@ -20,7 +20,7 @@ $$\frac{\partial \mathfrak{E}(w)}{\partial w(n)}=-x(n)\cdot e(n)$$
 $$\hat{g}(n)=-x(n)\cdot e(n)$$
 $$\hat{w}(n+1)=\hat{w}(n)+\eta \cdot x(n) \cdot e(n)$$

-- Above is a [[Architectures|feedforward]] loop around weight vector, $\hat{w}$
+- Above is a [feedforward](../Architectures.md) loop around weight vector, $\hat{w}$
 - Behaves like low-pass filter
 - Pass low frequency components of error signal
 - Average time constant of filtering action inversely proportional to learning-rate
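A minimal NumPy sketch of one LMS step; the desired response $d(n)$, against which the error $e(n)=d(n)-w^T(n)x(n)$ is formed, is the standard LMS setup rather than something stated in this hunk:

```python
import numpy as np

def lms_step(w_hat, x, d, eta=0.01):
    """One LMS update: w(n+1) = w(n) + eta * x(n) * e(n)."""
    e = d - w_hat @ x          # error signal e(n) against desired response d(n)
    return w_hat + eta * x * e, e

# e.g. w, err = lms_step(np.zeros(3), np.array([1.0, 0.5, -0.2]), d=0.4)
```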
@@ -4,4 +4,4 @@ $$=w^T(n)x(n)$$
 ![slp-hyperplane](../../../img/slp-hyperplane.png)
 Perceptron learning is performed for a finite number of iterations and then stops

-[[Least Mean Square|LMS]] is continuous learning that doesn't stop
+[LMS](Least%20Mean%20Square.md) is continuous learning that doesn't stop
@@ -1,4 +1,4 @@
-- [[Attention|Self-attention]]
+- [Self-attention](Attention.md)
 - Weighting significance of parts of the input
 - Including recursive output
 - Similar to [RNN](../RNN/RNN.md)s
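The standard scaled dot-product form of this weighting, from the original Transformer paper:

$$\text{Attention}(Q,K,V)=\text{softmax}\!\left(\frac{QK^T}{\sqrt{d_k}}\right)V$$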
@@ -35,5 +35,5 @@
 - Uses incorporated textual information to produce output
 - Has attention to draw information from output of previous decoders before drawing from encoders
 - Both use [Attention](Attention.md)
-- Both use [[MLP|dense]] layers for additional processing of outputs
+- Both use [dense](../MLP/MLP.md) layers for additional processing of outputs
 - Contain residual connections & layer norm steps
@@ -1,7 +1,7 @@
 # Three Key Components

 1. Representation
-    - Declarative & Procedural [[Neural Networks#Knowledge|knowledge]]
+    - Declarative & Procedural [knowledge](Neural%20Networks/Neural%20Networks.md#Knowledge)
     - Typically human-readable symbols
 2. Reasoning
     - Ability to solve problems
@@ -36,13 +36,13 @@ Explanation-based learning uses both
 ## Level of Explanation
 - Classical has emphasis on building symbolic representations
 - Models cognition as sequential processing of symbolic representations
-- [[Properties+Capabilities|Neural nets]] emphasise parallel distributed processing models
+- [Neural nets](Neural%20Networks/Properties+Capabilities.md) emphasise parallel distributed processing models
 - Models assume information processing takes place through interactions of large numbers of neurons

 ## Processing Style
 - Classical processing is sequential
 - Von Neumann Machine
-- [[Properties+Capabilities|Neural nets]] use parallelism everywhere
+- [Neural nets](Neural%20Networks/Properties+Capabilities.md) use parallelism everywhere
 - Source of flexibility
 - Robust

@@ -50,7 +50,7 @@ Explanation-based learning uses both
 - Classical emphasises language of thought
 - Symbolic representation has quasi-linguistic structure
 - New symbols created from compositionality
-- [[Properties+Capabilities|Neural nets]] have problems describing the nature and structure of representation
+- [Neural nets](Neural%20Networks/Properties+Capabilities.md) have problems describing the nature and structure of representation

 Symbolic AI is the formal manipulation of a language of algorithms and data representations in a top-down fashion

@@ -24,5 +24,5 @@
 - Adobe Flash Player
 - Tamarin
 - JVM
-- [[Compilers#LLVM|LLVM]]
+- [LLVM](Compilers.md#LLVM)
 - Silverlight
@@ -10,7 +10,7 @@
 - JIT managed code into machine instructions
 - Execution engine
 - VM
-- [[Language Binding#Virtual Machines]]
+- [Language Binding](../Language%20Binding.md#Virtual%20Machines)
 - Services
 - Memory management
 - Type safety
@@ -1 +1 @@
-[[Convolution#Discrete]]
+[Convolution](../Convolution.md#Discrete)