vault backup: 2023-06-05 17:01:29
Affected files: Money/Assets/Financial Instruments.md Money/Assets/Security.md Money/Markets/Markets.md Politcs/Now.md STEM/AI/Neural Networks/CNN/Examples.md STEM/AI/Neural Networks/CNN/FCN/FCN.md STEM/AI/Neural Networks/CNN/FCN/FlowNet.md STEM/AI/Neural Networks/CNN/FCN/Highway Networks.md STEM/AI/Neural Networks/CNN/FCN/ResNet.md STEM/AI/Neural Networks/CNN/FCN/Skip Connections.md STEM/AI/Neural Networks/CNN/FCN/Super-Resolution.md STEM/AI/Neural Networks/CNN/GAN/DC-GAN.md STEM/AI/Neural Networks/CNN/GAN/GAN.md STEM/AI/Neural Networks/CNN/GAN/StackGAN.md STEM/AI/Neural Networks/CNN/Inception Layer.md STEM/AI/Neural Networks/CNN/Interpretation.md STEM/AI/Neural Networks/CNN/Max Pooling.md STEM/AI/Neural Networks/CNN/Normalisation.md STEM/AI/Neural Networks/CNN/UpConv.md STEM/AI/Neural Networks/CV/Layer Structure.md STEM/AI/Neural Networks/MLP/MLP.md STEM/AI/Neural Networks/Neural Networks.md STEM/AI/Neural Networks/RNN/LSTM.md STEM/AI/Neural Networks/RNN/RNN.md STEM/AI/Neural Networks/RNN/VQA.md STEM/AI/Neural Networks/SLP/Least Mean Square.md STEM/AI/Neural Networks/SLP/Perceptron Convergence.md STEM/AI/Neural Networks/SLP/SLP.md STEM/AI/Neural Networks/Transformers/LLM.md STEM/AI/Neural Networks/Transformers/Transformers.md STEM/AI/Properties.md STEM/CS/Language Binding.md STEM/Light.md STEM/Maths/Tensor.md STEM/Quantum/Orbitals.md STEM/Quantum/Schrödinger.md STEM/Quantum/Standard Model.md STEM/Quantum/Wave Function.md Tattoo/Music.md Tattoo/Plans.md Tattoo/Sources.md
parent 40f5eca82e
commit d7ab8f329a
@@ -28,7 +28,7 @@
 # GoogLeNet
 2015
 
-- [[Inception Layer]]s
+- [Inception Layer](Inception%20Layer.md)s
 - Multiple [[Deep Learning#Loss Function|Loss]] Functions
 
 ![googlenet](../../../img/googlenet.png)
@@ -1,4 +1,4 @@
-Fully [[Convolution]]al Network
+Fully [Convolution](../../../../Signal%20Proc/Convolution.md)al Network
 
 [[Convolutional Layer|Convolutional]] and [[UpConv|up-convolutional layers]] with [[Activation Functions#ReLu|ReLu]] but no others (pooling)
 - All some sort of Encoder-Decoder
@@ -9,12 +9,11 @@ Contractive → [UpConv](../UpConv.md)
 - For visual output
 - Previously image $\rightarrow$ vector
 - Additional layers to up-sample representation to an image
-- Up-[[convolution]]al
-- De-[[convolution]]al
+- Up-[convolution](../../../../Signal%20Proc/Convolution.md)al
+- De-[convolution](../../../../Signal%20Proc/Convolution.md)al
 
-![[fcn-uses.png]]
-
-![[fcn-arch.png]]
+![fcn-uses](../../../../img/fcn-uses.png)
+![fcn-arch](../../../../img/fcn-arch.png)
 
 # Training
 - Rarely from scratch
@@ -22,7 +21,7 @@ Contractive → [UpConv](../UpConv.md)
 - Replace final layers
 - [[MLP|FC]] layers
 - White-noise initialised
-- Add [[upconv]] layer(s)
+- Add [UpConv](../UpConv.md) layer(s)
 - Fine-tune train
 - Freeze others
 - Annotated GT images
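The FCN training recipe in the hunk above (keep a pretrained encoder, swap the classifier for up-convolutions, freeze, fine-tune) looks roughly like this as a minimal PyTorch sketch; the backbone choice, channel sizes, and class count are illustrative assumptions, not the note's exact setup:

```python
import torch
import torch.nn as nn
from torchvision import models

# Pretrained encoder: drop the final avgpool + FC layers
backbone = models.resnet18(weights="IMAGENET1K_V1")
encoder = nn.Sequential(*list(backbone.children())[:-2])

# Freeze the pretrained layers
for p in encoder.parameters():
    p.requires_grad = False

# New (noise/default initialised) up-convolutional head
head = nn.Sequential(
    nn.ConvTranspose2d(512, 256, kernel_size=4, stride=2, padding=1),
    nn.ReLU(),
    nn.ConvTranspose2d(256, 21, kernel_size=4, stride=2, padding=1),  # 21 = assumed class count
)

model = nn.Sequential(encoder, head)
# Fine-tune only the new head against annotated GT maps
optimiser = torch.optim.Adam(head.parameters(), lr=1e-4)
```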
@@ -7,16 +7,16 @@ Optical Flow
 
 ![flownet](../../../../img/flownet.png)
 
-# [[Skip Connections]]
+# [Skip Connections](Skip%20Connections.md)
 - Further through the network information is condensed
 - Less high frequency information
-- Link encoder layers to [[upconv]] layers
+- Link encoder layers to [UpConv](../UpConv.md) layers
 - Append activation maps from encoder to decoder
 
 # Encode
 ![flownet-encode](../../../../img/flownet-encode.png)
 
-# [[Upconv]]
+# [UpConv](../UpConv.md)
 ![flownet-upconv](../../../../img/flownet-upconv.png)
 
 # Training
@@ -1,9 +1,9 @@
-- [[Skip connections]] across individual layers
+- [Skip Connections](Skip%20Connections.md) across individual layers
 - Conditionally
 - Soft gates
 - Learn vs carry
 - Gradients propagate further
-- Inspired by [[LSTM]] [[RNN]]s
+- Inspired by [LSTM](../../RNN/LSTM.md) [RNN](../../RNN/RNN.md)s
 
-![[highway-vs-residual.png]]
-![[skip-connections 1.png]]
+![highway-vs-residual](../../../../img/highway-vs-residual.png)
+![skip-connections 1](../../../../img/skip-connections%201.png)
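A sketch of the "soft gates, learn vs carry" idea above, assuming PyTorch (the carry-biased gate initialisation is the usual choice, an assumption here): a transform gate blends the learned path with the unchanged input, which is what lets gradients propagate further.

```python
import torch
import torch.nn as nn

class HighwayLayer(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.H = nn.Linear(dim, dim)   # "learn" path
        self.T = nn.Linear(dim, dim)   # soft gate
        self.T.bias.data.fill_(-2.0)   # start biased towards carrying the input

    def forward(self, x):
        t = torch.sigmoid(self.T(x))                      # gate in (0, 1), per unit
        return t * torch.relu(self.H(x)) + (1 - t) * x    # learn vs carry

y = HighwayLayer(64)(torch.randn(8, 64))
```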
@@ -24,7 +24,7 @@
 - No dropout
 
 [[Datasets#ImageNet|ImageNet]] Error:
-![[imagenet-error.png]]
+![imagenet-error](../../../../img/imagenet-error.png)
 
-![[resnet-arch.png]]
-![[resnet-arch2.png]]
+![resnet-arch](../../../../img/resnet-arch.png)
+![resnet-arch2](../../../../img/resnet-arch2.png)
@@ -1,16 +1,16 @@
-- Output of [[Convolutional Layer|conv]], c, layers are added to inputs of [[upconv]], d, layers
+- Output of [[Convolutional Layer|conv]], c, layers are added to inputs of [UpConv](../UpConv.md), d, layers
 - Element-wise, not channel appending
 - Propagate high frequency information to later layers
 - Two types
 - Additive
-- [[ResNet]]
-- [[Super-resolution]] auto-encoder
+- [ResNet](ResNet.md)
+- [Super-Resolution](Super-Resolution.md) auto-encoder
 - Concatenative
 - Densely connected architectures
 - DenseNet
-- [[FlowNet]]
+- [FlowNet](FlowNet.md)
 
-![[STEM/img/skip-connections.png]]
+![skip-connections](../../../../img/skip-connections.png)
 
 [AI Summer - Skip Connections](https://theaisummer.com/skip-connections/)
 [Arxiv - Visualising the Loss Landscape](https://arxiv.org/abs/1712.09913)
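The additive/concatenative split above in one illustrative PyTorch fragment (shapes are made up): additive skips merge element-wise and need matching shapes, while concatenative skips append the encoder's activation maps along the channel axis.

```python
import torch

c = torch.randn(1, 64, 32, 32)  # conv (encoder) activation map
d = torch.randn(1, 64, 32, 32)  # upconv (decoder) input at the same resolution

additive = c + d                           # ResNet / super-resolution style
concatenative = torch.cat([c, d], dim=1)   # DenseNet / FlowNet style -> 128 channels
```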
@@ -7,6 +7,6 @@
 - Unsupervised?
 - Decoder stage
 - Identical architecture to encoder
-![[super-res.png]]
+![super-res](../../../../img/super-res.png)
 - Is actually contractive/up sampling
-![[superres-results.png]]
+![superres-results](../../../../img/superres-results.png)
@@ -12,8 +12,8 @@ Deep [Convolutional](../../../../Signal%20Proc/Convolution.md) [GAN](GAN.md)
 - Train using Gaussian random noise for code
 - Discriminator
 - Contractive
-- Cross-entropy [[Deep Learning#Loss Function|loss]]
-- [[Convolutional Layer|Conv]] and leaky [[Activation Functions#ReLu|ReLu]] layers only
+- Cross-entropy [loss](../../Deep%20Learning.md#Loss%20Function)
+- [Conv](../Convolutional%20Layer.md) and leaky [[Activation Functions#ReLu|ReLu]] layers only
 - Normalised output via [[Activation Functions#Sigmoid|sigmoid]]
 
 ## [[Deep Learning#Loss Function|Loss]]
@@ -1,6 +1,6 @@
-# Fully [[Convolution]]al
-- Remove [[Max Pooling]]
-- Use strided [[upconv]]
+# Fully [Convolution](../../../../Signal%20Proc/Convolution.md)al
+- Remove [Max Pooling](../Max%20Pooling.md)
+- Use strided [UpConv](../UpConv.md)
 - Remove [[MLP|FC]] layers
 - Hurts convergence in non-classification
 - Normalisation tricks
@@ -16,16 +16,16 @@
 - Discriminator is a classifier
 - Is image fake or real
 
-![[gan-arch.png]]
-![[gan-arch2.png]]
+![gan-arch](../../../../img/gan-arch.png)
+![gan-arch2](../../../../img/gan-arch2.png)
 
-![[gan-results.png]]
+![gan-results](../../../../img/gan-results.png)]
 
 # Training
-![[gan-training-discriminator.png]]
-![[gan-training-generator.png]]
+![gan-training-discriminator](../../../../img/gan-training-discriminator.png)
+![gan-training-generator](../../../../img/gan-training-generator.png)
 
 # Code Vector Math for Control
-![[cvmfc.png]]
+![cvmfc](../../../../img/cvmfc.png)
 - Do [[Interpretation#Activation Maximisation|AM]] to derive code for an image
-![[code-vector-math-for-control-results.png]]
+![code-vector-math-for-control-results](../../../../img/code-vector-math-for-control-results.png)
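One illustrative training step for the two phases pictured above, assuming PyTorch; the MLP generator/discriminator and all sizes are stand-in assumptions (the notes' versions are conv/upconv stacks):

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(100, 256), nn.ReLU(), nn.Linear(256, 784), nn.Tanh())
D = nn.Sequential(nn.Linear(784, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1), nn.Sigmoid())
bce = nn.BCELoss()
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)

real = torch.rand(32, 784)               # stand-in for a batch of real images

# Discriminator step: real -> 1, fake -> 0
fake = G(torch.randn(32, 100)).detach()  # Gaussian code; detach so G isn't updated
loss_d = bce(D(real), torch.ones(32, 1)) + bce(D(fake), torch.zeros(32, 1))
opt_d.zero_grad(); loss_d.backward(); opt_d.step()

# Generator step: push D towards calling fakes real
loss_g = bce(D(G(torch.randn(32, 100))), torch.ones(32, 1))
opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```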
@@ -1,6 +1,6 @@
 - Feed output from synthesis into up-res network
 - Generate standard low-res image
-- Feed into [[cGAN]]
+- Feed into [cGAN](cGAN.md)
 
-![[stackgan.png]]
-![[stackgan-results.png]]
+![stackgan](../../../../img/stackgan.png)
+![stackgan-results](../../../../img/stackgan-results.png)
@@ -3,8 +3,8 @@
 - Couple of different scales
 - Concatenate results
 
-![[inception-layer-effect.png]]
-![[inception-layer-arch.png]]
+![inception-layer-effect](../../../img/inception-layer-effect.png)
+![inception-layer-arch](../../../img/inception-layer-arch.png)
 
 - 1 x 1
 - Averages over channels
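A simplified sketch of the multi-scale-then-concatenate idea above, assuming PyTorch (the real GoogLeNet block also adds 1x1 reductions before the larger kernels and a pooling branch):

```python
import torch
import torch.nn as nn

class InceptionBlock(nn.Module):
    def __init__(self, c_in: int):
        super().__init__()
        self.b1 = nn.Conv2d(c_in, 16, 1)              # 1x1: averages/mixes over channels
        self.b3 = nn.Conv2d(c_in, 16, 3, padding=1)   # mid scale
        self.b5 = nn.Conv2d(c_in, 16, 5, padding=2)   # coarser scale

    def forward(self, x):
        return torch.cat([self.b1(x), self.b3(x), self.b5(x)], dim=1)

out = InceptionBlock(3)(torch.randn(1, 3, 32, 32))    # -> (1, 48, 32, 32)
```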
@@ -3,7 +3,7 @@
 - Maximise 1-hot output
 - Maximise [[Activation Functions#SoftMax|SoftMax]]
 
-![[am.png]]
+![am](../../../img/am.png)
 - **Use trained network**
 - Don't update weights
 - [[Architectures|Feedforward]] noise
@@ -11,7 +11,7 @@
 - Don't update weights
 - Update image
 
-![[am-process.png]]
+![am-process](../../../img/am-process.png)
 ## Regulariser
 - Fit to natural image statistics
 - Prone to high frequency noise
@@ -24,7 +24,7 @@ $$x^*=\text{argmin}_{x\in \mathbb R^{H\times W\times C}}\mathcal l(\phi(x),\phi_
 $$x^*=\text{argmin}_{x\in \mathbb R^{H\times W\times C}}\mathcal l(\phi(x),\phi_0)+\lambda\mathcal R(x)$$
 - Need a regulariser like above
 
-![[am-regulariser.png]]
+![am-regulariser](../../../img/am-regulariser.png)
 
 $$\mathcal R_{V^\beta}(f)=\int_\Omega\left(\left(\frac{\partial f}{\partial u}(u,v)\right)^2+\left(\frac{\partial f}{\partial v}(u,v)\right)^2\right)^{\frac \beta 2}du\space dv$$
 
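Putting the activation-maximisation pieces above together (fixed trained network, image updated by gradient steps, plus a discrete version of the $\mathcal R_{V^\beta}$ regulariser) as a hedged PyTorch sketch; the tiny stand-in model, step count, and $\lambda$ are assumptions:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))  # stand-in "trained" net
for p in model.parameters():
    p.requires_grad = False                  # don't update weights

x = torch.randn(1, 1, 28, 28, requires_grad=True)            # feedforward noise
opt = torch.optim.Adam([x], lr=0.1)          # update the image instead

def tv_reg(img, beta=2.0):
    du = img[..., :, 1:] - img[..., :, :-1]  # discrete d/du
    dv = img[..., 1:, :] - img[..., :-1, :]  # discrete d/dv
    return (du[..., 1:, :] ** 2 + dv[..., :, 1:] ** 2).pow(beta / 2).sum()

target = 3                                   # 1-hot class to maximise
for _ in range(200):
    opt.zero_grad()
    loss = -model(x)[0, target] + 1e-3 * tv_reg(x)
    loss.backward()
    opt.step()
```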
@@ -5,7 +5,7 @@
 - Max value is the good bit
 - No parameters
 
-![[max-pooling.png]]
+![max-pooling](../../../img/max-pooling.png)
 
 ## Design Parameters
 - Size of input image
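For concreteness, 2x2 max pooling with stride 2 on a 4x4 map in NumPy (the values are arbitrary): each window keeps only its max, and there is nothing to learn.

```python
import numpy as np

x = np.array([[1, 3, 2, 4],
              [5, 6, 7, 8],
              [3, 2, 1, 0],
              [1, 2, 3, 4]], dtype=float)

pooled = x.reshape(2, 2, 2, 2).max(axis=(1, 3))  # max over each 2x2 window
print(pooled)  # [[6. 8.]
               #  [3. 4.]]
```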
@@ -2,4 +2,4 @@
 - Apply kernel to same location of all channels
 - Pixels in window divided by sum of pixel within volume across channels
 
-![[cnn-normalisation.png]]
+![cnn-normalisation](../../../img/cnn-normalisation.png)
@@ -1,11 +1,11 @@
 - Fractionally strided convolution
-- Transposed [[convolution]]
+- Transposed [Convolution](../../../Signal%20Proc/Convolution.md)
 - Like a deep interpolation
 - Convolution with a fractional input stride
 - Up-sampling is convolution 'in reverse'
 - Not an actual inverse convolution
 - For scaling up by a factor of $f$
-- Consider as a [[convolution]] of stride $1/f$
+- Consider as a [Convolution](../../../Signal%20Proc/Convolution.md) of stride $1/f$
 - Could specify kernel
 - Or learn
 - Can have multiple upconv layers
@@ -13,23 +13,23 @@
 - For non-linear up-sampling conv
 - Interpolation is linear
 
-![[upconv.png]]
+![upconv](../../../img/upconv.png)
 
 # Convolution Matrix
 Normal
 
-![[upconv-matrix.png]]
+![upconv-matrix](../../../img/upconv-matrix.png)
 
 - Equivalent operation with a flattened input
 - Row per kernel location
 - Many-to-one operation
 
-![[upconv-matrix-result.png]]
+![upconv-matrix-result](../../../img/upconv-matrix-result.png)
 
 [Understanding transposed convolutions](https://www.machinecurve.com/index.php/2019/09/29/understanding-transposed-convolutions/)
 
 ## Transposed
-![[upconv-transposed-matrix.png]]
+![upconv-transposed-matrix](../../../img/upconv-transposed-matrix.png)
 - One-to-many
 
-![[upconv-matrix-transposed-result.png]]
+![upconv-matrix-transposed-result](../../../img/upconv-matrix-transposed-result.png)
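A 1-D NumPy illustration of the convolution-matrix view above (2-D is the same idea with a sparser matrix; the kernel values are arbitrary): one row per kernel location gives the many-to-one forward pass, and multiplying by the transpose gives the one-to-many up-sampling.

```python
import numpy as np

# Kernel [1, 2, 3] over a length-5 input, stride 1, no padding
C = np.array([[1., 2., 3., 0., 0.],   # one row per kernel location
              [0., 1., 2., 3., 0.],
              [0., 0., 1., 2., 3.]])

x = np.arange(5.0)        # flattened input
y = C @ x                 # convolution as matrix multiply: 5 -> 3 (many-to-one)
up = C.T @ y              # transposed convolution: 3 -> 5 (one-to-many)
print(y.shape, up.shape)  # (3,) (5,)
```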
@@ -1 +1 @@
-![[cnn-cv-layer-arch.png]]
+![cnn-cv-layer-arch](../../../img/cnn-cv-layer-arch.png)
@@ -3,20 +3,20 @@
 - Universal approximation theorem
 - Each hidden layer can operate as a different feature extraction layer
 - Lots of [[Weight Init|weights]] to learn
-- [[Back-Propagation]] is supervised
+- [Back-Propagation](Back-Propagation.md) is supervised
 
-![[mlp-arch.png]]
+![mlp-arch](../../../img/mlp-arch.png)
 
 # Universal Approximation Theory
 A finite [[Architectures|feedforward]] MLP with 1 hidden layer can in theory approximate any mathematical function
 - In practice not trainable with [[Back-Propagation|BP]]
 
-![[activation-function.png]]
-![[mlp-arch-diagram.png]]
+![activation-function](../../../img/activation-function.png)
+![mlp-arch-diagram](../../../img/mlp-arch-diagram.png)
 ## Weight Matrix
 - Use matrix multiplication for layer output
 - TLU is hard limiter
-![[tlu.png]]
+![tlu](../../../img/tlu.png)
 - $o_1$ to $o_4$ must all be one to overcome -3.5 bias and force output to 1
-![[mlp-non-linear-decision.png]]
+![mlp-non-linear-decision](../../../img/mlp-non-linear-decision.png)
 - Can generate a non-linear [[Decision Boundary|decision boundary]]
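The weight-matrix and hard-limiter points above, as a tiny NumPy check of the -3.5-bias example (unit weights are the implied choice, an assumption here):

```python
import numpy as np

def tlu(v):
    return (v > 0).astype(float)   # hard limiter

o = np.array([1., 1., 1., 1.])     # o1..o4 from the previous layer
w = np.ones(4)
v = w @ o - 3.5                    # layer output via matrix multiplication + bias
print(tlu(v))                      # 1.0; flips to 0.0 if any input drops to 0
```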
@@ -7,7 +7,7 @@
 - Interneuron connection strengths store acquired knowledge
 - Synaptic weights
 
-![[slp-arch.png]]
+![slp-arch](../../img/slp-arch.png)
 
 A neural network is a directed graph consisting of nodes with interconnecting synaptic and activation links, and is characterised by four properties
 
@@ -1,7 +1,7 @@
 Long Short Term Memory
 
-- More general form of [[RNN]]
+- More general form of [RNN](RNN.md)
 - Explicitly encode memory state, C
 
-![[lstm.png]]
-![[lstm-slp.png]]
+![lstm](../../../img/lstm.png)
+![lstm-slp](../../../img/lstm-slp.png)
@@ -13,5 +13,5 @@ Recurrent Neural Network
 - In practice suffers from vanishing gradient
 - Can't extract precise information about previous tokens
 
-![[rnn-input.png]]
-![[rnn-recurrence.png]]
+![rnn-input](../../../img/rnn-input.png)
+![rnn-recurrence](../../../img/rnn-recurrence.png)
@@ -1,17 +1,17 @@
 Visual Question Answering
 
 - Combine visual with text sequence
-- [[CNN]] + [[LSTM]]
+- [CNN](../CNN/CNN.md) + [LSTM](LSTM.md)
 - Generate text from images
 - Automatic scene description
 - Cross-modal
 
-![[cnn+lstm.png]]
+![cnn+lstm](../../../img/cnn+lstm.png)
 - Word embedding not character
 
 # Freeform
 - Encode facts with two text streams
-![[vqa-block.png]]
+![vqa-block](../../../img/vqa-block.png)
 # Limitations
 - Repetitive answers
 - Not much variation
@@ -59,16 +59,16 @@ $$\hat{w}(n+1)=\hat{w}(n)+\eta \cdot x(n) \cdot e(n)$$
 - Sensitivity to variation in eigenstructure of input
 - Typically requires iterations of 10 x dimensionality of the input space
 - Worse with high-d input spaces
-![[slp-mse.png]]
+![slp-mse](../../../img/slp-mse.png)
 - Use steepest descent
 - Partial derivatives
-![[slp-steepest-descent.png]]
+![slp-steepest-descent](../../../img/slp-steepest-descent.png)
 - Can be solved by matrix inversion
 - Stochastic
 - Random progress
 - Will overall improve
 
-![[lms-algorithm.png]]
+![lms-algorithm](../../../img/lms-algorithm.png)
 
 $$\hat{w}(n+1)=\hat{w}(n)+\eta\cdot x(n)\cdot[d(n)-x^T(n)\cdot\hat w(n)]$$
 $$=[I-\eta\cdot x(n)x^T(n)]\cdot\hat{w}(n)+\eta\cdot x(n)\cdot d(n)$$
@@ -76,6 +76,6 @@ $$=[I-\eta\cdot x(n)x^T(n)]\cdot\hat{w}(n)+\eta\cdot x(n)\cdot d(n)$$
 Where
 $$\hat w(n)=z^{-1}[\hat w(n+1)]$$
 ## Independence Theory
-![[slp-lms-independence.png]]
+![slp-lms-independence](../../../img/slp-lms-independence.png)
 
-![[sl-lms-summary.png]]
+![sl-lms-summary](../../../img/sl-lms-summary.png)
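The LMS update rule in the hunks above is short enough to run directly; a NumPy sketch on synthetic data (the true weights, noise level, and $\eta$ are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
w_true = np.array([2.0, -1.0, 0.5])
w_hat = np.zeros(3)
eta = 0.01                          # small eta: slower but more stable

for n in range(5000):
    x = rng.standard_normal(3)                        # x(n)
    d = w_true @ x + 0.01 * rng.standard_normal()     # d(n)
    e = d - x @ w_hat                                 # e(n) = d(n) - x^T(n) w_hat(n)
    w_hat = w_hat + eta * x * e                       # w_hat(n+1) = w_hat(n) + eta x(n) e(n)

print(w_hat)    # approaches [2, -1, 0.5]
```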
@@ -39,4 +39,4 @@ $$
 2. Fast adaptation with respect to real changes in the underlying distribution of process responsible for $x$
 - Large eta
 
-![[slp-separable.png]]
+![slp-separable](../../../img/slp-separable.png)
@@ -1,7 +1,7 @@
-![[slp-arch.png]]
+![slp-arch](../../../img/slp-arch.png)
 $$v(n)=\sum_{i=0}^{m}w_i(n)x_i(n)$$
 $$=w^T(n)x(n)$$
-![[slp-hyperplane.png]]
+![slp-hyperplane](../../../img/slp-hyperplane.png)
 Perceptron learning is performed for a finite number of iterations and then stops
 
 [[Least Mean Square|LMS]] is continuous learning that doesn't stop
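To make the perceptron-vs-LMS contrast concrete, a NumPy sketch of the perceptron rule (the data and epoch budget are assumptions): it corrects only on mistakes and stops after a finite number of passes, unlike LMS, which keeps adapting.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 2))
y = np.where(X @ np.array([1.5, -1.0]) > 0, 1, -1)  # linearly separable labels

w = np.zeros(2)
for epoch in range(20):             # finite number of iterations, then stop
    mistakes = 0
    for x, t in zip(X, y):
        if np.sign(w @ x) != t:     # hard-limited output vs target
            w = w + t * x           # correct only on errors
            mistakes += 1
    if mistakes == 0:               # converged on separable data
        break
```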
@@ -8,9 +8,9 @@
 ## Hallucination
 
 # Architectures
-Mostly [[Transformers]]
+Mostly [Transformers](Transformers.md)
 
 ## GPT
-Generative Pre-trained [[Transformers]]
+Generative Pre-trained [Transformers](Transformers.md)
 
-![[llm-family-tree.png]]
+![llm-family-tree](../../../img/llm-family-tree.png)
@@ -1,15 +1,15 @@
 - [[Attention|Self-attention]]
 - Weighting significance of parts of the input
 - Including recursive output
-- Similar to [[RNN]]s
+- Similar to [RNN](../RNN/RNN.md)s
 - Process sequential data
 - Translation & text summarisation
 - Differences
 - Process input all at once
-- Largely replaced [[LSTM]] and gated recurrent units (GRU) which had attention mechanics
+- Largely replaced [LSTM](../RNN/LSTM.md) and gated recurrent units (GRU) which had attention mechanics
 - No recurrent structure
 
-![[transformer-arch.png]]
+![transformer-arch](../../../img/transformer-arch.png)
 
 ## Examples
 - BERT
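The "weight the significance of parts of the input, all at once" idea above is scaled dot-product self-attention; a NumPy sketch (single head, random projections, no masking, all assumptions):

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])           # pairwise significance
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)                # softmax over the sequence
    return w @ V                                      # weighted sum of values

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 16))                      # 5 tokens, processed at once
Wq, Wk, Wv = (rng.standard_normal((16, 16)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)                   # (5, 16); no recurrent structure
```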
@@ -34,6 +34,6 @@
 - Takes encodings and does opposite
 - Uses incorporated textual information to produce output
 - Has attention to draw information from output of previous decoders before drawing from encoders
-- Both use [[attention]]
+- Both use [Attention](Attention.md)
 - Both use [[MLP|dense]] layers for additional processing of outputs
 - Contain residual connections & layer norm steps
@@ -16,7 +16,7 @@ An AI system must be able to
 2. Apply knowledge to solve problems
 3. Acquire new knowledge through experience
 
-![[ai-nested-subjects.png]]
+![ai-nested-subjects](../img/ai-nested-subjects.png)
 
 # Expert Systems
 - Usually easier to obtain compiled experience from experts than duplicate experience that made them experts for network
@@ -56,4 +56,4 @@ Symbolic AI is the formal manipulation of a language of algorithms and data repr
 
 Neural nets bottom-up
 
-![[ai-io.png]]
+![ai-io](../img/ai-io.png)
@@ -5,18 +5,18 @@
 
 ### Object Models
 - COM
-- [[C++]]
+- [C++](Languages/C++.md)
 - Component Object Model
 - MS only cross-language model
 - CLI
-- [[dotNet]]
+- [dotNet](Languages/dotNet.md)
 - .NET Common Language Infrastructure
 - Freedesktop.org D-Bus
 - Open cross-platform-language model
 
 ### Virtual Machines
 - CLR
-- [[dotNet]]
+- [dotNet](Languages/dotNet.md)
 - .NET Common Language Runtime
 - Mono
 - CLI languages
Light.md
@@ -9,9 +9,9 @@ $$E=hf$$
 2. There is a minimum frequency n0 below which there is no emission
 3. No time delay (less than 1 ns) before the onset of emission – but the rate of electrons depends on the intensity.
 
-![[photo-electric.png]]
-![[fermi-vacuum-level.png]]
+![photo-electric](img/photo-electric.png)
+![fermi-vacuum-level](img/fermi-vacuum-level.png)
 
-![[em-spectrum.png]]
+![em-spectrum](img/em-spectrum.png)
 - Radio spectrum
 - 30 Hz – 300 GHz
@@ -11,7 +11,7 @@
 - Cube matrix
 
 Matrices are not inherently rank-2 tensors. Matrices are just the formatting structure. The tensor described by the matrix must follow the transformation rules to be a tensor
-![[tensor.png]]
+![tensor](../img/tensor.png)
 # Transformation Rules
 
 1. Transforms like a tensor
@@ -1,4 +1,4 @@
-[[Wave Function]]
+[Wave Function](Wave%20Function.md)
 
 ## Quantum Numbers
 $$n$$
@@ -17,7 +17,7 @@ Z-component / Magnetic of $l$
 - $-l$ to $+l$
 - ***Orientation*** of orbital
 
-![[wave-function-polar-segment.png]]
+![wave-function-polar-segment](../img/wave-function-polar-segment.png)
 
 ## Filling
 
@@ -30,11 +30,11 @@ Z-component / Magnetic of $l$
 - Orbitals with same energy filled one at a time
 - Degenerate
 
-![[orbitals-radius.png]]
-![[wave-function-nodes.png]]
+![orbitals-radius](../img/orbitals-radius.png)
+![wave-function-nodes](../img/wave-function-nodes.png)
 
 ## Radial
-![[radial-equations.png]]
+![radial-equations](../img/radial-equations.png)
 
 - Z = Atomic number
 - Bohr radius
@@ -42,4 +42,4 @@ Z-component / Magnetic of $l$
 - Normalisation
 - $\int_0^\infty r^2R_{nl}^*R_{nl}dr=1$
 
-![[radius-electron-density-wf.png]]
+![radius-electron-density-wf](../img/radius-electron-density-wf.png)
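The normalisation condition above can be checked numerically for the hydrogen 1s state, where $R_{10}=2(Z/a_0)^{3/2}e^{-Zr/a_0}$ (a standard result; the units and integration grid are assumptions):

```python
import numpy as np

a0, Z = 1.0, 1                                   # atomic units
r = np.linspace(0.0, 50.0, 200_001)
R10 = 2 * (Z / a0) ** 1.5 * np.exp(-Z * r / a0)  # hydrogen 1s radial function

dr = r[1] - r[0]
print(np.sum(r**2 * R10**2) * dr)                # ~1.0, as required
```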
@@ -1,6 +1,6 @@
 $$-\frac{\hbar^2}{2m}\nabla^2\psi+V\psi=E\psi$$
 - Time Independent
-- $\psi$ is the [[Wave Function]]
+- $\psi$ is the [Wave Function](Wave%20Function.md)
 
 Quantum counterpart of Newton's second law in classical mechanics
 
@@ -1,4 +1,4 @@
-![[model-table.png]]
+![model-table](../img/model-table.png)
 - 4 fundamental forces
 - Bosons
 - Elementary particles
@@ -22,5 +22,5 @@
 - Force carriers
 - y, W, Z, g
 
-![[boson-interactions-feynman.png]]
-![[boson-interactions.png]]
+![boson-interactions-feynman](../img/boson-interactions-feynman.png)
+![boson-interactions](../img/boson-interactions.png)
@@ -5,17 +5,17 @@ Radial Function
 Spherical Harmonic
 - $Y_{ml}(\theta, \phi)$
 
-Forms [[Orbitals]]
+Forms [Orbitals](Orbitals.md)
 
 Absolute value of wave function squared gives probability density of finding electron inside differential volume $dV$ centred on $r, \theta, \phi$
 
 $$|\psi(r,\theta,\phi)|^2$$
-![[wave-function-polar.png]]
+![wave-function-polar](../img/wave-function-polar.png)
 
-![[hydrogen-wave-function.png]]
-![[wave-function-polar-segment.png]]
-![[wave-function-nodes.png]]
+![hydrogen-wave-function](../img/hydrogen-wave-function.png)
+![wave-function-polar-segment](../img/wave-function-polar-segment.png)
+![wave-function-nodes](../img/wave-function-nodes.png)
 
-![[hydrogen-electron-density.png]]
+![hydrogen-electron-density](../img/hydrogen-electron-density.png)
 
-![[radius-electron-density-wf.png]]
+![radius-electron-density-wf](../img/radius-electron-density-wf.png)