vault backup: 2023-06-05 17:01:29

Affected files:
Money/Assets/Financial Instruments.md
Money/Assets/Security.md
Money/Markets/Markets.md
Politcs/Now.md
STEM/AI/Neural Networks/CNN/Examples.md
STEM/AI/Neural Networks/CNN/FCN/FCN.md
STEM/AI/Neural Networks/CNN/FCN/FlowNet.md
STEM/AI/Neural Networks/CNN/FCN/Highway Networks.md
STEM/AI/Neural Networks/CNN/FCN/ResNet.md
STEM/AI/Neural Networks/CNN/FCN/Skip Connections.md
STEM/AI/Neural Networks/CNN/FCN/Super-Resolution.md
STEM/AI/Neural Networks/CNN/GAN/DC-GAN.md
STEM/AI/Neural Networks/CNN/GAN/GAN.md
STEM/AI/Neural Networks/CNN/GAN/StackGAN.md
STEM/AI/Neural Networks/CNN/Inception Layer.md
STEM/AI/Neural Networks/CNN/Interpretation.md
STEM/AI/Neural Networks/CNN/Max Pooling.md
STEM/AI/Neural Networks/CNN/Normalisation.md
STEM/AI/Neural Networks/CNN/UpConv.md
STEM/AI/Neural Networks/CV/Layer Structure.md
STEM/AI/Neural Networks/MLP/MLP.md
STEM/AI/Neural Networks/Neural Networks.md
STEM/AI/Neural Networks/RNN/LSTM.md
STEM/AI/Neural Networks/RNN/RNN.md
STEM/AI/Neural Networks/RNN/VQA.md
STEM/AI/Neural Networks/SLP/Least Mean Square.md
STEM/AI/Neural Networks/SLP/Perceptron Convergence.md
STEM/AI/Neural Networks/SLP/SLP.md
STEM/AI/Neural Networks/Transformers/LLM.md
STEM/AI/Neural Networks/Transformers/Transformers.md
STEM/AI/Properties.md
STEM/CS/Language Binding.md
STEM/Light.md
STEM/Maths/Tensor.md
STEM/Quantum/Orbitals.md
STEM/Quantum/Schrödinger.md
STEM/Quantum/Standard Model.md
STEM/Quantum/Wave Function.md
Tattoo/Music.md
Tattoo/Plans.md
Tattoo/Sources.md
This commit is contained in:
andy 2023-06-05 17:01:29 +01:00
parent 40f5eca82e
commit d7ab8f329a
34 changed files with 110 additions and 111 deletions

View File

@ -28,7 +28,7 @@
# GoogLeNet
2015
- [[Inception Layer]]s
- [Inception Layer](Inception%20Layer.md)s
- Multiple [[Deep Learning#Loss Function|Loss]] Functions
![googlenet](../../../img/googlenet.png)

View File

@ -1,4 +1,4 @@
Fully [[Convolution]]al Network
Fully [Convolution](../../../../Signal%20Proc/Convolution.md)al Network
[[Convolutional Layer|Convolutional]] and [[UpConv|up-convolutional layers]] with [[Activation Functions#ReLu|ReLu]] but no others (pooling)
- All some sort of Encoder-Decoder
@ -9,12 +9,11 @@ Contractive → [UpConv](../UpConv.md)
- For visual output
- Previously image $\rightarrow$ vector
- Additional layers to up-sample representation to an image
- Up-[[convolution]]al
- De-[[convolution]]al
- Up-[convolution](../../../../Signal%20Proc/Convolution.md)al
- De-[convolution](../../../../Signal%20Proc/Convolution.md)al
![[fcn-uses.png]]
![[fcn-arch.png]]
![fcn-uses](../../../../img/fcn-uses.png)
![fcn-arch](../../../../img/fcn-arch.png)
# Training
- Rarely from scratch
@ -22,7 +21,7 @@ Contractive → [UpConv](../UpConv.md)
- Replace final layers
- [[MLP|FC]] layers
- White-noise initialised
- Add [[upconv]] layer(s)
- Add [UpConv](../UpConv.md) layer(s)
- Fine-tune train
- Freeze others
- Annotated GT images
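A minimal sketch of this fine-tuning recipe, assuming a torchvision VGG16 backbone and illustrative sizes (21 classes, FCN-32s-style upsampling); only the new head is trained:

```python
import torch
import torch.nn as nn
from torchvision import models

# Rarely trained from scratch: start from a pretrained encoder and freeze it
backbone = models.vgg16(weights="IMAGENET1K_V1").features
for p in backbone.parameters():
    p.requires_grad = False

# Replace the final FC layers with randomly initialised conv + upconv layers
head = nn.Sequential(
    nn.Conv2d(512, 21, kernel_size=1),                                  # per-class scores
    nn.ConvTranspose2d(21, 21, kernel_size=64, stride=32, padding=16),  # upsample back to input size
)
fcn = nn.Sequential(backbone, head)

# Fine-tune only the new layers against annotated ground-truth masks
optimiser = torch.optim.SGD(head.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
```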

View File

@ -7,16 +7,16 @@ Optical Flow
![flownet](../../../../img/flownet.png)
# [[Skip Connections]]
# [Skip Connections](Skip%20Connections.md)
- Further through the network information is condensed
- Less high frequency information
- Link encoder layers to [[upconv]] layers
- Link encoder layers to [UpConv](../UpConv.md) layers
- Append activation maps from encoder to decoder
# Encode
![flownet-encode](../../../../img/flownet-encode.png)
# [[Upconv]]
# [UpConv](../UpConv.md)
![flownet-upconv](../../../../img/flownet-upconv.png)
# Training

View File

@ -1,9 +1,9 @@
- [[Skip connections]] across individual layers
- [Skip Connections](Skip%20Connections.md) across individual layers
- Conditionally
- Soft gates
- Learn vs carry
- Gradients propagate further
- Inspired by [[LSTM]] [[RNN]]s
- Inspired by [LSTM](../../RNN/LSTM.md) [RNN](../../RNN/RNN.md)s
![[highway-vs-residual.png]]
![[skip-connections 1.png]]
![highway-vs-residual](../../../../img/highway-vs-residual.png)
![skip-connections 1](../../../../img/skip-connections%201.png)
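A minimal sketch of the soft-gating idea, assuming a single fully connected highway layer: a learned gate $T(x)$ blends the transformed path ("learn") with the identity path ("carry"):

```python
import torch
import torch.nn as nn

class HighwayLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.H = nn.Linear(dim, dim)   # transform path (learn)
        self.T = nn.Linear(dim, dim)   # gate path (learn vs carry)

    def forward(self, x):
        t = torch.sigmoid(self.T(x))                     # soft gate in [0, 1]
        return t * torch.relu(self.H(x)) + (1 - t) * x   # carry term lets gradients propagate further

y = HighwayLayer(64)(torch.randn(8, 64))
```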

View File

@ -24,7 +24,7 @@
- No dropout
[[Datasets#ImageNet|ImageNet]] Error:
![[imagenet-error.png]]
![imagenet-error](../../../../img/imagenet-error.png)
![[resnet-arch.png]]
![[resnet-arch2.png]]
![resnet-arch](../../../../img/resnet-arch.png)
![resnet-arch2](../../../../img/resnet-arch2.png)

View File

@ -1,16 +1,16 @@
- Output of [[Convolutional Layer|conv]], c, layers are added to inputs of [[upconv]], d, layers
- Output of [[Convolutional Layer|conv]], c, layers are added to inputs of [UpConv](../UpConv.md), d, layers
- Element-wise, not channel appending
- Propagate high frequency information to later layers
- Two types
- Additive
- [[ResNet]]
- [[Super-resolution]] auto-encoder
- [ResNet](ResNet.md)
- [Super-Resolution](Super-Resolution.md) auto-encoder
- Concatenative
- Densely connected architectures
- DenseNet
- [[FlowNet]]
- [FlowNet](FlowNet.md)
![[STEM/img/skip-connections.png]]
![skip-connections](../../../../img/skip-connections.png)
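A toy sketch of the two types, assuming the conv (c) and upconv (d) feature maps already share a spatial size:

```python
import torch

c = torch.randn(1, 64, 32, 32)   # output of an encoder conv layer
d = torch.randn(1, 64, 32, 32)   # input to a decoder upconv layer

additive = d + c                           # element-wise, ResNet / super-resolution style
concatenative = torch.cat([d, c], dim=1)   # channel appending, DenseNet / FlowNet style
```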
[AI Summer - Skip Connections](https://theaisummer.com/skip-connections/)
[Arxiv - Visualising the Loss Landscape](https://arxiv.org/abs/1712.09913)

View File

@ -7,6 +7,6 @@
- Unsupervised?
- Decoder stage
- Identical architecture to encoder
![[super-res.png]]
![super-res](../../../../img/super-res.png)
- Is actually contractive / up-sampling
![[superres-results.png]]
![superres-results](../../../../img/superres-results.png)

View File

@ -12,8 +12,8 @@ Deep [Convolutional](../../../../Signal%20Proc/Convolution.md) [GAN](GAN.md)
- Train using Gaussian random noise for code
- Discriminator
- Contractive
- Cross-entropy [[Deep Learning#Loss Function|loss]]
- [[Convolutional Layer|Conv]] and leaky [[Activation Functions#ReLu|ReLu]] layers only
- Cross-entropy [loss](../../Deep%20Learning.md#Loss%20Function)
- [Conv](../Convolutional%20Layer.md) and leaky [[Activation Functions#ReLu|ReLu]] layers only
- Normalised output via [[Activation Functions#Sigmoid|sigmoid]]
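A minimal sketch of such a discriminator, assuming 64×64 RGB inputs and illustrative channel counts; the sigmoid output feeds a binary cross-entropy loss:

```python
import torch.nn as nn

discriminator = nn.Sequential(                                 # contractive: strided convs, no pooling or FC
    nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.Conv2d(128, 256, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.Conv2d(256, 1, kernel_size=8),                          # 8x8 -> 1x1 real/fake score
    nn.Sigmoid(),                                              # normalised output
)
criterion = nn.BCELoss()                                       # cross-entropy on real/fake labels
```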
## [[Deep Learning#Loss Function|Loss]]

View File

@ -1,6 +1,6 @@
# Fully [[Convolution]]al
- Remove [[Max Pooling]]
- Use strided [[upconv]]
# Fully [Convolution](../../../../Signal%20Proc/Convolution.md)al
- Remove [Max Pooling](../Max%20Pooling.md)
- Use strided [UpConv](../UpConv.md)
- Remove [[MLP|FC]] layers
- Hurts convergence in non-classification
- Normalisation tricks
@ -16,16 +16,16 @@
- Discriminator is a classifier
- Is image fake or real
![[gan-arch.png]]
![[gan-arch2.png]]
![gan-arch](../../../../img/gan-arch.png)
![gan-arch2](../../../../img/gan-arch2.png)
![[gan-results.png]]
![gan-results](../../../../img/gan-results.png)
# Training
![[gan-training-discriminator.png]]
![[gan-training-generator.png]]
![gan-training-discriminator](../../../../img/gan-training-discriminator.png)
![gan-training-generator](../../../../img/gan-training-generator.png)
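A sketch of the alternating update, assuming `generator`, `discriminator`, their optimisers, and a `BCELoss` criterion are already defined (all names here are illustrative):

```python
import torch

def gan_step(generator, discriminator, opt_g, opt_d, real, criterion, code_dim=100):
    # 1) Train the discriminator as a classifier: real -> 1, generated -> 0
    fake = generator(torch.randn(real.size(0), code_dim, 1, 1)).detach()
    d_real, d_fake = discriminator(real), discriminator(fake)
    d_loss = criterion(d_real, torch.ones_like(d_real)) + criterion(d_fake, torch.zeros_like(d_fake))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # 2) Train the generator to make the discriminator output "real"
    d_gen = discriminator(generator(torch.randn(real.size(0), code_dim, 1, 1)))
    g_loss = criterion(d_gen, torch.ones_like(d_gen))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()
```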
# Code Vector Math for Control
![[cvmfc.png]]
![cvmfc](../../../../img/cvmfc.png)
- Do [[Interpretation#Activation Maximisation|AM]] to derive code for an image
![[code-vector-math-for-control-results.png]]
![code-vector-math-for-control-results](../../../../img/code-vector-math-for-control-results.png)

View File

@ -1,6 +1,6 @@
- Feed output from synthesis into up-res network
- Generate standard low-res image
- Feed into [[cGAN]]
- Feed into [cGAN](cGAN.md)
![[stackgan.png]]
![[stackgan-results.png]]
![stackgan](../../../../img/stackgan.png)
![stackgan-results](../../../../img/stackgan-results.png)

View File

@ -3,8 +3,8 @@
- Couple of different scales
- Concatenate results
![[inception-layer-effect.png]]
![[inception-layer-arch.png]]
![inception-layer-effect](../../../img/inception-layer-effect.png)
![inception-layer-arch](../../../img/inception-layer-arch.png)
- 1 x 1
- Averages over channels
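A minimal sketch of a block that applies a couple of different scales in parallel and concatenates the results (filter counts are illustrative):

```python
import torch
import torch.nn as nn

class InceptionBlock(nn.Module):
    def __init__(self, c_in):
        super().__init__()
        self.b1 = nn.Conv2d(c_in, 16, kernel_size=1)             # 1x1: mixes/averages over channels
        self.b3 = nn.Conv2d(c_in, 16, kernel_size=3, padding=1)  # medium scale
        self.b5 = nn.Conv2d(c_in, 16, kernel_size=5, padding=2)  # larger scale

    def forward(self, x):
        return torch.cat([self.b1(x), self.b3(x), self.b5(x)], dim=1)  # concatenate results

y = InceptionBlock(64)(torch.randn(1, 64, 28, 28))   # -> (1, 48, 28, 28)
```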

View File

@ -3,7 +3,7 @@
- Maximise 1-hot output
- Maximise [[Activation Functions#SoftMax|SoftMax]]
![[am.png]]
![am](../../../img/am.png)
- **Use trained network**
- Don't update weights
- [[Architectures|Feedforward]] noise
@ -11,7 +11,7 @@
- Don't update weights
- Update image
![[am-process.png]]
![am-process](../../../img/am-process.png)
## Regulariser
- Fit to natural image statistics
- Prone to high frequency noise
@ -24,7 +24,7 @@ $$x^*=\text{argmin}_{x\in \mathbb R^{H\times W\times C}}\mathcal l(\phi(x),\phi_
$$x^*=\text{argmin}_{x\in \mathbb R^{H\times W\times C}}\mathcal l(\phi(x),\phi_0)+\lambda\mathcal R(x)$$
- Need a regulariser like above
![[am-regulariser.png]]
![am-regulariser](../../../img/am-regulariser.png)
$$\mathcal R_{V^\beta}(f)=\int_\Omega\left(\left(\frac{\partial f}{\partial u}(u,v)\right)^2+\left(\frac{\partial f}{\partial v}(u,v)\right)^2\right)^{\frac \beta 2}du\space dv$$
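A sketch of this optimisation, assuming a pretrained classifier `net` and a target class index; only the image is updated, and the regulariser is a finite-difference version of $\mathcal R_{V^\beta}$ above:

```python
import torch

def activation_maximisation(net, target_class, steps=200, lr=0.1, lam=1e-4, beta=2.0):
    net.eval()                                                      # trained network; its weights are never updated
    x = (0.1 * torch.randn(1, 3, 224, 224)).requires_grad_(True)   # feed forward noise, update the image
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        score = net(x)[0, target_class]                             # activation to maximise
        du = x[..., 1:, :] - x[..., :-1, :]                         # finite-difference partial derivatives
        dv = x[..., :, 1:] - x[..., :, :-1]
        reg = (du[..., :, :-1] ** 2 + dv[..., :-1, :] ** 2).pow(beta / 2).sum()
        loss = -score + lam * reg                                   # argmin of -activation + lambda * R(x)
        opt.zero_grad(); loss.backward(); opt.step()
    return x.detach()
```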

View File

@ -5,7 +5,7 @@
- Max value is the good bit
- No parameters
![[max-pooling.png]]
![max-pooling](../../../img/max-pooling.png)
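A toy numpy sketch of 2×2 max pooling with stride 2; there are no parameters, each window just keeps its max value:

```python
import numpy as np

x = np.arange(16).reshape(4, 4)                    # single-channel feature map
pooled = x.reshape(2, 2, 2, 2).max(axis=(1, 3))    # non-overlapping 2x2 windows
print(pooled)                                      # [[ 5  7]
                                                   #  [13 15]]
```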
## Design Parameters
- Size of input image

View File

@ -2,4 +2,4 @@
- Apply kernel to same location of all channels
- Pixels in the window divided by the sum of pixels within the volume across channels
![[cnn-normalisation.png]]
![cnn-normalisation](../../../img/cnn-normalisation.png)

View File

@ -1,11 +1,11 @@
- Fractionally strided convolution
- Transposed [[convolution]]
- Transposed [Convolution](../../../Signal%20Proc/Convolution.md)
- Like a deep interpolation
- Convolution with a fractional input stride
- Up-sampling is convolution 'in reverse'
- Not an actual inverse convolution
- For scaling up by a factor of $f$
- Consider as a [[convolution]] of stride $1/f$
- Consider as a [Convolution](../../../Signal%20Proc/Convolution.md) of stride $1/f$
- Could specify kernel
- Or learn
- Can have multiple upconv layers
@ -13,23 +13,23 @@
- For non-linear up-sampling conv
- Interpolation is linear
![[upconv.png]]
![upconv](../../../img/upconv.png)
# Convolution Matrix
Normal
![[upconv-matrix.png]]
![upconv-matrix](../../../img/upconv-matrix.png)
- Equivalent operation with a flattened input
- Row per kernel location
- Many-to-one operation
![[upconv-matrix-result.png]]
![upconv-matrix-result](../../../img/upconv-matrix-result.png)
[Understanding transposed convolutions](https://www.machinecurve.com/index.php/2019/09/29/understanding-transposed-convolutions/)
## Transposed
![[upconv-transposed-matrix.png]]
![upconv-transposed-matrix](../../../img/upconv-transposed-matrix.png)
- One-to-many
![[upconv-matrix-transposed-result.png]]
![upconv-matrix-transposed-result](../../../img/upconv-matrix-transposed-result.png)
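A numpy sketch of the matrix view: a 3×3 kernel over a 4×4 input becomes a 4×16 convolution matrix $C$ with one row per kernel location (many-to-one); multiplying by $C^T$ scatters 4 values back over 16 positions (one-to-many), i.e. the transposed convolution:

```python
import numpy as np

k = np.arange(1, 10).reshape(3, 3)      # 3x3 kernel
C = np.zeros((4, 16))                   # one row per kernel location (2x2 output positions)
for i in range(2):
    for j in range(2):
        placed = np.zeros((4, 4))
        placed[i:i + 3, j:j + 3] = k    # kernel slid to location (i, j)
        C[i * 2 + j] = placed.ravel()

x = np.random.rand(16)                  # flattened 4x4 input
y = C @ x                               # normal convolution: 16 values -> 4 (many-to-one)
x_up = C.T @ y                          # transposed convolution: 4 values -> 16 (one-to-many)
print(y.shape, x_up.shape)              # (4,) (16,)
```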

View File

@ -1 +1 @@
![[cnn-cv-layer-arch.png]]
![cnn-cv-layer-arch](../../../img/cnn-cv-layer-arch.png)

View File

@ -3,20 +3,20 @@
- Universal approximation theorem
- Each hidden layer can operate as a different feature extraction layer
- Lots of [[Weight Init|weights]] to learn
- [[Back-Propagation]] is supervised
- [Back-Propagation](Back-Propagation.md) is supervised
![[mlp-arch.png]]
![mlp-arch](../../../img/mlp-arch.png)
# Universal Approximation Theorem
A finite [[Architectures|feedforward]] MLP with 1 hidden layer can in theory approximate any mathematical function
- In practice not trainable with [[Back-Propagation|BP]]
![[activation-function.png]]
![[mlp-arch-diagram.png]]
![activation-function](../../../img/activation-function.png)
![mlp-arch-diagram](../../../img/mlp-arch-diagram.png)
## Weight Matrix
- Use matrix multiplication for layer output
- TLU is hard limiter
![[tlu.png]]
![tlu](../../../img/tlu.png)
- $o_1$ to $o_4$ must all be one to overcome -3.5 bias and force output to 1
![[mlp-non-linear-decision.png]]
![mlp-non-linear-decision](../../../img/mlp-non-linear-decision.png)
- Can generate a non-linear [[Decision Boundary|decision boundary]]
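A numpy sketch of that output unit: a TLU (hard limiter) over $o_1$ to $o_4$ with unit weights and a -3.5 bias, so the output only fires when all four are 1 (weights here are illustrative):

```python
import numpy as np

tlu = lambda v: int(v >= 0)          # hard limiter

o = np.array([1, 1, 1, 0])           # hidden-layer outputs o1..o4
w = np.ones(4)                       # unit weight from each hidden output
bias = -3.5
print(tlu(w @ o + bias))             # 0: only o = [1, 1, 1, 1] overcomes the bias and gives 1
```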

View File

@ -7,7 +7,7 @@
- Interneuron connection strengths store acquired knowledge
- Synaptic weights
![[slp-arch.png]]
![slp-arch](../../img/slp-arch.png)
A neural network is a directed graph consisting of nodes with interconnecting synaptic and activation links, and is characterised by four properties

View File

@ -1,7 +1,7 @@
Long Short Term Memory
- More general form of [[RNN]]
- More general form of [RNN](RNN.md)
- Explicitly encode memory state, C
![[lstm.png]]
![[lstm-slp.png]]
![lstm](../../../img/lstm.png)
![lstm-slp](../../../img/lstm-slp.png)
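A minimal numpy sketch of one LSTM step with the memory state C made explicit (standard gate equations; weight shapes are assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def lstm_step(x, h, C, W, U, b):
    """One step. W: (4d, n_in), U: (4d, d), b: (4d,), stacked for gates f, i, o and candidate g."""
    z = W @ x + U @ h + b
    f, i, o, g = np.split(z, 4)
    f, i, o, g = sigmoid(f), sigmoid(i), sigmoid(o), np.tanh(g)
    C = f * C + i * g                # explicitly encoded memory state
    h = o * np.tanh(C)               # hidden state passed on, as in a plain RNN
    return h, C
```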

View File

@ -13,5 +13,5 @@ Recurrent Neural Network
- In practice suffers from vanishing gradient
- Can't extract precise information about previous tokens
![[rnn-input.png]]
![[rnn-recurrence.png]]
![rnn-input](../../../img/rnn-input.png)
![rnn-recurrence](../../../img/rnn-recurrence.png)

View File

@ -1,17 +1,17 @@
Visual Question Answering
- Combine visual with text sequence
- [[CNN]] + [[LSTM]]
- [CNN](../CNN/CNN.md) + [LSTM](LSTM.md)
- Generate text from images
- Automatic scene description
- Cross-modal
![[cnn+lstm.png]]
![cnn+lstm](../../../img/cnn+lstm.png)
- Word embedding not character
# Freeform
- Encode facts with two text streams
![[vqa-block.png]]
![vqa-block](../../../img/vqa-block.png)
# Limitations
- Repetitive answers
- Not much variation

View File

@ -59,16 +59,16 @@ $$\hat{w}(n+1)=\hat{w}(n)+\eta \cdot x(n) \cdot e(n)$$
- Sensitivity to variation in eigenstructure of input
- Typically requires a number of iterations around 10 × the dimensionality of the input space
- Worse with high-d input spaces
![[slp-mse.png]]
![slp-mse](../../../img/slp-mse.png)
- Use steepest descent
- Partial derivatives
![[slp-steepest-descent.png]]
![slp-steepest-descent](../../../img/slp-steepest-descent.png)
- Can be solved by matrix inversion
- Stochastic
- Random progress
- Will overall improve
![[lms-algorithm.png]]
![lms-algorithm](../../../img/lms-algorithm.png)
$$\hat{w}(n+1)=\hat{w}(n)+\eta\cdot x(n)\cdot[d(n)-x^T(n)\cdot\hat w(n)]$$
$$=[I-\eta\cdot x(n)x^T(n)]\cdot\hat{w}(n)+\eta\cdot x(n)\cdot d(n)$$
@ -76,6 +76,6 @@ $$=[I-\eta\cdot x(n)x^T(n)]\cdot\hat{w}(n)+\eta\cdot x(n)\cdot d(n)$$
Where
$$\hat w(n)=z^{-1}[\hat w(n+1)]$$
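A numpy sketch of this stochastic update, assuming streams of input vectors $x(n)$ and desired responses $d(n)$:

```python
import numpy as np

def lms(xs, ds, eta=0.01):
    w = np.zeros(xs.shape[1])        # w_hat(0)
    for x, d in zip(xs, ds):         # random (stochastic) progress, improves overall
        e = d - x @ w                # e(n) = d(n) - x^T(n) w_hat(n)
        w = w + eta * x * e          # w_hat(n+1) = w_hat(n) + eta * x(n) * e(n)
    return w
```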
## Independence Theory
![[slp-lms-independence.png]]
![slp-lms-independence](../../../img/slp-lms-independence.png)
![[sl-lms-summary.png]]
![sl-lms-summary](../../../img/sl-lms-summary.png)

View File

@ -39,4 +39,4 @@ $$
2. Fast adaptation with respect to real changes in the underlying distribution of the process responsible for $x$
- Large eta
![[slp-separable.png]]
![slp-separable](../../../img/slp-separable.png)

View File

@ -1,7 +1,7 @@
![[slp-arch.png]]
![slp-arch](../../../img/slp-arch.png)
$$v(n)=\sum_{i=0}^{m}w_i(n)x_i(n)$$
$$=w^T(n)x(n)$$
![[slp-hyperplane.png]]
![slp-hyperplane](../../../img/slp-hyperplane.png)
Perceptron learning is performed for a finite number of iterations and then stops
[[Least Mean Square|LMS]] is continuous learning that doesn't stop

View File

@ -8,9 +8,9 @@
## Hallucination
# Architectures
Mostly [[Transformers]]
Mostly [Transformers](Transformers.md)
## GPT
Generative Pre-trained [[Transformers]]
Generative Pre-trained [Transformers](Transformers.md)
![[llm-family-tree.png]]
![llm-family-tree](../../../img/llm-family-tree.png)

View File

@ -1,15 +1,15 @@
- [[Attention|Self-attention]]
- Weighting significance of parts of the input
- Including recursive output
- Similar to [[RNN]]s
- Similar to [RNN](../RNN/RNN.md)s
- Process sequential data
- Translation & text summarisation
- Differences
- Process input all at once
- Largely replaced [[LSTM]] and gated recurrent units (GRU), which had attention mechanisms
- Largely replaced [LSTM](../RNN/LSTM.md) and gated recurrent units (GRU), which had attention mechanisms
- No recurrent structure
![[transformer-arch.png]]
![transformer-arch](../../../img/transformer-arch.png)
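A numpy sketch of single-head scaled dot-product self-attention, processing the whole input sequence at once (projection matrices are assumed given):

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """X: (seq_len, d_model); Wq/Wk/Wv: (d_model, d_k). Weights every position against every other."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])           # significance of each part of the input
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over the sequence
    return weights @ V                                # weighted sum of values
```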
## Examples
- BERT
@ -34,6 +34,6 @@
- Takes encodings and does opposite
- Uses incorporated textual information to produce output
- Has attention to draw information from output of previous decoders before drawing from encoders
- Both use [[attention]]
- Both use [Attention](Attention.md)
- Both use [[MLP|dense]] layers for additional processing of outputs
- Contain residual connections & layer norm steps

View File

@ -16,7 +16,7 @@ An AI system must be able to
2. Apply knowledge to solve problems
3. Acquire new knowledge through experience
![[ai-nested-subjects.png]]
![ai-nested-subjects](../img/ai-nested-subjects.png)
# Expert Systems
- Usually easier to obtain compiled experience from experts than to duplicate the experience that made them experts for the network
@ -56,4 +56,4 @@ Symbolic AI is the formal manipulation of a language of algorithms and data repr
Neural nets bottom-up
![[ai-io.png]]
![ai-io](../img/ai-io.png)

View File

@ -5,18 +5,18 @@
### Object Models
- COM
- [[C++]]
- [C++](Languages/C++.md)
- Component Object Model
- MS only cross-language model
- CLI
- [[dotNet]]
- [dotNet](Languages/dotNet.md)
- .NET Common Language Infrastructure
- Freedesktop.org D-Bus
- Open cross-platform-language model
### Virtual Machines
- CLR
- [[dotNet]]
- [dotNet](Languages/dotNet.md)
- .NET Common Language Runtime
- Mono
- CLI languages

View File

@ -9,9 +9,9 @@ $$E=hf$$
2. There is a minimum frequency $\nu_0$ below which there is no emission
3. No time delay (less than 1 ns) before the onset of emission, but the rate of electrons depends on the intensity.
![[photo-electric.png]]
![[fermi-vacuum-level.png]]
![photo-electric](img/photo-electric.png)
![fermi-vacuum-level](img/fermi-vacuum-level.png)
![[em-spectrum.png]]
![em-spectrum](img/em-spectrum.png)
- Radio spectrum
- 30 Hz to 300 GHz

View File

@ -11,7 +11,7 @@
- Cube matrix
Matrices are not inherently rank-2 tensors. Matrices are just the formatting structure. The tensor described by the matrix must follow the transformation rules to be a tensor
![[tensor.png]]
![tensor](../img/tensor.png)
# Transformation Rules
1. Transforms like a tensor

View File

@ -1,4 +1,4 @@
[[Wave Function]]
[Wave Function](Wave%20Function.md)
## Quantum Numbers
$$n$$
@ -17,7 +17,7 @@ Z-component / Magnetic of $l$
- $-l$ to $+l$
- ***Orientation*** of orbital
![[wave-function-polar-segment.png]]
![wave-function-polar-segment](../img/wave-function-polar-segment.png)
## Filling
@ -30,11 +30,11 @@ Z-component / Magnetic of $l$
- Orbitals with same energy filled one at a time
- Degenerate
![[orbitals-radius.png]]
![[wave-function-nodes.png]]
![orbitals-radius](../img/orbitals-radius.png)
![wave-function-nodes](../img/wave-function-nodes.png)
## Radial
![[radial-equations.png]]
![radial-equations](../img/radial-equations.png)
- Z = Atomic number
- Bohr radius
@ -42,4 +42,4 @@ Z-component / Magnetic of $l$
- Normalisation
- $\int_0^\infty r^2R_{nl}^*R_{nl}dr=1$
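For example, the standard hydrogen 1s radial function gives a worked check of this normalisation condition (with Bohr radius $a_0$):
$$R_{10}(r)=\frac{2}{a_0^{3/2}}e^{-r/a_0},\qquad\int_0^\infty r^2R_{10}^2\,dr=\frac{4}{a_0^3}\int_0^\infty r^2e^{-2r/a_0}\,dr=\frac{4}{a_0^3}\cdot\frac{2!}{(2/a_0)^3}=1$$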
![[radius-electron-density-wf.png]]
![radius-electron-density-wf](../img/radius-electron-density-wf.png)

View File

@ -1,6 +1,6 @@
$$-\frac{\hbar^2}{2m}\nabla^2\psi+V\psi=E\psi$$
- Time Independent
- $\psi$ is the [[Wave Function]]
- $\psi$ is the [Wave Function](Wave%20Function.md)
Quantum counterpart of Newton's second law in classical mechanics

View File

@ -1,4 +1,4 @@
![[model-table.png]]
![model-table](../img/model-table.png)
- 4 fundamental forces
- Bosons
- Elementary particles
@ -22,5 +22,5 @@
- Force carriers
- γ, W, Z, g
![[boson-interactions-feynman.png]]
![[boson-interactions.png]]
![boson-interactions-feynman](../img/boson-interactions-feynman.png)
![boson-interactions](../img/boson-interactions.png)

View File

@ -5,17 +5,17 @@ Radial Function
Spherical Harmonic
- $Y_{ml}(\theta, \phi)$
Forms [[Orbitals]]
Forms [Orbitals](Orbitals.md)
Absolute value of wave function squared gives probability density of finding electron inside differential volume $dV$ centred on $r, \theta, \phi$
$$|\psi(r,\theta,\phi)|^2$$
![[wave-function-polar.png]]
![wave-function-polar](../img/wave-function-polar.png)
![[hydrogen-wave-function.png]]
![[wave-function-polar-segment.png]]
![[wave-function-nodes.png]]
![hydrogen-wave-function](../img/hydrogen-wave-function.png)
![wave-function-polar-segment](../img/wave-function-polar-segment.png)
![wave-function-nodes](../img/wave-function-nodes.png)
![[hydrogen-electron-density.png]]
![hydrogen-electron-density](../img/hydrogen-electron-density.png)
![[radius-electron-density-wf.png]]
![radius-electron-density-wf](../img/radius-electron-density-wf.png)