vault backup: 2023-05-27 22:17:56
Affected files: .obsidian/graph.json .obsidian/workspace-mobile.json .obsidian/workspace.json STEM/AI/Neural Networks/Activation Functions.md STEM/AI/Neural Networks/CNN/FCN/FlowNet.md STEM/AI/Neural Networks/CNN/FCN/ResNet.md STEM/AI/Neural Networks/CNN/FCN/Skip Connections.md STEM/AI/Neural Networks/CNN/GAN/DC-GAN.md STEM/AI/Neural Networks/CNN/GAN/GAN.md STEM/AI/Neural Networks/CNN/Interpretation.md STEM/AI/Neural Networks/Deep Learning.md STEM/AI/Neural Networks/MLP/Back-Propagation.md STEM/AI/Neural Networks/MLP/MLP.md STEM/AI/Neural Networks/Transformers/Attention.md STEM/CS/ABI.md STEM/CS/Calling Conventions.md STEM/CS/Code Types.md STEM/CS/Language Binding.md STEM/img/am-regulariser.png STEM/img/skip-connections.png
This commit is contained in: 33ac3007bc (parent acb7dc429e)
STEM/AI/Neural Networks/Activation Functions.md
@@ -38,7 +38,7 @@ y_j(n)(1-y_j(n))$$
 - Nice derivative
 - Max value of $\varphi_j'(v_j(n))$ occurs when $y_j(n)=0.5$
 - Min value of 0 when $y_j=0$ or $1$
-- Initial weights chosen so not saturated at 0 or 1
+- Initial [[Weight Init|weights]] chosen so not saturated at 0 or 1

 If $y=\frac u v$
 Where $u$ and $v$ are differentiable functions
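The note stops at the setup; the quotient rule itself (presumably the intended next line, not part of this commit) is

$$\frac{dy}{dx}=\frac{v\frac{du}{dx}-u\frac{dv}{dx}}{v^2}$$

Applying it to the sigmoid $y=\frac{1}{1+e^{-v}}$ recovers the $\varphi_j'(v_j(n))=y_j(n)(1-y_j(n))$ form quoted in the hunk context above.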
STEM/AI/Neural Networks/CNN/FCN/FlowNet.md
@@ -3,7 +3,7 @@ Optical Flow
 - 2-Channel optical flow
 	- $dx,dy$
 - Two consecutive frames
-	- 6-channel tensor
+	- 6-channel [[tensor]]

 ![[flownet.png]]
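A minimal sketch of that input packing (illustrative, not from the vault): FlowNetSimple-style, two RGB frames stacked channel-wise.

```python
import numpy as np

# Two consecutive RGB frames, channels last (sizes illustrative)
frame_t  = np.random.rand(384, 512, 3)
frame_t1 = np.random.rand(384, 512, 3)

# Stack along channels: the network input is a 6-channel tensor
stacked = np.concatenate([frame_t, frame_t1], axis=-1)   # (384, 512, 6)

# The target optical flow is 2-channel: (dx, dy) per pixel
flow = np.zeros((384, 512, 2))
```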
AI/Neural Networks/CNN/FCN/ResNet.md (new file, 25 lines)
@@ -0,0 +1,25 @@
+- Residual networks
+- 152 layers
+- Skips every two layers
+	- Residual block
+- Later layers learning the identity function
+	- Skips help
+	- Deep network should be at least as good as a shallower one, by allowing some layers to do very little
+- Vanishing gradient
+	- Allows shortcut paths for gradients
+- Accuracy saturation
+	- Adding more layers to a suitably deep network increases training error
+
+# Design
+
+- Skips across pairs of conv layers
+	- Elementwise addition
+- All layers 3x3 kernels
+- Spatial size halves each layer
+- Filters double each layer
+- Fully convolutional
+	- No FC layers
+	- No pooling
+		- Except at end
+	- No dropout
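A minimal Keras-style sketch of the residual block described above (my illustration, assuming TensorFlow; not the note's code): two 3x3 convs whose output is added elementwise to the identity shortcut.

```python
from tensorflow.keras import layers

def residual_block(x, filters):
    # Two 3x3 conv layers; the skip jumps across the pair.
    # Assumes x already has `filters` channels so the Add lines up.
    shortcut = x
    y = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    # Elementwise addition with the identity shortcut; if the convs learn
    # ~zero, the block passes the identity through unchanged
    y = layers.Add()([shortcut, y])
    return layers.Activation("relu")(y)
```

The `Add` is also what gives gradients a shortcut path past the pair of conv layers.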
AI/Neural Networks/CNN/FCN/Skip Connections.md (new file, 16 lines)
@@ -0,0 +1,16 @@
+- Outputs of conv ($c$) layers are added to inputs of upconv ($d$) layers
+	- Element-wise, not channel appending
+	- Propagates high-frequency information to later layers
+- Two types
+	- Additive
+		- ResNet
+		- Super-resolution auto-encoder
+	- Concatenative
+		- Densely connected architectures
+			- DenseNet
+		- FlowNet
+
+![[skip-connections.png]]
+
+[AI Summer - Skip Connections](https://theaisummer.com/skip-connections/)
+[Arxiv - Visualising the Loss Landscape](https://arxiv.org/abs/1712.09913)
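The two types side by side as a NumPy sketch (shapes illustrative, not from the vault):

```python
import numpy as np

conv_out  = np.random.rand(32, 32, 64)   # output of a conv (c) layer
upconv_in = np.random.rand(32, 32, 64)   # input to an upconv (d) layer

# Additive (ResNet-style): shapes must match, channels stay at 64
additive = conv_out + upconv_in                                  # (32, 32, 64)

# Concatenative (DenseNet/FlowNet-style): channels append, 64 -> 128
concatenative = np.concatenate([conv_out, upconv_in], axis=-1)   # (32, 32, 128)
```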
STEM/AI/Neural Networks/CNN/GAN/DC-GAN.md
@@ -7,7 +7,7 @@ Deep Convolutional [[GAN]]
 - Generate image from code
 	- Low-dimensional
 		- ~100-D
-	- Reshape to tensor
+	- Reshape to [[tensor]]
 	- [[Upconv]] to image
 - Train using Gaussian random noise for code
 - Discriminator
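A sketch of that generator pipeline in Keras (my own minimal version, assuming TensorFlow and a 28x28 single-channel output; not the note's architecture):

```python
import numpy as np
from tensorflow.keras import layers, models

generator = models.Sequential([
    layers.Input(shape=(100,)),          # ~100-D low-dimensional code
    layers.Dense(7 * 7 * 128),
    layers.Reshape((7, 7, 128)),         # reshape code to a tensor
    # Upconv (transposed conv) steps: 7x7 -> 14x14 -> 28x28 image
    layers.Conv2DTranspose(64, 4, strides=2, padding="same", activation="relu"),
    layers.Conv2DTranspose(1, 4, strides=2, padding="same", activation="tanh"),
])

# Train-time input: Gaussian random noise as the code
code = np.random.normal(size=(1, 100))
fake = generator.predict(code)           # (1, 28, 28, 1)
```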
STEM/AI/Neural Networks/CNN/GAN/GAN.md
@@ -27,5 +27,5 @@

 # Code Vector Math for Control
 ![[cvmfc.png]]
-- Do AM to derive code for an image
+- Do [[Interpretation#Activation Maximisation|AM]] to derive code for an image
 ![[code-vector-math-for-control-results.png]]
STEM/AI/Neural Networks/CNN/Interpretation.md
@@ -17,4 +17,17 @@
 - Prone to high frequency noise
 - Minimise
 	- Total variation
 - $x^*$ is the best solution to minimise [[Deep Learning#Loss Function|loss]]
+
+$$x^*=\text{argmin}_{x\in\mathbb R^{H\times W\times C}}\ell(\phi(x),\phi_0)$$
+
+- Won't work
+
+$$x^*=\text{argmin}_{x\in\mathbb R^{H\times W\times C}}\ell(\phi(x),\phi_0)+\lambda\mathcal R(x)$$
+
+- Need a regulariser, as above
+
+![[am-regulariser.png]]
+
+$$\mathcal R_{V^\beta}(f)=\int_\Omega\left(\left(\frac{\partial f}{\partial u}(u,v)\right)^2+\left(\frac{\partial f}{\partial v}(u,v)\right)^2\right)^{\frac\beta 2}\,du\,dv$$
+
+$$\mathcal R_{V^\beta}(x)=\sum_{i,j}\left(\left(x_{i,j+1}-x_{ij}\right)^2+\left(x_{i+1,j}-x_{ij}\right)^2\right)^{\frac\beta 2}$$
+
+- Beta
+	- Degree of smoothing
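The discrete form translates directly to NumPy; a minimal sketch for a single-channel image (my illustration, not the note's code):

```python
import numpy as np

def tv_regulariser(x: np.ndarray, beta: float = 2.0) -> float:
    # x_{i,j+1} - x_{ij} and x_{i+1,j} - x_{ij}, trimmed to the common
    # interior so the two squared-difference grids align
    dx = x[:, 1:] - x[:, :-1]
    dy = x[1:, :] - x[:-1, :]
    return float(((dx[:-1, :] ** 2 + dy[:, :-1] ** 2) ** (beta / 2)).sum())

# Larger beta -> heavier smoothing pressure in the AM objective
x = np.random.rand(64, 64)
print(tv_regulariser(x, beta=2.0))
```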
STEM/AI/Neural Networks/Deep Learning.md
@@ -32,16 +32,16 @@ Predict
 Evaluate

 # Data Structure
-- Tensor flow = channels last
+- [[Tensor]] flow = channels last
 	- (samples, height, width, channels)
 - Vector data
-	- 2D tensors of shape (samples, features)
+	- 2D [[tensor]]s of shape (samples, features)
 - Time series data or sequence data
-	- 3D tensors of shape (samples, timesteps, features)
+	- 3D [[tensor]]s of shape (samples, timesteps, features)
 - Images
-	- 4D tensors of shape (samples, height, width, channels) or (samples, channels, height, width)
+	- 4D [[tensor]]s of shape (samples, height, width, channels) or (samples, channels, height, width)
 - Video
-	- 5D tensors of shape (samples, frames, height, width, channels) or (samples, frames, channels, height, width)
+	- 5D [[tensor]]s of shape (samples, frames, height, width, channels) or (samples, frames, channels, height, width)

 ![[photo-tensor.png]]
 ![[matrix-dot-product.png]]
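The same layouts as dummy arrays (sizes illustrative, channels-last throughout):

```python
import numpy as np

vectors    = np.zeros((128, 20))               # (samples, features)
timeseries = np.zeros((128, 50, 20))           # (samples, timesteps, features)
images     = np.zeros((128, 224, 224, 3))      # (samples, height, width, channels)
video      = np.zeros((8, 16, 112, 112, 3))    # (samples, frames, height, width, channels)
```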
STEM/AI/Neural Networks/MLP/Back-Propagation.md
@@ -79,7 +79,7 @@ $$\Delta w_{ji}(n)=\eta\cdot\delta_j(n)\cdot y_i(n)$$
 2. Error WRT output $y$
 3. Output $y$ WRT pre-activation function sum
 4. Pre-activation function sum WRT weight
-	- Other weights constant, goes to zero
+	- Other [[Weight Init|weights]] constant, goes to zero
 	- Leaves just $y_i$
 - Collect 3 boxed terms as $\delta_j$
 	- Local gradient
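A worked sketch of the update in the hunk header, $\Delta w_{ji}(n)=\eta\cdot\delta_j(n)\cdot y_i(n)$, for a single sigmoid output unit (illustrative numbers, not from the note):

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

eta  = 0.1                            # learning rate
y_i  = np.array([0.2, 0.7, 0.5])      # inputs to unit j
w_ji = np.array([0.1, -0.3, 0.8])     # weights into unit j
d_j  = 1.0                            # target output

y_j = sigmoid(w_ji @ y_i)
e_j = d_j - y_j                       # error WRT output
delta_j = e_j * y_j * (1.0 - y_j)     # local gradient: phi'(v) = y(1 - y)
w_ji += eta * delta_j * y_i           # delta w_ji = eta * delta_j * y_i
```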
STEM/AI/Neural Networks/MLP/MLP.md
@@ -2,7 +2,7 @@
 - Single hidden layer can learn any function
 	- Universal approximation theorem
 - Each hidden layer can operate as a different feature extraction layer
-- Lots of weights to learn
+- Lots of [[Weight Init|weights]] to learn
 - [[Back-Propagation]] is supervised

 ![[mlp-arch.png]]
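A one-hidden-layer forward pass as a sketch (sizes illustrative) — the weight-matrix shapes make the "lots of weights" point concrete:

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

x  = np.random.rand(4)         # 4 input features
W1 = np.random.randn(8, 4)     # hidden layer: 8 units, 32 weights
W2 = np.random.randn(1, 8)     # output layer: 1 unit, 8 weights

h = sigmoid(W1 @ x)            # hidden layer as feature extraction
y = sigmoid(W2 @ h)            # network output
```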
STEM/AI/Neural Networks/Transformers/Attention.md
@@ -19,7 +19,7 @@

 # Scaled Dot-Product
 - Calculate attention weights between all tokens at once
-- Learn 3 weight matrices
+- Learn 3 [[Weight Init|weight]] matrices
 	- Query
 		- $W_Q$
 	- Key
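A NumPy sketch of scaled dot-product attention with the three learned matrices (sizes illustrative; $W_V$ is the value matrix the hunk cuts off before, assumed here):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

T, d = 5, 16                          # tokens, model width
X = np.random.randn(T, d)             # token embeddings
W_Q, W_K, W_V = (np.random.randn(d, d) for _ in range(3))

Q, K, V = X @ W_Q, X @ W_K, X @ W_V
A = softmax(Q @ K.T / np.sqrt(d))     # weights between all token pairs at once
out = A @ V                           # (T, d)
```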
STEM/CS/ABI.md
@@ -31,5 +31,5 @@
 # Embedded ABI
 - File format, data types, register usage, stack frame organisation, function parameter passing conventions
 - For embedded OS
-- Compilers create object code compatible with code from other compilers
+- [[Compilers]] create object code compatible with code from other [[compilers]]
-- Link libraries from different compilers
+- Link libraries from different [[compilers]]
STEM/CS/Calling Conventions.md
@@ -5,15 +5,15 @@
 - Also known as: callee-saved registers or non-volatile registers
 - How the task of preparing the stack for, and restoring after, a function call is divided between the caller and the callee

-Subtle differences between compilers can make it difficult to interface code from different compilers
+Subtle differences between [[compilers]] can make it difficult to interface code from different [[compilers]]

 Calling conventions, type representations, and name mangling are all part of what is known as an [application binary interface](https://en.wikipedia.org/wiki/Application_binary_interface) ([[ABI]])

 # cdecl
 C declaration

-- Originally from Microsoft's C compiler
+- Originally from Microsoft's C [[compilers|compiler]]
-- Used by many C compilers for x86
+- Used by many C [[compilers]] for x86
 - Subroutine arguments passed on the stack
 - Function arguments pushed right-to-left
 	- Last pushed first
STEM/CS/Code Types.md
@@ -1,16 +1,16 @@
 ## Machine Code
 - Machine language instructions
-	- Directly control CPU
+	- Directly control [[Processors|CPU]]
 - Strictly numerical
 - Lowest-level representation of a compiled or assembled program
 - Lowest-level visible to programmer
 	- Internally, microcode might be used
 - Hardware dependent
 - Higher-level languages translated to machine code
-	- Compilers, assemblers and linkers
+	- [[Compilers]], assemblers and linkers
 	- Not for interpreted code
 		- Interpreter runs machine code
-- Assembly is effectively human-readable machine code
+- [[Assembly]] is effectively human-readable machine code
 	- Has mnemonics for opcodes etc

 ## Microcode
STEM/CS/Language Binding.md
@@ -24,5 +24,5 @@
 - Adobe Flash Player
 	- Tamarin
 - JVM
-- LLVM
+- [[Compilers#LLVM|LLVM]]
 - Silverlight
BIN img/am-regulariser.png (new file, 352 KiB, binary not shown)
BIN img/skip-connections.png (new file, 51 KiB, binary not shown)