vault backup: 2023-05-31 22:21:56

Affected files:
.obsidian/global-search.json
.obsidian/workspace.json
Health/Alexithymia.md
Health/BWS.md
STEM/AI/Neural Networks/Activation Functions.md
STEM/AI/Neural Networks/Architectures.md
STEM/AI/Neural Networks/CNN/CNN.md
STEM/AI/Neural Networks/MLP/Back-Propagation.md
STEM/AI/Neural Networks/Transformers/Attention.md
STEM/CS/Calling Conventions.md
STEM/CS/Languages/Assembly.md
andy 2023-05-31 22:21:56 +01:00
parent bfdc107e5d
commit 4cc2e79866
7 changed files with 22 additions and 21 deletions

STEM/AI/Neural Networks/Activation Functions.md View File

@ -11,7 +11,7 @@
- Bipolar
- -1 <-> +1
![[threshold-activation.png]]
![threshold-activation](../../img/threshold-activation.png)
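
A minimal C sketch of the bipolar threshold above; thresholding at zero is an assumption, not from the note:

```c
#include <stdio.h>

/* Bipolar threshold activation: +1 for non-negative input, -1 otherwise (threshold at 0 assumed). */
double bipolar_threshold(double v) { return v >= 0.0 ? 1.0 : -1.0; }

int main(void) {
    printf("%.0f %.0f\n", bipolar_threshold(-0.3), bipolar_threshold(0.7)); /* -1 1 */
    return 0;
}
```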
# Sigmoid
- Logistic function
@ -26,7 +26,8 @@ $$\frac d {dx} \sigma(x)=
\right]
=\sigma(x)\cdot(1-\sigma(x))$$
![[sigmoid.png]]
![sigmoid](../../img/sigmoid.png)
### Derivative
$$y_j(n)=\varphi_j(v_j(n))=
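
The sigmoid and the derivative $\sigma(x)\cdot(1-\sigma(x))$ given above can be sanity-checked with a small C sketch (the sample inputs are arbitrary):

```c
#include <math.h>
#include <stdio.h>

/* Logistic sigmoid and its derivative, sigma'(x) = sigma(x) * (1 - sigma(x)). */
double sigmoid(double x)       { return 1.0 / (1.0 + exp(-x)); }
double sigmoid_prime(double x) { double s = sigmoid(x); return s * (1.0 - s); }

int main(void) {
    for (double x = -2.0; x <= 2.0; x += 1.0)
        printf("x = %+.1f  sigma = %.4f  sigma' = %.4f\n", x, sigmoid(x), sigmoid_prime(x));
    return 0;
}
```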
@ -58,7 +59,7 @@ Rectilinear
- Sometimes a small scalar is applied for negative inputs
	- Leaky ReLU
![[relu.png]]
![relu](../../img/relu.png)
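
A minimal C sketch of ReLU and Leaky ReLU as described above; the 0.01 negative slope is an assumed value:

```c
#include <stdio.h>

/* ReLU and Leaky ReLU; the 0.01 negative slope is an assumed value, not from the note. */
double relu(double x)       { return x > 0.0 ? x : 0.0; }
double leaky_relu(double x) { return x > 0.0 ? x : 0.01 * x; }

int main(void) {
    printf("relu(-2) = %.2f, leaky_relu(-2) = %.2f\n", relu(-2.0), leaky_relu(-2.0)); /* 0.00, -0.02 */
    return 0;
}
```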
# SoftMax
- Output is per-class vector of likelihoods
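
A small C sketch of softmax producing the per-class likelihood vector; the max-subtraction for numerical stability and the example logits are assumptions for illustration:

```c
#include <math.h>
#include <stdio.h>

/* Softmax over a vector of logits: subtract the max for numerical stability,
   exponentiate, then normalise so the per-class likelihoods sum to 1. */
void softmax(const double *logits, double *probs, int n) {
    double max = logits[0], sum = 0.0;
    for (int i = 1; i < n; i++) if (logits[i] > max) max = logits[i];
    for (int i = 0; i < n; i++) { probs[i] = exp(logits[i] - max); sum += probs[i]; }
    for (int i = 0; i < n; i++) probs[i] /= sum;
}

int main(void) {
    double logits[3] = {2.0, 1.0, 0.1};   /* example class scores (assumed values) */
    double probs[3];
    softmax(logits, probs, 3);
    printf("%.3f %.3f %.3f\n", probs[0], probs[1], probs[2]);
    return 0;
}
```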

STEM/AI/Neural Networks/Architectures.md View File

@ -2,7 +2,7 @@
- *Acyclic*
- Count output layer, no computation at input
![[feedforward.png]]
![feedforward](../../img/feedforward.png)
# Multilayer Feedforward
- Hidden layers
@ -12,12 +12,12 @@
- Fully connected
- Every neuron is connected to every neuron in adjacent layers
- Below is a 10-4-2 network
![[multilayerfeedforward.png]]
![multilayerfeedforward](../../img/multilayerfeedforward.png)
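
A rough C sketch of a forward pass through the fully connected 10-4-2 network described above; the placeholder weights and the sigmoid activation are assumptions:

```c
#include <math.h>
#include <stdio.h>

/* Forward pass through a fully connected 10-4-2 network: every neuron takes a
   weighted sum over every neuron in the previous layer. Weights/biases here are
   placeholder zeros; sigmoid as the activation is an assumption. */
#define N_IN  10
#define N_HID 4
#define N_OUT 2

static double sigmoid(double x) { return 1.0 / (1.0 + exp(-x)); }

static void dense(const double *in, int n_in, double *out, int n_out,
                  const double *w, const double *b) {
    for (int j = 0; j < n_out; j++) {
        double v = b[j];
        for (int i = 0; i < n_in; i++) v += w[j * n_in + i] * in[i];
        out[j] = sigmoid(v);
    }
}

int main(void) {
    double x[N_IN] = {0}, h[N_HID], y[N_OUT];
    double w1[N_HID * N_IN] = {0}, b1[N_HID] = {0};   /* 10 -> 4 */
    double w2[N_OUT * N_HID] = {0}, b2[N_OUT] = {0};  /* 4 -> 2  */
    x[0] = 1.0;                                       /* arbitrary input */
    dense(x, N_IN, h, N_HID, w1, b1);
    dense(h, N_HID, y, N_OUT, w2, b2);
    printf("y = [%.3f, %.3f]\n", y[0], y[1]);
    return 0;
}
```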
# Recurrent
- At least one feedback loop
- Below has no self-feedback
![[recurrent.png]]
![[recurrentwithhn.png]]
![recurrent](../../img/recurrent.png)
![recurrentwithhn](../../img/recurrentwithhn.png)
- Above has hidden neurons
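
A minimal C sketch of a recurrent unit with one feedback loop, as described above; the tanh activation and the weight values are assumptions:

```c
#include <math.h>
#include <stdio.h>

/* One recurrent unit with a single feedback loop: the hidden state at step n
   depends on the current input and the previous state. Weights are placeholder
   values; tanh as the activation is an assumption. */
int main(void) {
    double w_in = 0.5, w_rec = 0.8, b = 0.0;   /* assumed weights */
    double x[5] = {1.0, 0.0, 0.0, 0.0, 0.0};   /* input sequence */
    double h = 0.0;                            /* initial state */
    for (int n = 0; n < 5; n++) {
        h = tanh(w_in * x[n] + w_rec * h + b); /* feedback: h feeds back into itself */
        printf("h(%d) = %.4f\n", n, h);
    }
    return 0;
}
```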

STEM/AI/Neural Networks/CNN/CNN.md View File

@ -5,13 +5,13 @@
- Niche
- No-one cared/knew about CNNs
## After
- [[Datasets#ImageNet|ImageNet]]
- [ImageNet](../CV/Datasets.md#ImageNet)
- 16m images, 1000 classes
- GPUs
- General-purpose GPUs
- CUDA
- NIPS/ECCV 2012
- Double digit % gain on [[Datasets#ImageNet|ImageNet]] accuracy
- Double digit % gain on [ImageNet](../CV/Datasets.md#ImageNet) accuracy
# Fully Connected
[[MLP|Dense]]

STEM/AI/Neural Networks/MLP/Back-Propagation.md View File

@ -79,7 +79,7 @@ $$\Delta w_{ji}(n)=\eta\cdot\delta_j(n)\cdot y_i(n)$$
2. Error WRT output $y$
3. Output $y$ WRT Pre-activation function sum
4. Pre-activation function sum WRT weight
- Other [[Weight Init|weights]] constant, goes to zero
- Other [weights](../Weight%20Init.md) constant, goes to zero
- Leaves just $y_i$
- Collect 3 boxed terms as delta $j$
- Local gradient
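
The update rule $\Delta w_{ji}(n)=\eta\cdot\delta_j(n)\cdot y_i(n)$ above, with the boxed terms collected into the local gradient $\delta_j$, can be illustrated with a small C sketch for a single output neuron (sigmoid $\varphi$ and the numeric values are assumptions):

```c
#include <math.h>
#include <stdio.h>

/* Weight update for one output neuron j and one incoming signal y_i, following
   delta_w_ji = eta * delta_j * y_i with delta_j = e_j * phi'(v_j). Sigmoid phi
   and the numeric values are assumptions for illustration. */
static double sigmoid(double x) { return 1.0 / (1.0 + exp(-x)); }

int main(void) {
    double eta = 0.1;                       /* learning rate (assumed) */
    double y_i = 0.6;                       /* output of neuron i feeding into j */
    double v_j = 0.4;                       /* pre-activation sum at neuron j */
    double d_j = 1.0;                       /* desired output (assumed) */
    double y_j = sigmoid(v_j);
    double e_j = d_j - y_j;                 /* error at the output */
    double phi_prime = y_j * (1.0 - y_j);   /* output WRT pre-activation sum */
    double delta_j = e_j * phi_prime;       /* local gradient */
    double delta_w = eta * delta_j * y_i;   /* pre-activation sum WRT weight leaves y_i */
    printf("delta_j = %.4f, delta_w_ji = %.4f\n", delta_j, delta_w);
    return 0;
}
```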

STEM/AI/Neural Networks/Transformers/Attention.md View File

@ -10,16 +10,16 @@
- [LSTM](../RNN/LSTM.md) tends to preserve [knowledge](../Neural%20Networks.md#Knowledge) from far back poorly
- Attention layer accesses all previous states and weighs them according to a learned measure of relevance
- Allows referring arbitrarily far back to relevant tokens
- Can be added to [[RNN]]s
- In 2016, a new type of highly parallelisable _decomposable attention_ was successfully combined with a [[Architectures|feedforward]] network
- Attention is useful in and of itself, not just with [[RNN]]s
- [[Transformers]] use attention without recurrent connections
- Can be added to [RNNs](../RNN/RNN.md)
- In 2016, a new type of highly parallelisable _decomposable attention_ was successfully combined with a [feedforward](../Architectures.md) network
- Attention is useful in and of itself, not just with [RNNs](../RNN/RNN.md)
- [Transformers](Transformers.md) use attention without recurrent connections
- Process all tokens simultaneously
- Calculate attention weights in successive layers
# Scaled Dot-Product
- Calculate attention weights between all tokens at once
- Learn 3 [[Weight Init|weight]] matrices
- Learn 3 [weight](../Weight%20Init.md) matrices
- Query
- $W_Q$
- Key
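
A rough C sketch of the scaled dot-product attention weights, $\text{softmax}(QK^T/\sqrt{d_k})$; in the full mechanism $Q$ and $K$ come from the learned $W_Q$ and $W_K$ projections (with a value projection weighting the outputs), but here they are filled with placeholder values:

```c
#include <math.h>
#include <stdio.h>

#define T 3   /* number of tokens (assumed) */
#define D 4   /* query/key dimension d_k (assumed) */

/* Scaled dot-product attention weights between all tokens at once:
   A = softmax(Q K^T / sqrt(d_k)), one row of weights per query token. */
static void attention_weights(const double Q[T][D], const double K[T][D], double A[T][T]) {
    for (int i = 0; i < T; i++) {
        double max = -INFINITY, sum = 0.0;
        for (int j = 0; j < T; j++) {
            double score = 0.0;
            for (int d = 0; d < D; d++) score += Q[i][d] * K[j][d];
            A[i][j] = score / sqrt((double)D);
            if (A[i][j] > max) max = A[i][j];
        }
        for (int j = 0; j < T; j++) { A[i][j] = exp(A[i][j] - max); sum += A[i][j]; }
        for (int j = 0; j < T; j++) A[i][j] /= sum;   /* each row sums to 1 */
    }
}

int main(void) {
    double Q[T][D] = {{1, 0, 0, 0}, {0, 1, 0, 0}, {0, 0, 1, 0}};  /* placeholder projections */
    double K[T][D] = {{1, 0, 0, 0}, {0, 1, 0, 0}, {0, 0, 1, 0}};
    double A[T][T];
    attention_weights(Q, K, A);
    for (int i = 0; i < T; i++)
        printf("%.3f %.3f %.3f\n", A[i][0], A[i][1], A[i][2]);
    return 0;
}
```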

STEM/CS/Calling Conventions.md View File

@ -5,15 +5,15 @@
- Also known as: callee-saved registers or non-volatile registers
- How the task of preparing the stack for, and restoring after, a function call is divided between the caller and the callee
Subtle differences between [[compilers]] can make it difficult to interface code from different [[compilers]]
Subtle differences between [Compilers](Compilers.md) can make it difficult to interface code from different [compilers](Compilers.md)
Calling conventions, type representations, and name mangling are all part of what is known as an [application binary interface](https://en.wikipedia.org/wiki/Application_binary_interface) ([[ABI]])
Calling conventions, type representations, and name mangling are all part of what is known as an [application binary interface](https://en.wikipedia.org/wiki/Application_binary_interface) ([ABI](ABI.md))
# cdecl
C declaration
- Originally from Microsoft's C [[compilers|compiler]]
- Used by many C [[compilers]] for x86
- Originally from Microsoft's C [compiler](Compilers.md)
- Used by many C [compilers](Compilers.md) for x86
- Subroutine arguments passed on the stack
- Function arguments pushed right-to-left
- Last pushed first
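
A small C sketch of a cdecl-style call as described above; the commented x86 sequence is a conceptual illustration of the right-to-left push order and caller clean-up, not compiler output:

```c
#include <stdio.h>

/* Under cdecl, the caller pushes arguments right-to-left (so the last argument
   is pushed first) and cleans the stack up after the call. Conceptually, on
   32-bit x86 the call below becomes roughly:
       push 3 ; push 2 ; push 1 ; call sum3 ; add esp, 12          */
int sum3(int a, int b, int c) {
    return a + b + c;
}

int main(void) {
    printf("%d\n", sum3(1, 2, 3));   /* a = 1 is pushed last, nearest the return address */
    return 0;
}
```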

STEM/CS/Languages/Assembly.md View File

@ -1,11 +1,11 @@
[Uni of Virginia - x86 Assembly Guide](https://www.cs.virginia.edu/~evans/cs216/guides/x86.html)
## x86 32-bit
![[x86registers.png]]
![x86registers](../../img/x86registers.png)
## Stack
- push, pop, call, ret
![[stack.png]]
![stack](../../img/stack.png)
- Growing upwards