diff --git a/AI/Classification/Classification.md b/AI/Classification/Classification.md
index 05c7b8f..85a2600 100644
--- a/AI/Classification/Classification.md
+++ b/AI/Classification/Classification.md
@@ -9,7 +9,7 @@
 Argument that gives the maximum value from a target function
 
 # Gaussian Classifier
-[Training](Supervised.md)
+[Training](Supervised/Supervised.md)
 - Each class $i$ has its own Gaussian $N_i=N(m_i,v_i)$
 
 $$\hat i=\text{argmax}_i\left(p(o_t|N_i)\cdot P(N_i)\right)$$
diff --git a/AI/Classification/Decision Trees.md b/AI/Classification/Decision Trees.md
new file mode 100644
index 0000000..58439fc
--- /dev/null
+++ b/AI/Classification/Decision Trees.md
@@ -0,0 +1,4 @@
+- Flowchart-like design
+- Iterative decision making
+
+![](../../img/decision-tree.png)
\ No newline at end of file
diff --git a/AI/Classification/Gradient Boosting Machine.md b/AI/Classification/Gradient Boosting Machine.md
new file mode 100644
index 0000000..ba03de5
--- /dev/null
+++ b/AI/Classification/Gradient Boosting Machine.md
@@ -0,0 +1,7 @@
+- Higher-level take on ensemble learning
+- Iteratively trains more models to address the weak points of previous ones
+- Pairs well with decision trees
+    - Outperforms random forest most of the time
+    - Similar properties
+- One of the best algorithms for dealing with non-perceptual data
+- XGBoost
\ No newline at end of file
diff --git a/AI/Classification/Logistic Regression.md b/AI/Classification/Logistic Regression.md
new file mode 100644
index 0000000..e3a0323
--- /dev/null
+++ b/AI/Classification/Logistic Regression.md
@@ -0,0 +1,16 @@
+The “hello world” of classifiers
+Related to naïve Bayes
+
+- Statistical model
+- Uses the ***logistic function*** to model a ***categorical*** dependent variable
+
+# Types
+- Binary
+    - 2 classes
+- Multinomial
+    - Multiple classes without ordering
+    - Categories
+- Ordinal
+    - Multiple ordered classes
+    - e.g. star ratings
+
diff --git a/AI/Classification/Random Forest.md b/AI/Classification/Random Forest.md
new file mode 100644
index 0000000..f60077f
--- /dev/null
+++ b/AI/Classification/Random Forest.md
@@ -0,0 +1 @@
+“Almost always the second best algorithm for any shallow ML task”
\ No newline at end of file
diff --git a/AI/Classification/Supervised.md b/AI/Classification/Supervised.md
deleted file mode 100644
index 4deb7d3..0000000
--- a/AI/Classification/Supervised.md
+++ /dev/null
@@ -1,5 +0,0 @@
-
-# Gaussian Classifier
-- With $T$ labelled data
-
-$$q_t(i)=$$
\ No newline at end of file
diff --git a/AI/Classification/Supervised/README.md b/AI/Classification/Supervised/README.md
new file mode 120000
index 0000000..20b243f
--- /dev/null
+++ b/AI/Classification/Supervised/README.md
@@ -0,0 +1 @@
+Supervised.md
\ No newline at end of file
diff --git a/AI/Classification/Supervised/SVM.md b/AI/Classification/Supervised/SVM.md
new file mode 100644
index 0000000..8efccda
--- /dev/null
+++ b/AI/Classification/Supervised/SVM.md
@@ -0,0 +1,74 @@
+[Towards Data Science: SVM](https://towardsdatascience.com/support-vector-machines-svm-c9ef22815589)
+[Towards Data Science: SVM an overview](https://towardsdatascience.com/https-medium-com-pupalerushikesh-svm-f4b42800e989)
+
+- Dividing line between two classes
+    - Optimal hyperplane for a space
+    - Margin-maximising hyperplane
+- Can be used for
+    - Classification
+        - SVC
+    - Regression
+        - SVR
+- Alternative to Eigenmodels for supervised classification
+- For smaller datasets
+    - Hard to scale to larger sets
+
+![](../../../img/svm.png)
+- Support vector points
+    - Closest points to the hyperplane
+    - Lines to hyperplane are support vectors
+
+- Maximise margin between classes
+- Take dot product of the test point with the vector perpendicular to the hyperplane
+- Sign determines class (sketched below)
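+
+A minimal NumPy sketch of this decision rule for the linear case; the weight vector `w` (the hyperplane normal), bias `b`, and test point are hypothetical values, not a fitted model:
+
+```python
+import numpy as np
+
+# Hypothetical learned parameters: w is perpendicular to the hyperplane
+w = np.array([0.4, -1.2])
+b = 0.5
+
+def classify(x):
+    """Sign of w . x + b determines which side of the hyperplane x falls on."""
+    return 1 if np.dot(w, x) + b >= 0 else -1
+
+print(classify(np.array([2.0, 1.0])))  # -> 1 for these values
+```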
+
+# Pros
+- Linear or non-linear discrimination
+- Effective in higher dimensions
+- Effective when the number of features is higher than the number of training examples
+- Best when classes are separable
+- Outliers have less impact
+
+# Cons
+- Long training time on larger datasets
+- Doesn't do well when classes overlap
+- Selecting an appropriate kernel can be hard
+
+# Parameters
+- C
+    - How smooth the decision boundary is
+    - Larger C makes the boundary more curvy
+    - ![](../../../img/svm-c.png)
+- Gamma
+    - Controls the area of influence of each data point
+    - High gamma reduces the influence of faraway points
+
+# Hyperplane
+
+$$\beta_0+\beta_1X_1+\beta_2X_2+\cdots+\beta_pX_p=0$$
+- $p$-dimensional space
+- If $X$ satisfies the equation
+    - On the plane
+- Maximal margin hyperplane
+- Perpendicular distance from each observation to a given plane
+    - Best plane maximises the smallest such distance
+- If the support vector points shift
+    - Plane shifts
+    - Hyperplane only depends on the support vectors
+        - The rest don't matter
+
+![](../../../img/svm-optimal-plane.png)
+
+# Linearly Separable
+- Not linearly separable
+![](../../../img/svm-non-linear.png)
+- Add another dimension
+    - $z=x^2+y^2$
+    - Square of the distance of the point from the origin
+![](../../../img/svm-non-linear-project.png)
+- Now separable
+- Let $z=k$
+    - $k$ is a constant
+- Project the linear separator back to 2D
+    - Get a circle
+![](../../../img/svm-non-linear-separated.png)
\ No newline at end of file
diff --git a/AI/Classification/Supervised/Supervised.md b/AI/Classification/Supervised/Supervised.md
new file mode 100644
index 0000000..5006d29
--- /dev/null
+++ b/AI/Classification/Supervised/Supervised.md
@@ -0,0 +1,23 @@
+
+# Gaussian Classifier
+- With $T$ labelled data
+$$q_t(i)=
+\begin{cases}
+    1 & \text{if class } i \\
+    0 & \text{otherwise}
+\end{cases}$$
+- Indicator function
+
+- Mean parameter
+$$\hat m_i=\frac{\sum_tq_t(i)o_t}{\sum_tq_t(i)}$$
+- Variance parameter
+$$\hat v_i=\frac{\sum_tq_t(i)(o_t-\hat m_i)^2}{\sum_tq_t(i)}$$
+
+- Distribution weight
+    - Class prior
+    - $P(N_i)$
+$$\hat c_i=\frac 1 T \sum_tq_t(i)$$
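+
+A minimal NumPy sketch of these estimators for 1-D observations; `obs` ($o_t$) and `labels` are hypothetical data:
+
+```python
+import numpy as np
+
+obs = np.array([1.0, 1.2, 3.1, 2.9, 1.1, 3.3])   # observations o_t
+labels = np.array([0, 0, 1, 1, 0, 1])             # class of each observation
+
+def fit(obs, labels, n_classes):
+    params = []
+    for i in range(n_classes):
+        q = (labels == i)          # indicator q_t(i)
+        m = obs[q].mean()          # mean parameter m_i
+        v = obs[q].var()           # variance parameter v_i
+        c = q.mean()               # class prior c_i = (1/T) * sum_t q_t(i)
+        params.append((m, v, c))
+    return params
+
+def classify(o, params):
+    # argmax_i p(o | N_i) * P(N_i), with p a Gaussian density
+    def score(m, v, c):
+        return c * np.exp(-(o - m) ** 2 / (2 * v)) / np.sqrt(2 * np.pi * v)
+    return max(range(len(params)), key=lambda i: score(*params[i]))
+
+print(classify(3.0, fit(obs, labels, 2)))  # -> 1
+```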
+
+$$\hat \mu_i=\frac{\sum_{t=1}^Tq_t(i)o_t}{\sum_{t=1}^Tq_t(i)}$$
+$$\hat\Sigma_i=\frac{\sum_{t=1}^Tq_t(i)(o_t-\hat\mu_i)(o_t-\hat\mu_i)^T}{\sum_{t=1}^Tq_t(i)}$$
+- For $K$-dimensional data: mean vector and covariance matrix
\ No newline at end of file
diff --git a/AI/Learning.md b/AI/Learning.md
new file mode 100644
index 0000000..1c90f36
--- /dev/null
+++ b/AI/Learning.md
@@ -0,0 +1,63 @@
+# Supervised
+- Dataset with inputs manually annotated for the desired output
+    - Desired output = supervisory signal
+    - Manually annotated = ground truth
+    - Annotated correct categories
+
+## Split Data
+- Training set
+- Test set
+***Don't test on training data***
+
+## Top-K Accuracy
+- Whether the correct answer appears in the top-k results
+
+## Confusion Matrix
+Samples described by a ***feature vector***
+Dataset forms a matrix
+![](../img/confusion-matrix.png)
+
+# Unsupervised
+- No example outputs given; learns how to categorise
+- No teacher or critic
+
+## Harder
+- Must identify relevant distinguishing features
+- Must decide on the number of categories
+
+# Reinforcement Learning
+- No teacher, critic instead
+- Continued interaction with the environment
+- Minimise a scalar performance index
+
+![](../img/reinforcement-learning.png)
+
+- Critic
+    - Converts primary reinforcement to heuristic reinforcement
+    - Both scalar inputs
+- Delayed reinforcement
+    - System observes a temporal sequence of stimuli
+    - Results in the generation of the heuristic reinforcement signal
+- Minimise cost-to-go function
+    - Expectation of the cumulative cost of actions taken over a sequence of steps
+        - Instead of just the immediate cost
+    - Earlier actions may have been good
+        - Identify and feed back to the environment
+- Closely related to dynamic programming
+
+## Difficulties
+- No teacher to provide a desired response
+- Must solve the temporal credit assignment problem
+    - Need to know which actions were the good ones
+
+# Fitting
+- Over-fitting
+    - Classifier too specific to the training set
+    - Can't adequately generalise
+- Under-fitting
+    - Too general, hasn't inferred enough detail
+    - Learns non-discriminative or non-desired patterns
+
+# ROC
+Receiver Operating Characteristic curve
+![](../img/receiver-operator-curve.png)
\ No newline at end of file
diff --git a/AI/Neural Networks/Learning/Boltzmann.md b/AI/Neural Networks/Learning/Boltzmann.md
new file mode 100644
index 0000000..e904cb7
--- /dev/null
+++ b/AI/Neural Networks/Learning/Boltzmann.md
@@ -0,0 +1,30 @@
+- Stochastic
+- Recurrent structure
+- Binary operation (+/- 1)
+- Energy function
+
+$$E=-\frac 1 2 \sum_j\sum_k w_{kj}x_kx_j$$
+- $j\neq k$
+    - No self-feedback
+- $x$ = neuron state
+- Neurons randomly flip from $x$ to $-x$
+
+$$P(x_k \rightarrow-x_k)=\frac 1 {1+e^{\frac{-\Delta E_k}{T}}}$$
+
+- Flip probability based on the pseudo-temperature $T$
+    - System will reach thermal equilibrium
+- $\Delta E_k$ is the energy change resulting from the flip
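+
+A minimal sketch of this flip rule, following the flip probability above and assuming a small symmetric weight matrix `W` with zero diagonal (hypothetical values):
+
+```python
+import numpy as np
+
+rng = np.random.default_rng(0)
+W = np.array([[0.0, 0.5, -0.3],
+              [0.5, 0.0, 0.8],
+              [-0.3, 0.8, 0.0]])   # w_kj, symmetric, no self-feedback
+x = np.array([1, -1, 1])           # binary neuron states
+T = 1.0                            # pseudo-temperature
+
+def step(x, k):
+    # Energy change from flipping neuron k: delta E_k = 2 * x_k * sum_j w_kj * x_j
+    dE = 2 * x[k] * (W[k] @ x)
+    if rng.random() < 1 / (1 + np.exp(-dE / T)):   # P(x_k -> -x_k)
+        x[k] = -x[k]
+    return x
+
+for _ in range(100):
+    x = step(x, rng.integers(len(x)))   # flip randomly chosen neurons
+print(x)
+```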
+
+- Visible and hidden neurons
+    - Visible act as the interface between the network and the environment
+    - Hidden always operate freely
+
+# Operation Modes
+- Clamped
+    - Visible neurons are clamped onto specific states determined by the environment
+- Free-running
+    - All neurons able to operate freely
+- $\rho_{kj}^+$ = correlation between states while clamped
+- $\rho_{kj}^-$ = correlation between states while free
+- Both lie between +/- 1
+
+$$\Delta w_{kj}=\eta(\rho_{kj}^+-\rho_{kj}^-), \space j\neq k$$
\ No newline at end of file
diff --git a/AI/Neural Networks/Learning/Competitive Learning.md b/AI/Neural Networks/Learning/Competitive Learning.md
new file mode 100644
index 0000000..ff4a93c
--- /dev/null
+++ b/AI/Neural Networks/Learning/Competitive Learning.md
@@ -0,0 +1,40 @@
+- Only a single output neuron fires
+
+1. Set of homogeneous neurons with some randomly distributed synaptic weights
+    - Respond differently to a given set of input patterns
+2. Limit imposed on the strength of each neuron
+3. Mechanism to allow neurons to compete for the right to respond to a given subset of inputs
+    - Only one output neuron active at a time
+        - Or only one neuron per group
+    - ***Winner-takes-all neuron***
+
+![](../../../img/comp-learning.png)
+
+- Lateral inhibition
+    - Neurons inhibit other neurons
+- Winning neuron must have the highest induced local field for the given input pattern
+    - Winning neuron's output is set to 1
+    - Others are clamped to 0
+
+$$y_k=
+\begin{cases}
+    1 & \text{if } v_k > v_j \text{ for all } j,j\neq k \\
+    0 & \text{otherwise}
+\end{cases}
+$$
+
+- Each neuron has a fixed amount of weight spread amongst its input synapses
+    - Sums to 1
+- Learns by shifting weight from inactive to active input nodes
+    - Each input node relinquishes some proportion of its weight
+    - Distributed amongst the active nodes
+
+$$\Delta w_{kj}=
+\begin{cases}
+    \eta(x_j-w_{kj}) & \text{if neuron $k$ wins the competition}\\
+    0 & \text{if neuron $k$ loses the competition}
+\end{cases}$$
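+
+A minimal sketch of one winner-takes-all update; the weights and input pattern are hypothetical, with each neuron's weights (and the input) summing to 1 so the update preserves the fixed weight budget:
+
+```python
+import numpy as np
+
+rng = np.random.default_rng(1)
+W = rng.random((3, 4))
+W /= W.sum(axis=1, keepdims=True)    # each neuron's weights sum to 1
+eta = 0.1
+
+def update(W, x):
+    v = W @ x                        # induced local fields
+    k = np.argmax(v)                 # winner has the highest local field
+    W[k] += eta * (x - W[k])         # shift weight towards the active inputs
+    return W
+
+x = np.array([0.5, 0.0, 0.5, 0.0])   # input pattern, sums to 1
+W = update(W, x)
+print(W.sum(axis=1))                 # each row still sums to 1
+```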
+
+- Individual neurons learn to specialise on ensembles of similar patterns
+    - Feature detectors
+![](../../../img/competitive-geometric.png)
\ No newline at end of file
diff --git a/AI/Neural Networks/Learning/Credit-Assignment Problem.md b/AI/Neural Networks/Learning/Credit-Assignment Problem.md
new file mode 100644
index 0000000..4b9d3f2
--- /dev/null
+++ b/AI/Neural Networks/Learning/Credit-Assignment Problem.md
@@ -0,0 +1,17 @@
+- Assigning credit/blame for outcomes to each internal decision
+- Loading problem
+    - Loading a training set into the free parameters
+- Important to any learning machine attempting to improve performance in situations involving temporally extended behaviour
+
+Two sub-problems:
+- ***Temporal*** credit-assignment problem
+    - Assigning credit for **outcomes** to **actions**
+    - Involves the times at which the actions deserving credit were taken
+    - Relevant when many actions are taken and we want to know which ones were responsible
+- ***Structural*** credit-assignment problem
+    - Assigning credit for **actions** to **internal decisions**
+    - Involves the internal structure of the actions generated by the system
+    - Relevant for identifying which component should have its behaviour altered
+        - And by how much
+
+- Important in MLPs when there are many hidden neurons
\ No newline at end of file
diff --git a/AI/Neural Networks/Learning/Hebbian.md b/AI/Neural Networks/Learning/Hebbian.md
new file mode 100644
index 0000000..73174d6
--- /dev/null
+++ b/AI/Neural Networks/Learning/Hebbian.md
@@ -0,0 +1,55 @@
+*Time-dependent, highly local, strongly interactive*
+
+- Oldest learning algorithm
+- Increases synaptic efficiency as a function of the correlation between presynaptic and postsynaptic activities
+
+1. If two neurons on either side of a synapse are activated simultaneously/synchronously, then the strength of that synapse is selectively increased
+2. If two neurons on either side of a synapse are activated asynchronously, then that synapse is selectively weakened or eliminated
+
+- Hebbian synapse
+    - Time-dependent
+        - Depends on the times of the pre/post-synaptic signals
+    - Local
+    - Interactive
+        - Depends on both sides of the synapse
+        - True interaction between pre/post-synaptic signals
+            - Cannot make a prediction from either one by itself
+    - Conjunctional or correlational
+        - Based on the conjunction of pre/post-synaptic signals
+        - Conjunctional synapse
+- Modification classifications
+    - Hebbian
+        - **Increases** strength with **positively** correlated pre/post-synaptic signals
+        - **Decreases** strength with **negatively** correlated pre/post-synaptic signals
+    - Anti-Hebbian
+        - **Decreases** strength with **positively** correlated pre/post-synaptic signals
+        - **Increases** strength with **negatively** correlated pre/post-synaptic signals
+        - Still Hebbian in nature, just not in function
+    - Non-Hebbian
+        - Doesn't involve the above correlations or time dependence
+
+# Mathematically
+$$\Delta w_{kj}(n)=F\left(y_k(n),x_j(n)\right)$$
+- General form
+    - Covers all Hebbian rules
+
+![](../../../img/hebb-learning.png)
+
+## Hebb's Hypothesis
+$$\Delta w_{kj}(n)=\eta y_k(n)x_j(n)$$
+- Activity product rule
+- Exponential growth until saturation
+    - No information stored
+    - Selectivity lost
+
+## Covariance Hypothesis
+$$\Delta w_{kj}(n)=\eta(x_j-\bar x)(y_k-\bar y)$$
+- Characterised by the perturbation of the pre/post-synaptic signals from their means over a given time interval
+- The averages $\bar x$ and $\bar y$ act as thresholds
+- Intercept at $y=\bar y$
+- Similar to learning in the hippocampus
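+
+A minimal sketch contrasting the two rules for a single synapse, with hypothetical (negatively correlated) activity traces:
+
+```python
+import numpy as np
+
+x = np.array([0.9, 0.1, 0.8, 0.2])   # presynaptic activity x_j(n)
+y = np.array([0.2, 0.8, 0.1, 0.9])   # postsynaptic activity y_k(n)
+eta = 0.01
+
+dw_hebb = eta * y * x                            # activity product rule
+dw_cov = eta * (x - x.mean()) * (y - y.mean())   # means act as thresholds
+
+# The product rule only ever strengthens this synapse; the covariance rule
+# predicts depression here because x and y are negatively correlated
+print(dw_hebb.sum(), dw_cov.sum())   # positive, negative
+```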
+
+*Allows:*
+1. Convergence to a non-trivial state
+    - When $x=\bar x$ or $y=\bar y$
+2. Prediction of both synaptic potentiation and synaptic depression
\ No newline at end of file
diff --git a/AI/Neural Networks/Learning/Learning.md b/AI/Neural Networks/Learning/Learning.md
new file mode 100644
index 0000000..aedcded
--- /dev/null
+++ b/AI/Neural Networks/Learning/Learning.md
@@ -0,0 +1,5 @@
+*Learning is a process by which the free parameters of a neural network are adapted through a process of stimulation by the environment in which the network is embedded. The type of learning is determined by the manner in which the parameter changes take place*
+
+1. The neural network is **stimulated** by an environment
+2. The network undergoes **changes in its free parameters** as a result of this stimulation
+3. The network **responds in a new way** to the environment as a result of the change in its internal structure
\ No newline at end of file
diff --git a/AI/Neural Networks/Learning/README.md b/AI/Neural Networks/Learning/README.md
new file mode 120000
index 0000000..e616bf8
--- /dev/null
+++ b/AI/Neural Networks/Learning/README.md
@@ -0,0 +1 @@
+Learning.md
\ No newline at end of file
diff --git a/AI/Neural Networks/RNN/Autoencoder.md b/AI/Neural Networks/RNN/Autoencoder.md
new file mode 100644
index 0000000..e175e42
--- /dev/null
+++ b/AI/Neural Networks/RNN/Autoencoder.md
@@ -0,0 +1,10 @@
+- Sequence of strokes for sketching
+    - LSTM backbone
+
+![](../../../img/rnn+autoencoder.png)
+
+# Variational
+- Learn a mean and covariance to drive the decoder stage
+    - Generate different outputs by sampling the latent space
+
+![](../../../img/rnn+autoencoder-variational.png)
\ No newline at end of file
diff --git a/AI/Neural Networks/RNN/Deep Image Prior.md b/AI/Neural Networks/RNN/Deep Image Prior.md
new file mode 100644
index 0000000..4f04a48
--- /dev/null
+++ b/AI/Neural Networks/RNN/Deep Image Prior.md
@@ -0,0 +1,8 @@
+- Overfitted to a single image
+    - Learns the weights necessary to reconstruct it from white noise
+- Trained from scratch on that one image
+    - The architecture itself encodes a prior for natural images
+    - Can de-noise images
+
+![](../../../img/deep-image-prior-arch.png)
+![](../../../img/deep-image-prior-results.png)
\ No newline at end of file
diff --git a/AI/Neural Networks/RNN/MoCo.md b/AI/Neural Networks/RNN/MoCo.md
new file mode 100644
index 0000000..62ef2b7
--- /dev/null
+++ b/AI/Neural Networks/RNN/MoCo.md
@@ -0,0 +1,13 @@
+- Similar to SimCLR
+- Rich set of negatives
+    - Sampled from previous batches, held in a queue
+- Two functions: one for pos/neg, one for the anchor
+    - Pos/neg encoder weights are delayed anchor weights
+    - Updated with momentum
+- Two delay mechanisms
+    - Two encoder functions
+    - Negative encoder queue
+
+![](../../../img/moco.png)
+
+$$\theta_k\leftarrow m\theta_k+(1-m)\theta_q$$
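+
+A minimal sketch of this momentum update, treating the encoders' parameters as flat vectors (hypothetical values):
+
+```python
+import numpy as np
+
+theta_q = np.array([0.2, -0.5, 1.0])   # query/anchor encoder parameters
+theta_k = np.array([0.1, -0.4, 0.9])   # key encoder parameters (delayed copy)
+m = 0.999                               # momentum coefficient
+
+# theta_k <- m * theta_k + (1 - m) * theta_q
+theta_k = m * theta_k + (1 - m) * theta_q
+print(theta_k)   # drifts very slowly towards theta_q
+```
\ No newline at end of file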
diff --git a/AI/Neural Networks/RNN/Representation Learning.md b/AI/Neural Networks/RNN/Representation Learning.md
new file mode 100644
index 0000000..4cc42a4
--- /dev/null
+++ b/AI/Neural Networks/RNN/Representation Learning.md
@@ -0,0 +1,13 @@
+# Unsupervised
+
+- Auto-encoder FCN
+- Learns a bottleneck (latent) representation
+    - Information rich
+    - $f(\cdot)$ is the CNN encoding function
+![](../../../img/unsup-representation-learning.png)
+
+# Supervised
+- Triplet loss
+    - Providing positives and negatives requires supervision
+- Two losses
+![](../../../img/sup-representation-learning.png)
\ No newline at end of file
diff --git a/AI/Neural Networks/RNN/SimCLR.md b/AI/Neural Networks/RNN/SimCLR.md
new file mode 100644
index 0000000..b5021b4
--- /dev/null
+++ b/AI/Neural Networks/RNN/SimCLR.md
@@ -0,0 +1,10 @@
+1. Data augmentation
+    - Crop patches from images in the batch
+    - Add colour jitter
+2. Within the batch, sample positives and negatives
+    - Patches from the same image are positive
+    - All others are negative
+3. MLP head used to compute the loss, instead of the bottleneck embedding directly
+    - Head network as a function of the bottleneck (sketched below)
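+
+A minimal sketch of step 3; the bottleneck embeddings and the two-layer head `g` are hypothetical stand-ins, not the paper's trained networks:
+
+```python
+import numpy as np
+
+rng = np.random.default_rng(2)
+W1 = rng.standard_normal((64, 128))
+W2 = rng.standard_normal((32, 64))
+
+def g(h):
+    """Projection head: the loss is computed on z = g(h), not on h itself."""
+    return W2 @ np.maximum(W1 @ h, 0)   # Linear -> ReLU -> Linear
+
+def cosine(a, b):
+    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
+
+h_anchor = rng.standard_normal(128)     # bottleneck embedding of anchor patch
+h_pos = rng.standard_normal(128)        # embedding of a patch from the same image
+print(cosine(g(h_anchor), g(h_pos)))    # similarity fed to the contrastive loss
+```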
+
+![](../../../img/simclr.png)
\ No newline at end of file
diff --git a/img/comp-learning.png b/img/comp-learning.png
new file mode 100644
index 0000000..2506414
Binary files /dev/null and b/img/comp-learning.png differ
diff --git a/img/competitive-geometric.png b/img/competitive-geometric.png
new file mode 100644
index 0000000..e2b4129
Binary files /dev/null and b/img/competitive-geometric.png differ
diff --git a/img/confusion-matrix.png b/img/confusion-matrix.png
new file mode 100644
index 0000000..bf55e6e
Binary files /dev/null and b/img/confusion-matrix.png differ
diff --git a/img/decision-tree.png b/img/decision-tree.png
new file mode 100644
index 0000000..f045929
Binary files /dev/null and b/img/decision-tree.png differ
diff --git a/img/deep-image-prior-arch.png b/img/deep-image-prior-arch.png
new file mode 100644
index 0000000..2193c20
Binary files /dev/null and b/img/deep-image-prior-arch.png differ
diff --git a/img/deep-image-prior-results.png b/img/deep-image-prior-results.png
new file mode 100644
index 0000000..ec20d01
Binary files /dev/null and b/img/deep-image-prior-results.png differ
diff --git a/img/hebb-learning.png b/img/hebb-learning.png
new file mode 100644
index 0000000..fbfcef2
Binary files /dev/null and b/img/hebb-learning.png differ
diff --git a/img/moco.png b/img/moco.png
new file mode 100644
index 0000000..30b2abe
Binary files /dev/null and b/img/moco.png differ
diff --git a/img/receiver-operator-curve.png b/img/receiver-operator-curve.png
new file mode 100644
index 0000000..8a82b1b
Binary files /dev/null and b/img/receiver-operator-curve.png differ
diff --git a/img/reinforcement-learning.png b/img/reinforcement-learning.png
new file mode 100644
index 0000000..4ea3a3e
Binary files /dev/null and b/img/reinforcement-learning.png differ
diff --git a/img/rnn+autoencoder-variational.png b/img/rnn+autoencoder-variational.png
new file mode 100644
index 0000000..98e14f1
Binary files /dev/null and b/img/rnn+autoencoder-variational.png differ
diff --git a/img/rnn+autoencoder.png b/img/rnn+autoencoder.png
new file mode 100644
index 0000000..38db9d1
Binary files /dev/null and b/img/rnn+autoencoder.png differ
diff --git a/img/simclr.png b/img/simclr.png
new file mode 100644
index 0000000..8193dd3
Binary files /dev/null and b/img/simclr.png differ
diff --git a/img/sup-representation-learning.png b/img/sup-representation-learning.png
new file mode 100644
index 0000000..cb3165d
Binary files /dev/null and b/img/sup-representation-learning.png differ
diff --git a/img/svm-c.png b/img/svm-c.png
new file mode 100644
index 0000000..f204749
Binary files /dev/null and b/img/svm-c.png differ
diff --git a/img/svm-non-linear-project.png b/img/svm-non-linear-project.png
new file mode 100644
index 0000000..3d18f29
Binary files /dev/null and b/img/svm-non-linear-project.png differ
diff --git a/img/svm-non-linear-separated.png b/img/svm-non-linear-separated.png
new file mode 100644
index 0000000..8d6caad
Binary files /dev/null and b/img/svm-non-linear-separated.png differ
diff --git a/img/svm-non-linear.png b/img/svm-non-linear.png
new file mode 100644
index 0000000..386ec08
Binary files /dev/null and b/img/svm-non-linear.png differ
diff --git a/img/svm-optimal-plane.png b/img/svm-optimal-plane.png
new file mode 100644
index 0000000..8008e90
Binary files /dev/null and b/img/svm-optimal-plane.png differ
diff --git a/img/svm.png b/img/svm.png
new file mode 100644
index 0000000..619e495
Binary files /dev/null and b/img/svm.png differ
diff --git a/img/unsup-representation-learning.png b/img/unsup-representation-learning.png
new file mode 100644
index 0000000..b69bf5d
Binary files /dev/null and b/img/unsup-representation-learning.png differ