2023-12-22 16:39:03 +00:00

---
tags:
- ai
---

2023-06-07 09:02:27 +01:00

# Supervised

- Dataset with inputs manually annotated for desired output
    - Desired output = supervisory signal
    - Manually annotated = ground truth
        - Annotated correct categories

## Split data

- Training set
- Test set

***Don't test on training data***
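
A minimal sketch of the split, assuming a dataset held as a list of (input, label) pairs. The helper name `train_test_split`, the 20% test fraction and the fixed seed are illustrative choices, not part of these notes.

```python
import random


def train_test_split(samples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the samples and split it into disjoint train and test lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(samples)
    # A seeded generator keeps the split reproducible between runs
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    # First n_test shuffled samples become the test set, the rest the training set
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical dataset of (input, ground-truth label) pairs
dataset = [(i, "even" if i % 2 == 0 else "odd") for i in range(10)]
train_set, test_set = train_test_split(dataset)
```

The model would be fitted on `train_set` only, with `test_set` kept aside for the final evaluation.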

## Top-K Accuracy

- Whether the correct answer appears among the k highest-ranked results
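
As a rough illustration, top-k accuracy can be computed as the fraction of samples whose true label is among the k highest-scoring classes. The function name, the dict-of-scores representation and the example data below are assumptions made for this sketch.

```python
def top_k_accuracy(scores, true_labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    if len(scores) != len(true_labels):
        raise ValueError("scores and true_labels must have the same length")
    hits = 0
    for class_scores, truth in zip(scores, true_labels):
        # Class names ordered from highest to lowest score
        ranked = sorted(class_scores, key=class_scores.get, reverse=True)
        if truth in ranked[:k]:
            hits += 1
    return hits / len(true_labels)


# Hypothetical classifier outputs: one {class: score} dict per sample
predictions = [
    {"cat": 0.6, "dog": 0.3, "fox": 0.1},
    {"cat": 0.5, "dog": 0.2, "fox": 0.3},
    {"cat": 0.1, "dog": 0.2, "fox": 0.7},
]
truth = ["cat", "fox", "dog"]

top1 = top_k_accuracy(predictions, truth, k=1)  # only the first sample is a hit
top2 = top_k_accuracy(predictions, truth, k=2)  # all three samples are hits
```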

## Confusion Matrix

Each sample is described by a ***feature vector***

Dataset forms a matrix (one feature vector per sample)

![](../img/confusion-matrix.png)
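
A small sketch of how a confusion matrix could be tallied from true and predicted labels. The layout (rows are true classes, columns are predicted classes) is an assumption and may differ from the figure above; the labels are made up.

```python
from collections import Counter


def confusion_matrix(true_labels, predicted_labels, classes):
    """Count samples per (true, predicted) pair; rows = true class, columns = predicted class."""
    counts = Counter(zip(true_labels, predicted_labels))
    return [[counts[(t, p)] for p in classes] for t in classes]


classes = ["cat", "dog"]
y_true = ["cat", "cat", "cat", "dog", "dog"]
y_pred = ["cat", "cat", "dog", "dog", "cat"]

matrix = confusion_matrix(y_true, y_pred, classes)

# Correct predictions sit on the diagonal, so accuracy is the trace over the total
accuracy = sum(matrix[i][i] for i in range(len(classes))) / len(y_true)
```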

# Un-Supervised

- No example outputs given, learns how to categorise
- No teacher or critic

## Harder

- Must identify relevant distinguishing features
- Must decide on number of categories
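
One common unsupervised method, not named in these notes, is k-means clustering. It illustrates the last point well, since the number of categories `k` has to be chosen up front. The sketch below is a deliberately simple 1-D version with a deterministic initialisation; the function name and data are hypothetical, and it assumes `k` is no larger than the number of points.

```python
def k_means_1d(points, k, iterations=20):
    """Cluster 1-D points into k groups with a plain k-means loop."""
    ordered = sorted(points)
    # Deterministic initialisation: k evenly spaced values taken from the sorted data
    centroids = [ordered[i * (len(ordered) - 1) // max(k - 1, 1)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster (unchanged if empty)
        centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# No labels are given; only the number of categories is chosen by hand
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
centroids, clusters = k_means_1d(points, k=2)
```

Running it again with a different `k` gives a different grouping of the same data, which is the difficulty noted above.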

# Reinforcement Learning

- No teacher - critic instead
- Continued interaction with the environment
- Minimise a scalar performance index

![](../img/reinforcement-learning.png)

- Critic
    - Converts primary reinforcement to heuristic reinforcement
    - Both scalar inputs
- Delayed reinforcement
    - System observes temporal sequence of stimuli
    - Results in generation of heuristic reinforcement signal
- Minimise cost-to-go function
    - Expectation of cumulative cost of actions taken over sequence of steps
    - Instead of just immediate cost
    - Earlier actions may have been good
    - Identify and feed back to environment
- Closely related to dynamic programming
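
To make the cost-to-go idea concrete, here is a sketch that accumulates cost backwards over one observed sequence of steps. The discount factor is an assumption not mentioned in these notes, and a true cost-to-go is an expectation over many such sequences, whereas this evaluates a single one.

```python
def cost_to_go(costs, discount=0.9):
    """Return J[t] = costs[t] + discount * J[t + 1] for every step t of one trajectory."""
    totals = [0.0] * len(costs)
    running = 0.0
    # Work backwards from the final step so each J[t] can reuse J[t + 1]
    for t in range(len(costs) - 1, -1, -1):
        running = costs[t] + discount * running
        totals[t] = running
    return totals


# Hypothetical immediate costs: cheap early steps followed by an expensive final one
immediate_costs = [1.0, 0.0, 0.0, 10.0]
j = cost_to_go(immediate_costs, discount=0.5)
```

Steps 1 and 2 have zero immediate cost but a non-zero cost-to-go, because they lead towards the expensive final step. The backward recursion, where each value reuses the one after it, is the link to dynamic programming.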

## Difficulties

- No teacher to provide desired response
- Must solve temporal credit assignment problem
    - Need to know which actions were the good ones

# Fitting

- Over-fitting
    - Classifier too specific to training set
    - Can't adequately generalise
- Under-fitting
    - Too general, has not inferred enough detail
    - Learns non-discriminative or non-desired pattern
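
A toy illustration of both failure modes, comparing accuracy on training data with accuracy on held-out test data. The task, the data and the three classifiers are all invented for this sketch: a memoriser stands in for over-fitting and a constant predictor for under-fitting.

```python
from collections import Counter


def accuracy(predict, samples):
    """Fraction of (input, label) pairs that the classifier gets right."""
    return sum(predict(x) == y for x, y in samples) / len(samples)


# Hypothetical task: the label is "high" when x >= 5, otherwise "low"
train = [(0, "low"), (2, "low"), (4, "low"), (6, "high"), (8, "high")]
test = [(1, "low"), (3, "low"), (5, "high"), (7, "high"), (9, "high")]

majority = Counter(y for _, y in train).most_common(1)[0][0]
lookup = dict(train)


def memoriser(x):
    # Over-fit: recalls training pairs exactly, guesses the majority label otherwise
    return lookup.get(x, majority)


def constant(x):
    # Under-fit: ignores the input and always answers with the majority label
    return majority


# Threshold halfway between the largest "low" and the smallest "high" training input
threshold = (
    max(x for x, y in train if y == "low") + min(x for x, y in train if y == "high")
) / 2


def thresholded(x):
    # Reasonable fit: captures the pattern that separates the two classes
    return "high" if x >= threshold else "low"


results = {
    name: (accuracy(model, train), accuracy(model, test))
    for name, model in [
        ("memoriser", memoriser),
        ("constant", constant),
        ("thresholded", thresholded),
    ]
}
```

The memoriser is perfect on the training set but poor on the test set, the constant predictor is poor on both, and the thresholded classifier does well on both.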

# ROC

Receiver Operating Characteristic curve

Plots true positive rate against false positive rate as the decision threshold varies

![](../img/receiver-operator-curve.png)
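
A sketch of how the points on such a curve could be computed for a binary classifier, by sweeping a threshold over its scores. The function names and data are hypothetical, labels are assumed to be 1 (positive) or 0 (negative), and both classes are assumed to be present.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs, one per distinct threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a list of ROC points ordered by increasing threshold sweep."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical classifier scores with their ground-truth labels
scores = [0.9, 0.8, 0.7, 0.4, 0.2]
labels = [1, 1, 0, 1, 0]

curve = roc_points(scores, labels)
auc = area_under_curve(curve)
```

The area under the curve summarises it as one number: 0.5 corresponds to random ranking and 1.0 to a perfect separation of the two classes.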