adding exp1/2 data, writing

2021-04-30 20:51:04 +01:00 · 06eb5ba2f9
commit 06eb5ba2f9
parent ee9aa2e644
50 changed files with 661 additions and 139 deletions
--- a/README.md
+++ b/README.md
@@ -6,3 +6,25 @@ Evaluating a neural network using the MatLab `cancer_dataset`. Development conta
 2. Multiple classifier performance using majority vote
 3. Repeat 2 with two different optimisers (`trainlm`, `trainrp`)
 4. ***Extension***: Distinguish between two equi-probable classes of overlapping 2D Gaussians
 ![Image](graphs/exp1-test2-1-error-rate-curves.png)
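The majority vote of item 2, together with the *agreement* and *disagreement* statistics defined in the report, could be computed along the following lines. This is a minimal sketch and not the code from `nncw.ipynb`: the function names `majority_vote` and `ensemble_stats` are hypothetical, predictions are assumed to be plain lists of class labels, and ties are assumed to resolve to whichever label is seen first.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the ensemble label for each sample.

    predictions[i][j] is the label that model i assigns to sample j.
    Ties resolve to the label encountered first (an assumption of this sketch).
    """
    return [Counter(sample_votes).most_common(1)[0][0]
            for sample_votes in zip(*predictions)]


def ensemble_stats(predictions, labels):
    """Return (ensemble error rate, agreement a, disagreement d = 1 - a).

    Agreement is the proportion of individual models that are correct,
    counted only over the samples that the ensemble classifies correctly.
    """
    ensemble = majority_vote(predictions)
    correct_idx = [j for j, (p, y) in enumerate(zip(ensemble, labels)) if p == y]
    error_rate = 1 - len(correct_idx) / len(labels)
    if not correct_idx:
        # Agreement is undefined when the ensemble gets nothing right.
        return error_rate, None, None
    hits = sum(model[j] == labels[j]
               for model in predictions
               for j in correct_idx)
    agreement = hits / (len(predictions) * len(correct_idx))
    return error_rate, agreement, 1 - agreement
```

With a single model the ensemble prediction is that model's prediction, so the agreement is 1 whenever it is defined, which matches the report's remark about one horizontal model. Using an odd number of models avoids ties for a two-class problem.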
 ## Timing
 ### exp 1
 CPU: 2min 36s ± 1.66 s per loop (mean ± std. dev. of 2 runs, 2 loops each)
 GPU: 3min 5s ± 2.95 s per loop (mean ± std. dev. of 2 runs, 2 loops each)
 ### exp 2
 CPU: 26 s ± 62.9 ms per loop (mean ± std. dev. of 2 runs, 2 loops each)
 GPU: 57.6 s ± 46.7 ms per loop (mean ± std. dev. of 2 runs, 2 loops each)
 ### exp 3
 CPU: 1min 19s ± 1.6 s per loop (mean ± std. dev. of 2 runs, 2 loops each)
 GPU: 3min 25s ± 280 ms per loop (mean ± std. dev. of 2 runs, 2 loops each)
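The timing lines above appear to follow the output format of IPython's `%timeit`. Outside a notebook, comparable figures could be gathered with the standard library, as in the sketch below. `run_experiment` is a hypothetical stand-in for one of the experiment functions, and the use of the population standard deviation is an assumption of this sketch.

```python
import statistics
import timeit


def time_per_loop(func, runs=2, loops=2):
    """Time func and return (mean, std. dev.) of the seconds per loop.

    Each run calls func `loops` times; the run total is divided by `loops`
    so that every run contributes one per-loop figure.
    """
    totals = timeit.repeat(func, repeat=runs, number=loops)
    per_loop = [total / loops for total in totals]
    return statistics.mean(per_loop), statistics.pstdev(per_loop)


def run_experiment():
    # Hypothetical placeholder workload; swap in the real experiment call.
    sum(i * i for i in range(10_000))


mean_s, std_s = time_per_loop(run_experiment)
print(f"{mean_s:.6f} s ± {std_s:.6f} s per loop "
      f"(mean ± std. dev. of 2 runs, 2 loops each)")
```

With only two runs the standard deviation is a rough indication at best, so small CPU/GPU differences should be read with care.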
--- a/graphs/exp1-test1-acc-surf.png
+++ b/graphs/exp1-test1-acc-surf.png
--- a/graphs/exp1-test1-error-rate-curves.png
+++ b/graphs/exp1-test1-error-rate-curves.png
--- a/graphs/exp1-test1-test-train-error-rate-std.png
+++ b/graphs/exp1-test1-test-train-error-rate-std.png
--- a/graphs/exp1-test1-test-train-error-rate.png
+++ b/graphs/exp1-test1-test-train-error-rate.png
--- a/graphs/exp1-test2-1-error-rate-curves.png
+++ b/graphs/exp1-test2-1-error-rate-curves.png
--- a/graphs/exp1-test2-1-test-train-error-rate-std.png
+++ b/graphs/exp1-test2-1-test-train-error-rate-std.png
--- a/graphs/exp1-test2-1-test-train-error-rate.png
+++ b/graphs/exp1-test2-1-test-train-error-rate.png
--- a/graphs/exp1-test2-2-error-rate-curves.png
+++ b/graphs/exp1-test2-2-error-rate-curves.png
--- a/graphs/exp1-test2-2-test-train-error-rate-std.png
+++ b/graphs/exp1-test2-2-test-train-error-rate-std.png
--- a/graphs/exp1-test2-2-test-train-error-rate.png
+++ b/graphs/exp1-test2-2-test-train-error-rate.png
--- a/graphs/exp1-test2-3-error-rate-curves.png
+++ b/graphs/exp1-test2-3-error-rate-curves.png
--- a/graphs/exp1-test2-3-test-train-error-rate-std.png
+++ b/graphs/exp1-test2-3-test-train-error-rate-std.png
--- a/graphs/exp1-test2-3-test-train-error-rate.png
+++ b/graphs/exp1-test2-3-test-train-error-rate.png
--- a/graphs/exp1-test5-acc-surf.png
+++ b/graphs/exp1-test5-acc-surf.png
--- a/graphs/exp1-test5-test-train-error-rate-std.png
+++ b/graphs/exp1-test5-test-train-error-rate-std.png
--- a/graphs/exp1-test5-test-train-error-rate.png
+++ b/graphs/exp1-test5-test-train-error-rate.png
--- a/graphs/exp1-test6-acc-surf.png
+++ b/graphs/exp1-test6-acc-surf.png
--- a/graphs/exp1-test6-test-train-error-rate-std.png
+++ b/graphs/exp1-test6-test-train-error-rate-std.png
--- a/graphs/exp1-test6-test-train-error-rate.png
+++ b/graphs/exp1-test6-test-train-error-rate.png
--- a/graphs/exp2-test10-error-rate-curves.png
+++ b/graphs/exp2-test10-error-rate-curves.png
--- a/graphs/exp2-test12-error-rate-curves.png
+++ b/graphs/exp2-test12-error-rate-curves.png
--- a/graphs/exp2-test13-error-rate-curves.png
+++ b/graphs/exp2-test13-error-rate-curves.png
--- a/graphs/exp2-test14-error-rate-curves.png
+++ b/graphs/exp2-test14-error-rate-curves.png
--- a/graphs/exp2-test15-error-rate-curves.png
+++ b/graphs/exp2-test15-error-rate-curves.png
--- a/graphs/exp2-test16-error-rate-curves.png
+++ b/graphs/exp2-test16-error-rate-curves.png
--- a/graphs/exp2-test17-error-rate-curves.png
+++ b/graphs/exp2-test17-error-rate-curves.png
--- a/graphs/exp2-test8-error-rate-curves.png
+++ b/graphs/exp2-test8-error-rate-curves.png
--- a/graphs/exp3-test1-error-rate-curves.png
+++ b/graphs/exp3-test1-error-rate-curves.png
--- a/graphs/exp3-test2-error-rate-curves.png
+++ b/graphs/exp3-test2-error-rate-curves.png
--- a/graphs/exp3-test3-error-rate-curves.png
+++ b/graphs/exp3-test3-error-rate-curves.png
--- a/graphs/exp3-test4-error-rate-curves.png
+++ b/graphs/exp3-test4-error-rate-curves.png
--- a/graphs/exp3-test5-error-rate-curves.png
+++ b/graphs/exp3-test5-error-rate-curves.png
--- a/graphs/exp3-test6-error-rate-curves.png
+++ b/graphs/exp3-test6-error-rate-curves.png
--- a/graphs/exp3-test7-error-rate-curves.png
+++ b/graphs/exp3-test7-error-rate-curves.png
--- a/graphs/exp3-test8-error-rate-curves.png
+++ b/graphs/exp3-test8-error-rate-curves.png
--- a/nncw.ipynb
+++ b/nncw.ipynb
--- a/report/report.lyx
+++ b/report/report.lyx
@@ -404,7 +404,7 @@ noprefix "false"
 in conjunction.
 The effect of varying the number of nodes and epochs throughout the ensemble
 was considered in order to determine whether combining multiple models
- could produce a better accuracy than those individually.
+ could produce a better accuracy than any individual model.
 Section 
 \begin_inset CommandInset ref
 LatexCommand ref
@@ -432,7 +432,7 @@ noprefix "false"
 \end_layout
 \begin_layout Section
-Hidden Nodes & Epochs (Exp 1)
+Hidden Nodes & Epochs
 \begin_inset CommandInset label
 LatexCommand label
 name "sec:exp1"
@@ -443,21 +443,257 @@ name "sec:exp1"
 \end_layout
 \begin_layout Standard
-This section investigates the effect of varying the number of hidden nodes
+This section investigates the effect of varying the number of nodes in the
- in a single hidden layer of a multi-layer perceptron.
+ single hidden layer of a shallow multi-layer perceptron.
- This is compared to the effect of varying 
+ This is compared to the effect of training the model with different numbers
 of epochs.
 Throughout the experiment, stochastic gradient descent with momentum is
 used as the optimiser; variations in both momentum and learning rate are
 presented.
 \end_layout
 \begin_layout Subsection
 Results
 \end_layout
 \begin_layout Standard
 \begin_inset Float figure
 wide false
 sideways false
 status open
 \begin_layout Plain Layout
 \noindent
 \align center
 \begin_inset Graphics
 	filename ../graphs/exp1-test1-error-rate-curves.png
 	lyxscale 50
 	width 50col%
 \end_inset
 \end_layout
 \begin_layout Plain Layout
 \begin_inset Caption Standard
 \begin_layout Plain Layout
 Varied hidden node performance results over varied training lengths for
 \begin_inset Formula $\eta=0.01$
 \end_inset
 , 
 \begin_inset Formula $p=0$
 \end_inset
 \begin_inset CommandInset label
 LatexCommand label
 name "fig:exp1-test1"
 \end_inset
 \end_layout
 \end_inset
 \end_layout
 \begin_layout Plain Layout
 \end_layout
 \end_inset
 \end_layout
 \begin_layout Standard
 Figure 
 \begin_inset CommandInset ref
 LatexCommand ref
 reference "fig:exp1-test1"
 plural "false"
 caps "false"
 noprefix "false"
 \end_inset
 visualises the performance of networks with up to 256 hidden nodes over
 training periods of up to 200 epochs.
 In general, the error rate can be seen to decrease when the models are
 trained for longer.
 Increasing the number of nodes decreases the error rate and steepens the
 gradient with which it falls, up to a limit.
 The curves for 64, 128 and 256 hidden nodes lie close together as the gains
 in performance diminish.
 Between 0 and 25 epochs, the error rate for any number of nodes descends
 little below 0.35.
 The number of epochs needed to overcome this plateau is different for each
 number of nodes.
 \end_layout
 \begin_layout Standard
 The standard deviations of the results discussed above for figure 
 \begin_inset CommandInset ref
 LatexCommand ref
 reference "fig:exp1-test1"
 plural "false"
 caps "false"
 noprefix "false"
 \end_inset
 can be seen in figure 
 \begin_inset CommandInset ref
 LatexCommand ref
 reference "fig:exp1-test1-std"
 plural "false"
 caps "false"
 noprefix "false"
 \end_inset
 .
 As the network starts training, the standard deviation decreases to a minimum
 between 
 \begin_inset Formula $10-20$
 \end_inset
 epochs before increasing to a peak at 64 epochs.
 As the number of hidden nodes increases, the standard deviation decreases.
 The initial drop also becomes sharper and the peak at 64 epochs rises higher.
 \end_layout
 \begin_layout Standard
 \begin_inset Float figure
 wide false
 sideways false
 status open
 \begin_layout Plain Layout
 \noindent
 \align center
 \begin_inset Graphics
 	filename ../graphs/exp1-test1-test-train-error-rate-std.png
 	lyxscale 50
 	width 60col%
 \end_inset
 \end_layout
 \begin_layout Plain Layout
 \begin_inset Caption Standard
 \begin_layout Plain Layout
 Standard deviation of results from figure 
 \begin_inset CommandInset ref
 LatexCommand ref
 reference "fig:exp1-test1"
 plural "false"
 caps "false"
 noprefix "false"
 \end_inset
 with 
 \begin_inset Formula $\eta=0.01$
 \end_inset
 , 
 \begin_inset Formula $p=0$
 \end_inset
 \begin_inset CommandInset label
 LatexCommand label
 name "fig:exp1-test1-std"
 \end_inset
 \end_layout
 \end_inset
 \end_layout
 \begin_layout Plain Layout
 \end_layout
 \end_inset
 \end_layout
 \begin_layout Standard
 \begin_inset Float figure
 wide false
 sideways false
 status open
 \begin_layout Plain Layout
 \noindent
 \align center
 \begin_inset Graphics
 	filename ../graphs/exp1-test2-2-error-rate-curves.png
 	lyxscale 50
 	width 50col%
 \end_inset
 \end_layout
 \begin_layout Plain Layout
 \begin_inset Caption Standard
 \begin_layout Plain Layout
 Varied hidden node performance results over varied training lengths for
 \begin_inset Formula $\eta=0.1$
 \end_inset
 , 
 \begin_inset Formula $p=0$
 \end_inset
 \begin_inset CommandInset label
 LatexCommand label
 name "fig:exp1-test2-2"
 \end_inset
 \end_layout
 \end_inset
 \end_layout
 \end_inset
 \end_layout
 \begin_layout Subsection
 Discussion
 \end_layout
 \begin_layout Section
-Ensemble Classification (Exp 2)
+Ensemble Classification
 \begin_inset CommandInset label
 LatexCommand label
 name "sec:exp2"
@@ -467,16 +703,239 @@ name "sec:exp2"
 \end_layout
 \begin_layout Standard
 A horizontal ensemble of 
 \begin_inset Formula $m$
 \end_inset
 models was constructed with majority vote in order to investigate whether
 this could improve performance over that of any single model.
 To introduce variation between the models of the ensemble, a range can be
 defined for both the number of hidden nodes and the number of epochs.
 The parameters of the models are then evenly spaced throughout these
 ranges
 \begin_inset Foot
 status open
 \begin_layout Plain Layout
 For 
 \begin_inset Formula $m=1$
 \end_inset
 , the average of the range is taken
 \end_layout
 \end_inset
 .
 \end_layout
 \begin_layout Standard
 The statistic 
 \emph on
 agreement
 \emph default
 , 
 \begin_inset Formula $a$
 \end_inset
 , is defined as the proportion of models under the meta-classifier that
 correctly predict a sample's class when the ensemble classifies it correctly.
 It could also be considered the confidence of the meta-classifier; for
 a single horizontal model, 
 \begin_inset Formula $a_{m=1}=1$
 \end_inset
 .
 As error rates are presented, this is inverted by 
 \begin_inset Formula $1-a$
 \end_inset
 to 
 \emph on
 disagreement
 \emph default
 , 
 \begin_inset Formula $d$
 \end_inset
 , the proportion of incorrect models when the ensemble classifies correctly.
 \end_layout
 \begin_layout Subsection
 Results
 \end_layout
 \begin_layout Standard
 For comparison, the average individual accuracies for both test and training
 data are presented.
 \end_layout
 \begin_layout Standard
 \begin_inset Float figure
 wide false
 sideways false
 status open
 \begin_layout Plain Layout
 \noindent
 \align center
 \begin_inset Graphics
 	filename ../graphs/exp2-test8-error-rate-curves.png
 	lyxscale 50
 	width 50col%
 \end_inset
 \end_layout
 \begin_layout Plain Layout
 \begin_inset Caption Standard
 \begin_layout Plain Layout
 Ensemble classifier performance results for 
 \begin_inset Formula $\eta=0.03$
 \end_inset
 , 
 \begin_inset Formula $p=0.01$
 \end_inset
 , nodes = 1 - 400, epochs = 5 - 100
 \begin_inset CommandInset label
 LatexCommand label
 name "fig:exp2-test8"
 \end_inset
 \end_layout
 \end_inset
 \end_layout
 \end_inset
 \end_layout
 \begin_layout Standard
 An experiment with a fixed epoch value throughout the ensemble is presented
 in figure 
 \begin_inset CommandInset ref
 LatexCommand ref
 reference "fig:exp2-test10"
 plural "false"
 caps "false"
 noprefix "false"
 \end_inset
 .
 Between 1 and 400 hidden nodes were selected for the classifiers, with a
 learning rate, 
 \begin_inset Formula $\eta=0.15$
 \end_inset
 and momentum, 
 \begin_inset Formula $p=0.01$
 \end_inset
 .
 The ensemble accuracy is fairly constant across the number of horizontal
 models, with 3 models being the least accurate and having a higher standard
 deviation.
 The 3-model ensemble also shows a marked spike in disagreement and individual
 error rates, which gradually decrease as the number of models increases.
 \end_layout
 \begin_layout Standard
 \begin_inset Float figure
 wide false
 sideways false
 status open
 \begin_layout Plain Layout
 \noindent
 \align center
 \begin_inset Graphics
 	filename ../graphs/exp2-test10-error-rate-curves.png
 	lyxscale 50
 	width 50col%
 \end_inset
 \end_layout
 \begin_layout Plain Layout
 \begin_inset Caption Standard
 \begin_layout Plain Layout
 Ensemble classifier performance results for 
 \begin_inset Formula $\eta=0.15$
 \end_inset
 , 
 \begin_inset Formula $p=0.01$
 \end_inset
 , nodes = 
 \begin_inset Formula $1-400$
 \end_inset
 , epochs = 20
 \begin_inset CommandInset label
 LatexCommand label
 name "fig:exp2-test10"
 \end_inset
 \end_layout
 \end_inset
 \end_layout
 \begin_layout Plain Layout
 \end_layout
 \end_inset
 \end_layout
 \begin_layout Subsection
 Discussion
 \end_layout
 \begin_layout Standard
 From the data of figure 
 \begin_inset CommandInset ref
 LatexCommand ref
 reference "fig:exp2-test10"
 plural "false"
 caps "false"
 noprefix "false"
 \end_inset
 , an ensemble of 3 horizontal models was shown to be the worst-performing
 configuration, with lower ensemble accuracy and higher disagreement.
 This is likely due to the larger proportion of the vote that a single model
 constitutes.
 \end_layout
 \begin_layout Section
-Optimiser Comparisons (Exp 3)
+Optimiser Comparisons
 \begin_inset CommandInset label
 LatexCommand label
 name "sec:exp3"
@@ -486,6 +945,20 @@ name "sec:exp3"
 \end_layout
 \begin_layout Standard
 Throughout the previous experiments the stochastic gradient descent optimiser
 was used to update the network's weights, but there are many other
 optimisation algorithms.
 This section will present investigations into two other optimisation algorithms
 and discuss the differences between them using the horizontal ensemble
 classification of the previous section.
 \end_layout
 \begin_layout Standard
 Prior to these investigations, however, stochastic gradient descent and
 the two other subject algorithms will be described.
 \end_layout
 \begin_layout Subsection
 Optimisers
 \end_layout
@@ -510,10 +983,6 @@ Results
 Discussion
 \end_layout
 \begin_layout Section
 Overlapping 2D Gaussians (Exp 4)
 \end_layout
 \begin_layout Section
 Conclusions
 \end_layout
--- a/results/exp1-test2-1.p
+++ b/results/exp1-test2-1.p
--- a/results/exp1-test2-2.p
+++ b/results/exp1-test2-2.p
--- a/results/exp1-test2-3.p
+++ b/results/exp1-test2-3.p
--- a/results/exp1-test2-4.p
+++ b/results/exp1-test2-4.p
--- a/results/exp1-test2-5.p
+++ b/results/exp1-test2-5.p
--- a/results/exp1-test2-6.p
+++ b/results/exp1-test2-6.p
--- a/results/exp2-test12.p
+++ b/results/exp2-test12.p
--- a/results/exp2-test13.p
+++ b/results/exp2-test13.p
--- a/results/exp2-test14.p
+++ b/results/exp2-test14.p
--- a/results/exp2-test15.p
+++ b/results/exp2-test15.p
--- a/results/exp2-test16.p
+++ b/results/exp2-test16.p
--- a/results/exp2-test17.p
+++ b/results/exp2-test17.p