#LyX 2.3 created this file. For more info see http://www.lyx.org/
\lyxformat 544
\begin_document
\begin_header
\save_transient_properties true
\origin unavailable
\textclass article
\begin_preamble
\usepackage{color}
\definecolor{commentgreen}{RGB}{0,94,11}
\end_preamble
\use_default_options true
\begin_modules
customHeadersFooters
\end_modules
\maintain_unincluded_children false
\language english
\language_package default
\inputencoding auto
\fontencoding global
\font_roman "default" "default"
\font_sans "default" "default"
\font_typewriter "default" "default"
\font_math "auto" "auto"
\font_default_family default
\use_non_tex_fonts false
\font_sc false
\font_osf false
\font_sf_scale 100 100
\font_tt_scale 100 100
\use_microtype false
\use_dash_ligatures true
\graphics default
\default_output_format default
\output_sync 0
\bibtex_command default
\index_command default
\paperfontsize 10
\spacing single
\use_hyperref true
\pdf_title "IoT Aggregation Algorithm Coursework"
\pdf_author "Andy Pack"
\pdf_subject "IoT"
\pdf_bookmarks true
\pdf_bookmarksnumbered false
\pdf_bookmarksopen false
\pdf_bookmarksopenlevel 1
\pdf_breaklinks true
\pdf_pdfborder true
\pdf_colorlinks false
\pdf_backref false
\pdf_pdfusetitle true
\papersize default
\use_geometry true
\use_package amsmath 1
\use_package amssymb 1
\use_package cancel 1
\use_package esint 1
\use_package mathdots 1
\use_package mathtools 1
\use_package mhchem 1
\use_package stackrel 1
\use_package stmaryrd 1
\use_package undertilde 1
\cite_engine basic
\cite_engine_type default
\biblio_style plain
\use_bibtopic false
\use_indices false
\paperorientation portrait
\suppress_date false
\justification true
\use_refstyle 1
\use_minted 0
\index Index
\shortcut idx
\color #008000
\end_index
\leftmargin 1cm
\topmargin 1.5cm
\rightmargin 1cm
\bottommargin 1.5cm
\secnumdepth 3
\tocdepth 3
\paragraph_separation indent
\paragraph_indentation default
\is_math_indent 0
\math_numbering_side default
\quotes_style english
\dynamic_quotes 0
\papercolumns 1
\papersides 1
\paperpagestyle fancy
\tracking_changes false
\output_changes false
\html_math_output 0
\html_css_as_file 0
\html_be_strict false
\end_header
\begin_body
\begin_layout Left Header
IoT Aggregation Algorithm Coursework
\end_layout
\begin_layout Left Footer
November 2020
\end_layout
\begin_layout Right Footer
Andy Pack / 6420013
\end_layout
\begin_layout Section
Description
\end_layout
\begin_layout Standard
Symbolic Aggregate approXimation (SAX) was implemented as an in-network
data processing technique, compressing the representation of the data while
still allowing further processing on the resulting symbolic string.
Figure
\begin_inset CommandInset ref
LatexCommand ref
reference "fig:Demonstration-of-SAX"
plural "false"
caps "false"
noprefix "false"
\end_inset
shows two rounds of SAX output following data collection, using a window size
of 2 and an alphabet of length 4, i.e. the characters
\begin_inset listings
lstparams "basicstyle={\ttfamily}"
inline true
status open
\begin_layout Plain Layout
a
\end_layout
\end_inset
through
\begin_inset listings
lstparams "basicstyle={\ttfamily}"
inline true
status open
\begin_layout Plain Layout
d
\end_layout
\end_inset
inclusive.
12 C
\begin_inset listings
lstparams "basicstyle={\ttfamily}"
inline true
status open
\begin_layout Plain Layout
floats
\end_layout
\end_inset
total 48 bytes of data (4 bytes each); this can be reduced by a factor of 4 using
\begin_inset listings
lstparams "basicstyle={\ttfamily}"
inline true
status open
\begin_layout Plain Layout
char
\end_layout
\end_inset
representation instead.
A window size of 2 then halves the number of output samples, lowering the
required memory to just 6 bytes.
\end_layout
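\begin_layout Standard
The arithmetic behind these figures can be written out as follows, assuming
a 4-byte float (implied by the 48-byte total) and the 1-byte char that C
guarantees:
\begin_inset Formula 
\begin{align*}
12\times4\,\mathrm{B} & =48\,\mathrm{B} & & \text{(raw floats)}\\
12\times1\,\mathrm{B} & =12\,\mathrm{B} & & \text{(one char per sample)}\\
\frac{12}{2}\times1\,\mathrm{B} & =6\,\mathrm{B} & & \text{(window size of 2)}
\end{align*}

\end_inset

\end_layout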
\begin_layout Section
Specification
\end_layout
\begin_layout Standard
SAX is implemented in two separate steps: transforming the time series into
its Piecewise Aggregate Approximation (PAA) representation, and then
representing this numeric series with a symbolic alphabet.
\end_layout
\begin_layout Subsection
PAA
\end_layout
\begin_layout Standard
The standard deviation and mean of the data series were first calculated,
as these are required for Z-normalisation.
This normalisation process takes a series of data and transforms it into
one with a mean of 0 and a standard deviation of 1.
This changes the context of the value from being measured in lux to being
a measure of a sample's distance from the mean, 0, in standard deviations.
This allows comparison of different time-series.
\end_layout
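\begin_layout Standard
As a sketch of the calculation, for a series of
\begin_inset Formula $n$
\end_inset

 samples
\begin_inset Formula $x_{1},\dots,x_{n}$
\end_inset

, and assuming the population form of the standard deviation (which form
 the implementation uses is not stated here), Z-normalisation can be written
 as:
\begin_inset Formula 
\begin{align*}
\mu & =\frac{1}{n}\sum_{i=1}^{n}x_{i}\\
\sigma & =\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_{i}-\mu\right)^{2}}\\
z_{i} & =\frac{x_{i}-\mu}{\sigma}
\end{align*}

\end_inset

The last line is undefined for a constant series, where
\begin_inset Formula $\sigma=0$
\end_inset

, so an implementation would need to handle that case separately.
\end_layout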
\begin_layout Standard
Following Z-normalisation, the size of the series is reduced by applying
a windowing function.
This takes successive, equally sized groups of samples and reduces each group
to the mean of its values.
\end_layout
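\begin_layout Standard
The windowing step can be sketched as follows, where the symbols
\begin_inset Formula $w$
\end_inset

 (window size) and
\begin_inset Formula $\bar{z}_{j}$
\end_inset

 (the
\begin_inset Formula $j$
\end_inset

-th output value) are introduced here for illustration, and
\begin_inset Formula $w$
\end_inset

 is assumed to divide the series length
\begin_inset Formula $n$
\end_inset

 exactly:
\begin_inset Formula 
\[
\bar{z}_{j}=\frac{1}{w}\sum_{i=(j-1)w+1}^{jw}z_{i},\qquad j=1,\dots,\frac{n}{w}
\]

\end_inset

For example, 12 samples with a window size of 2 give 6 output values.
\end_layout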
\begin_layout Standard
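As a result of these two actions, the original time series has been reduced
to a given length of samples with a mean of 0, expressed in standard
deviations.
Averaging within each window can reduce the spread, so the standard deviation
of the reduced series is at most 1 rather than exactly 1.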
\end_layout
\begin_layout Subsection
SAX
\end_layout
\begin_layout Standard
With the result of the above, the remaining step is to replace each sample
value with a symbol to represent it.
The number of symbols to be used is given, and each represents an equally
probable range of values under a Gaussian distribution with a mean of 0 and
a standard deviation of 1.
This is achieved by placing breakpoints, measured in standard deviations,
such that the area under the Gaussian curve between adjacent breakpoints
is the same.
\end_layout
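\begin_layout Standard
As a worked example, for an alphabet of
\begin_inset Formula $a$
\end_inset

 symbols the breakpoints can be written in terms of the inverse standard
 normal cumulative distribution function
\begin_inset Formula $\Phi^{-1}$
\end_inset

, with the values for
\begin_inset Formula $a=4$
\end_inset

 rounded to two decimal places:
\begin_inset Formula 
\[
\beta_{k}=\Phi^{-1}\left(\frac{k}{a}\right),\quad k=1,\dots,a-1\qquad\Rightarrow\qquad\beta_{1}\approx-0.67,\;\beta_{2}=0,\;\beta_{3}\approx0.67
\]

\end_inset

Each PAA value
\begin_inset Formula $\bar{z}_{j}$
\end_inset

 is then mapped to a symbol
\begin_inset Formula $s_{j}$
\end_inset

.
 The notation and the treatment of values that fall exactly on a breakpoint
 are assumptions made for this illustration:
\begin_inset Formula 
\[
s_{j}=\begin{cases}
\mathtt{a} & \bar{z}_{j}<\beta_{1}\\
\mathtt{b} & \beta_{1}\leq\bar{z}_{j}<\beta_{2}\\
\mathtt{c} & \beta_{2}\leq\bar{z}_{j}<\beta_{3}\\
\mathtt{d} & \bar{z}_{j}\geq\beta_{3}
\end{cases}
\]

\end_inset

\end_layout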
\begin_layout Section
Implementation
\end_layout
\begin_layout Standard
The SAX functionality was added as an alternative buffer rotating mechanism
to the original 12-to-1/4-to-1/12-to-12 aggregation system.
The length of the output buffer is calculated first so that it can be allocated.
From here the input buffer is Z-normalised using the
\begin_inset listings
lstparams "basicstyle={\ttfamily}"
inline true
status open
\begin_layout Plain Layout
normaliseBuffer(buffer)
\end_layout
\end_inset
function from the
\begin_inset listings
lstparams "basicstyle={\ttfamily}"
inline true
status open
\begin_layout Plain Layout
buffer.h
\end_layout
\end_inset
header.
This function iterates over each value in the buffer, subtracts the buffer's
mean and then divides by the standard deviation.
Following this, the buffer is aggregated by reusing the 4-to-1 aggregation
function
\begin_inset listings
lstparams "basicstyle={\ttfamily}"
inline true
status open
\begin_layout Plain Layout
aggregateBuffer(bufferIn, bufferOut, groupSize)
\end_layout
\end_inset
as the group size is variable.
The output from this function represents the PAA form of the initial data
series.
\end_layout
\begin_layout Standard
This final buffer is handled using
\begin_inset listings
lstparams "basicstyle={\ttfamily}"
inline true
status open
\begin_layout Plain Layout
handleFinalBuffer(buffer)
\end_layout
\end_inset
where a pre-processor directive checks whether SAX is being used.
If so, the PAA buffer is
\emph on
stringified
\emph default
using
\begin_inset listings
lstparams "basicstyle={\ttfamily}"
inline true
status open
\begin_layout Plain Layout
stringifyBuffer(buffer)
\end_layout
\end_inset
which performs the SAX symbolic representation.
\end_layout
\begin_layout Standard
\begin_inset Float figure
wide false
sideways false
status open
\begin_layout Plain Layout
\align center
\begin_inset Graphics
filename SaxBy2,4Break.png
width 30col%
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset Caption Standard
\begin_layout Plain Layout
Demonstration of SAX aggregation with window size of 2 and alphabet of length
4
\begin_inset CommandInset label
LatexCommand label
name "fig:Demonstration-of-SAX"
\end_inset
\end_layout
\end_inset
\end_layout
\end_inset
\end_layout
\end_body
\end_document