added introduction and some hololens/android

aj 2020-04-11 18:55:50 +01:00
parent bc1623d8cd
commit 5ea0202762
2 changed files with 269 additions and 83 deletions


@ -267,7 +267,16 @@ Acknowledgements
\begin_layout Standard
\noindent
\align center
I'd like to extend my thanks to Professor Ning Wang for both the opportunities
provided by this project and his continued support.
\end_layout
\begin_layout Standard
\noindent
\align center
I would also like to thank Ioannis Selinis and Sweta Anmulwar for their
shared patience and help throughout this year.
\end_layout
\begin_layout Standard
@ -290,11 +299,29 @@ Introduction
\end_layout
\begin_layout Standard
The immersive technology spaces of virtual and augmented reality promise
to change the way we experience many different forms of media.
Spurred in recent years by the release of consumer VR headsets and the
push for handheld AR by phone manufacturers, these experiences are no longer
merely proofs of concept but complete commercial products.
\end_layout
\begin_layout Standard
While some present natural extensions of existing technology, as seen in
VR gaming, others are more closely tied to the new domain.
The power of modern smartphones has allowed augmented reality to be integrated
into applications both as their primary function and as secondary features
such as visualising products in shopping apps like
\noun on
IKEA Place
\noun default
.
\begin_inset Flex TODO Note (Margin)
status open
\begin_layout Plain Layout
add proper intro about XR and new media reference?
\end_layout
\end_inset
@ -303,12 +330,29 @@ add proper intro about XR and new media
\end_layout
\begin_layout Standard
No matter the application, the presented media is of central importance.
Typically this is in the form of pre-recorded meshes of 3D objects, captured
and stored within the application.
This is the case for both of the previously mentioned examples in VR gaming
and AR object reconstruction.
Less commonly seen in commercial products is the live-streaming of 3D renders.
\begin_inset Flex TODO Note (Margin)
status open
\begin_layout Plain Layout
why?
\end_layout
\end_inset
\end_layout
\begin_layout Standard
One such technology for the capture and transmission of 3D video is
\noun on
LiveScan3D
\noun default
\begin_inset CommandInset citation
@ -318,18 +362,37 @@ literal "false"
\end_inset
, a client-server suite of applications utilising the
\noun on
Microsoft Kinect
\noun default
camera to capture a scene in real-time and deliver it to a server for
reconstruction and presentation.
Renders or
\emph on
holograms
\emph default
can then be delivered to a
\emph on
user experience
\emph default
such as an AR or VR client.
\end_layout
\begin_layout Standard
This project aims to extend this suite to support multi-source
\emph on
holoportation
\emph default
(hologram teleportation), receiving multiple scenes concurrently, analogous
to the move from traditional phone calls to group conference calls.
\end_layout
\begin_layout Standard
As the spaces of augmented and virtual reality become more commonplace and
mature, the ability to capture and stream 3D renders over the internet
using consumer-grade hardware has many possible applications and presents
one of the most direct evolutions of traditional video streaming.
\end_layout
\begin_layout Standard
@ -346,7 +409,7 @@ noprefix "false"
.
Both single and multi-view configurations of cameras are shown, the latter
allowing more complete renders of the subject to be acquired.
Both shapes are presented at the
\emph on
user experience
\emph default
@ -402,53 +465,45 @@ name "fig:premise"
\end_layout
\begin_layout Subsection
Objectives
\end_layout
\begin_layout Standard
In order to achieve the goal of multi-source holoportation the following
key objectives must be achieved,
\end_layout
\begin_layout Enumerate
Extend the native viewfinder of the server in order to separately render
each connected source
\end_layout
\begin_layout Enumerate
Redefine the network communications in order to identify each frame of footage
as a particular source
\end_layout
\begin_layout Enumerate
Update the Android AR application to present each source of footage and
facilitate individual control
\end_layout
\begin_layout Subsection
COVID-19
\end_layout
\begin_layout Standard
Conducted throughout the 2019/20 academic year, the project was inevitably
affected by the global COVID-19 pandemic.
From March onwards, there was only access to a single
\noun on
Kinect
\noun default
sensor, significantly hindering the ability to quantitatively evaluate
the implemented multi-source capabilities.
\end_layout
\begin_layout Section
Literature Review
\end_layout
@ -462,7 +517,7 @@ LiveScan3D
\noun on
Microsoft Kinect
\noun default
sensor to capture RGB video with depth information.
While Kinect sensors have proved extremely popular in the computer vision
sector, they do not represent the only method for such 3D reconstruction;
traditional visual hull reconstruction is investigated before identifying
@ -632,7 +687,7 @@ Here 3D conference calling of the type described in the introduction without
space on a screen with all participants rendered within.
Work was undertaken to achieve mutual gaze between participants, a marked
advantage over traditional conference calls where the lack of such aspects
of group interaction makes the experience more impersonal.
Methods of achieving more natural virtual interactions or
\emph on
telepresence
@ -656,7 +711,7 @@ A second version of the camera, v2, was released alongside the
Xbox One
\noun default
in 2013 and presented many improvements over the original.
A higher-quality RGB camera captures 1080p video at up to 30 frames per
second with a wider field of view than the original
\begin_inset CommandInset citation
LatexCommand cite
@ -854,9 +909,9 @@ literal "false"
\end_layout
\begin_layout Description
Mixed A combination of virtual elements with the real world to facilitate
interaction with an augmented reality.
A somewhat broad term owing to its description of a point between augmented
and virtual reality.
An emphasis is typically placed on virtual elements existing coherently
within the real world and interacting in real-time.
@ -947,7 +1002,7 @@ Human interaction with a machine in which the machine retains and manipulates
\begin_layout Standard
Identifying the common dimensions across XR has led to the proposal of various
taxonomies providing insights into how each implementation relates to others
\begin_inset CommandInset citation
LatexCommand cite
key "reality-virtuality-continuum,mr-taxonomy,all-reality"
@ -1007,9 +1062,9 @@ RoomAlive
procams
\emph default
) to construct experiences in any room.
This is presented through games and visual alterations to the user's surroundings.
A strength of the system is its self-contained nature, able to automatically
calibrate the camera arrangements using correspondences found between each
view.
Experience level heuristics are also discussed regarding capturing and
@ -1124,7 +1179,7 @@ Augmented Reality
The advancement of mobile AR, aided by its accessibility without expensive
ancillary hardware, has led this to be a rapidly growing and popular form
of XR.
The introduction of OS-level SDKs in Google's ARCore
\begin_inset CommandInset citation
LatexCommand cite
key "ARCore"
@ -2055,8 +2110,8 @@ extrinsics
\end_layout
\begin_layout Standard
In order to make a composite frame, a calibration process is completed
client-side following instruction by the server.
\end_layout
\begin_layout Standard
@ -2125,7 +2180,7 @@ name "fig:calibration-marker"
\begin_layout Standard
This information can be used to transform points from the camera's coordinate
system to the marker's frame of reference.
As the relative locations of different markers are defined at the server,
a world coordinate system can be defined with its origin at the centre of
these markers.
Typically 4 different markers are placed on the faces around the vertical
@ -2142,13 +2197,13 @@ Kinect
camera now orbits.
As part of this calibration process the server distributes transformations
to each client defining where they sit within this world coordinate space.
Clients can now transform acquired renders from their frame of reference
to the world coordinate system at the point of capture and each point cloud
can be merged coherently.
\end_layout
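\begin_layout Standard
A minimal sketch of this merging step is given below, as a Python illustration
with hypothetical names rather than the suite's own implementation, and
assuming each distributed transformation is expressed as a single 4x4
homogeneous matrix: each client's cloud is lifted into homogeneous coordinates,
mapped into the world frame, and the results are concatenated.
\end_layout
\begin_layout LyX-Code
import numpy as np
\end_layout
\begin_layout LyX-Code
def to_world(points, transform):
\end_layout
\begin_layout LyX-Code
    # lift an Nx3 cloud to homogeneous coordinates, apply a 4x4 rigid transform
\end_layout
\begin_layout LyX-Code
    homogeneous = np.hstack([points, np.ones((points.shape[0], 1))])
\end_layout
\begin_layout LyX-Code
    return (homogeneous @ transform.T)[:, :3]
\end_layout
\begin_layout LyX-Code
def merge_clients(client_clouds, client_transforms):
\end_layout
\begin_layout LyX-Code
    # map each client's cloud into the shared world frame and concatenate
\end_layout
\begin_layout LyX-Code
    merged = [to_world(cloud, client_transforms[source])
\end_layout
\begin_layout LyX-Code
              for source, cloud in client_clouds.items()]
\end_layout
\begin_layout LyX-Code
    return np.vstack(merged)
\end_layout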
\begin_layout Standard
The refinement process is completed server-side by requesting a single frame
from each connected client and using Iterative Closest Points
\begin_inset CommandInset citation
LatexCommand cite
@ -2171,6 +2226,13 @@ OpenGL
\begin_layout Subsection
Buffers and a non-blocking Network
\begin_inset CommandInset label
LatexCommand label
name "subsec:Buffers"
\end_inset
\end_layout
\begin_layout Subsection
@ -2181,6 +2243,40 @@ LiveScan
Hololens
\end_layout
\begin_layout Standard
Developed at the start of 2017, the
\noun on
LiveScan Hololens
\noun default
\begin_inset CommandInset citation
LatexCommand cite
key "livescan3d-hololens"
literal "false"
\end_inset
project provides an example of XR applications for the
\noun on
LiveScan
\noun default
suite.
\end_layout
\begin_layout Standard
Using the
\noun on
Unity
\noun default
game engine, the application allows the reception and reconstruction of
\noun on
LiveScan
\noun default
point clouds in a head-mounted AR context.
\end_layout
\begin_layout Subsection
\noun on
@ -2189,6 +2285,89 @@ LiveScan
Android
\end_layout
\begin_layout Standard
Utilising a key strength of
\noun on
Unity
\noun default
's native cross-platform capabilities,
\begin_inset CommandInset citation
LatexCommand citeauthor
key "livescan3d-android"
literal "false"
\end_inset
\begin_inset CommandInset citation
LatexCommand cite
key "livescan3d-android"
literal "false"
\end_inset
extended the
\noun on
Hololens
\noun default
application to build a handheld AR experience targeted at the Android mobile
operating system.
\end_layout
\begin_layout Standard
This was achieved through the static linking of
\noun on
Google
\noun default
's
\noun on
ARCore
\noun default
library
\begin_inset CommandInset citation
LatexCommand cite
key "arcore-unity"
literal "false"
\end_inset
for
\noun on
Unity
\noun default
and included buffers in line with the developments discussed in section
\begin_inset CommandInset ref
LatexCommand ref
reference "subsec:Buffers"
plural "false"
caps "false"
noprefix "false"
\end_inset
.
The
\noun on
LeanTouch
\noun default
\begin_inset CommandInset citation
LatexCommand cite
key "lean-touch"
literal "false"
\end_inset
\noun on
Unity
\noun default
package was used to allow the live manipulation of displayed holograms,
abstracting away much of the otherwise required boilerplate touch management
code.
\end_layout
\begin_layout Subsection
Evaluation
\end_layout
@ -2199,7 +2378,7 @@ Here an evaluation of the
LiveScan
\noun default
suite is presented.
Its strengths within the space of 3D capture and transmission are identified
while its limitations are also highlighted.
\end_layout
@ -2208,8 +2387,8 @@ The main strength of the
\noun on
LiveScan
\noun default
suite lies in its display-agnostic architecture.
While some of the methods previously reviewed present domain-specific
implementations of holoportation, such as
\begin_inset CommandInset citation
LatexCommand cite
@ -2252,7 +2431,7 @@ reference?
\end_layout
\begin_layout Standard
A limitation of the suite could be identified in its network protocol use:
TCP connections are used throughout the communication pipeline from
\noun on
Kinect
@ -2263,7 +2442,7 @@ Kinect
which limits the speed of transmission.
For these reasons UDP is typically better suited for media streaming,
especially when in real-time.
Investigations could be made into the suitability of its use in the
\noun on
LiveScan3D
\noun default
@ -2325,8 +2504,8 @@ Developments
\end_layout
\begin_layout Standard
The developments made throughout this project to facilitate multi-source
experiences were focused on four aspects of the suite,
\end_layout
\begin_layout Itemize
@ -2368,7 +2547,7 @@ Additional features facilitating the paradigm shift away from a single stream
\begin_deeper
\begin_layout Itemize
This is done by identifying and removing
\emph on
stale sources
\emph default
@ -2722,12 +2901,12 @@ LiveScan3D
\begin_layout Standard
To accomplish this a dictionary was used as the shared variable with each
client's frame referenced by its source ID.
In doing so only one frame per client is kept and each new frame overwrites
the last.
During rendering the dictionary is iterated through and each point cloud
combined.
During combination a client-specific transformation is retrieved from an
instance of the
\noun on
DisplayFrameTransformer
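\begin_layout Standard
The shape of this shared structure is sketched below in Python with hypothetical
names rather than the suite's own code, with a lock standing in for whatever
synchronisation the server uses: the dictionary keeps only the newest frame
per source, and rendering folds a snapshot of the stored frames into one
cloud using a per-source transform.
\end_layout
\begin_layout LyX-Code
import numpy as np
\end_layout
\begin_layout LyX-Code
from threading import Lock
\end_layout
\begin_layout LyX-Code
latest_frames = {}    # source ID -> most recent Nx3 point cloud
\end_layout
\begin_layout LyX-Code
frames_lock = Lock()  # guards access from the network and render threads
\end_layout
\begin_layout LyX-Code
def on_frame_received(source_id, cloud):
\end_layout
\begin_layout LyX-Code
    with frames_lock:  # a newer frame simply overwrites the last one
\end_layout
\begin_layout LyX-Code
        latest_frames[source_id] = cloud
\end_layout
\begin_layout LyX-Code
def build_display_frame(get_transform):
\end_layout
\begin_layout LyX-Code
    with frames_lock:
\end_layout
\begin_layout LyX-Code
        frames = dict(latest_frames)  # snapshot so rendering never blocks receives
\end_layout
\begin_layout LyX-Code
    clouds = []
\end_layout
\begin_layout LyX-Code
    for source_id, cloud in frames.items():
\end_layout
\begin_layout LyX-Code
        t = get_transform(source_id)  # per-source 4x4 transform for this render
\end_layout
\begin_layout LyX-Code
        homogeneous = np.hstack([cloud, np.ones((cloud.shape[0], 1))])
\end_layout
\begin_layout LyX-Code
        clouds.append((homogeneous @ t.T)[:, :3])
\end_layout
\begin_layout LyX-Code
    return np.vstack(clouds) if clouds else np.empty((0, 3))
\end_layout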
@ -2861,12 +3040,11 @@ LiveScan
\end_layout
\begin_layout Standard
Additionally, utility functions to bidirectionally cast between
\noun on
Point3f
\noun default
data structures and the lists of received vertices were written.
\end_layout
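\begin_layout Standard
A sketch of such casting utilities is shown below in Python with hypothetical
names, assuming the received vertices arrive as a flat list of interleaved
x, y and z values; the suite's own implementation operates on its native
types.
\end_layout
\begin_layout LyX-Code
from dataclasses import dataclass
\end_layout
\begin_layout LyX-Code
from typing import List
\end_layout
\begin_layout LyX-Code
@dataclass
\end_layout
\begin_layout LyX-Code
class Point3f:
\end_layout
\begin_layout LyX-Code
    x: float
\end_layout
\begin_layout LyX-Code
    y: float
\end_layout
\begin_layout LyX-Code
    z: float
\end_layout
\begin_layout LyX-Code
def vertices_to_points(vertices: List[float]) -> List[Point3f]:
\end_layout
\begin_layout LyX-Code
    # group a flat [x0, y0, z0, x1, ...] list into Point3f instances
\end_layout
\begin_layout LyX-Code
    return [Point3f(*vertices[i:i + 3]) for i in range(0, len(vertices), 3)]
\end_layout
\begin_layout LyX-Code
def points_to_vertices(points: List[Point3f]) -> List[float]:
\end_layout
\begin_layout LyX-Code
    # flatten Point3f instances back into an interleaved float list
\end_layout
\begin_layout LyX-Code
    return [c for p in points for c in (p.x, p.y, p.z)]
\end_layout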
\begin_layout Standard
@ -2975,7 +3153,7 @@ Transformer
\begin_inset Formula $y$
\end_inset
-axis for each source,
\begin_inset Formula $n$
\end_inset
@ -3618,7 +3796,7 @@ Frame Rate Throttling
\end_layout
\begin_layout Subsection
Cross-Platform Extensions
\end_layout
\begin_layout Subsubsection


@ -411,3 +411,11 @@
year = {2017}
}
@online{lean-touch,
author = {Wilkes, Carlos},
date = {2019-08-01},
title = {Lean Touch},
url = {https://carloswilkes.com/#LeanTouch},
urldate = {2020-04-11}
}