3D Visualisation of Medical Scan Images


© Springer Nature Switzerland AG 2020

S. Stübinger et al. (eds.), Lasers in Oral and Maxillofacial Surgery, doi.org/10.1007/978-3-030-29604-9_16

16. Holographic 3D Visualisation of Medical Scan Images

Javid Khan (1)

(1) Holoxica Ltd, CodeBase, Argyle House, Edinburgh, UK

Abstract

Following decades of research and development, three-dimensional (3D) holographic visualisation and display technologies are ready to emerge. A 3D image can be described in terms of capturing the light field of a scene, which can be recreated by a surface that emits rays of light as a function of both intensity and direction. This may be realised via integral imaging or holography or a combination of these. Holographic technology relies on lasers to create diffractive interference patterns that enable encoding of amplitude and phase information within an optical medium. This is in the form of transmission or reflection holograms that act as gratings to deflect light. Suitable illumination of these patterns can form a 3D representation of an object in free space. Printed digital reflection holograms with static 3D images are now sufficiently mature for the depiction of volumetric data from computed tomography, magnetic resonance imaging or ultrasound scans. The physiology of 3D visual image perception is introduced along with tangible benefits of 3D visualisation. Image processing and computer graphics techniques for medical scans are summarised. Next-generation holographic video displays for dynamic visualisation are on the horizon, which are also being designed for medical imaging modalities. Case studies are also presented in facial forensics and surgical planning.

Keywords

3D, Holography, Interference, Diffraction, Display, Light field, Hologram, Visualisation, CT, MRI, Medical imaging, Holoxica

16.1 Introduction

The most advanced medical imaging technologies including CT, MRI and ultrasound have been pioneered over the past few decades by many scientists and engineers, winning Nobel Prizes in 1979 (Cormack and Hounsfield for CT) and 2003 (Lauterbur and Mansfield for MRI). These scanning machines are highly sophisticated and expensive devices, performing 3D scans by taking 2D slices through the body using ionising (X-rays for CT) and non-ionising radiation (magnetic fields/radio waves for MRI) or acoustic energy (USS). The pace of innovation has been tremendous in terms of size, safety, speed, accuracy and image resolution amongst other parameters. However, it is somewhat surprising to find that corresponding advances in three-dimensional displays for the visualisation of this data have not matched this rapid pace of scanner development.

The human visual perceptual system is inherently tuned to interpret the visual field in three spatial dimensions in addition to the temporal aspect. However, the conventional presentation of digital imaging information is largely constrained to just two spatial dimensions. For medical images, surgeons are often required to interpret individual 2D scan slices and build a 3D mental picture from the stack of such slices. This is counter-intuitive, even for a highly trained radiologist, and can lead to inaccurate or inconsistent interpretations. It becomes even more of a challenge to use this information for surgical planning, and explaining pathologies to patients can also be difficult. Most of these issues can be overcome with 3D visualisation techniques, which are seen as a grand challenge for the display industry, an industry that has thus far fallen short of expectations.

16.2 Physiology of Human Visual Perception

The human visual system captures continuous images of the 3D world as a pair of two-dimensional images projected onto the retina at the back of each eye through the cornea, pupil and lens. The retina comprises photoreceptor cells, including light-sensitive rods and colour-sensitive cones, that convert the incident photons into neural impulses that are relayed to the visual cortex of the brain via the optic nerve. The pair of 2D images is processed by the visual cortex, which utilises a combination of binocular and monocular information as so-called depth cues to recreate a 3D scene. The binocular depth cues are:

  • Stereopsis.

  • Vergence.

Stereopsis is responsible for binocular vision due to the lateral displacement of the eyes, in which the similarities and differences between a pair of offset 2D images are used to synthesise depth information about the scene. Vergence is the ability of both eyes to triangulate and focus on an object where the eyes converge on objects nearby or diverge on objects further away.
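The triangulation that underlies vergence can be illustrated numerically. The following is a minimal sketch, assuming symmetric fixation on a point straight ahead and an interpupillary distance of 63 mm; both the value and the function name `vergence_angle_deg` are illustrative assumptions and do not come from the chapter.

```python
import math


def vergence_angle_deg(distance_m, ipd_m=0.063):
    """Angle between the two lines of sight when both eyes fixate a point
    straight ahead at distance_m (symmetric fixation is assumed).

    ipd_m is an assumed interpupillary distance, not a measured value.
    """
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return math.degrees(2.0 * math.atan((ipd_m / 2.0) / distance_m))


# The angle shrinks rapidly with distance, so vergence is mainly a near-field cue.
for d in (0.25, 0.5, 1.0, 5.0, 50.0):
    print(f"{d:6.2f} m -> {vergence_angle_deg(d):7.3f} deg")
```

The eyes converge strongly for nearby objects and are almost parallel for distant ones, which is consistent with the description of convergence and divergence above.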

While binocular depth cues are the dominant means of 3D perception, there are a surprising number of monocular visual cues that enhance the experience [1]:

  • Accommodation.

  • Motion parallax.

  • Linear perspective.

  • Occlusion.

  • Familiar size and relative size.

  • Lighting and shadows.

  • Texture and shading.

Accommodation is the ability of the eye to focus on objects at different distances by changing the focal length of the lens via the ciliary muscles. Flattening the lens increases the focal length, whereas making it more rounded decreases it. Using this mechanism, the eye can typically accommodate objects from about 7 cm to infinity in about 350 ms. Vergence and accommodation go hand-in-hand and any conflicts between the two lead to problems with 3D perception [2–4], which can be the case with conventional stereoscopic display implementations involving 3D glasses and headwear. Motion parallax is another related depth cue associated with the motion of the viewer across a scene where objects in the foreground appear to move more quickly than those in the background.

Figure 16.1 depicts a 2D image with a large number of 3D depth cues. Linear perspective is a geometric effect where parallel lines appear to converge towards a distant vanishing point located on the horizon. Occlusion of some objects by others provides depth cues about the relative distance of those objects, where background objects tend to be hidden or obscured by foreground objects. Familiar size implies that if we know the physical size of an object such as a person or a bicycle, then its perceived size in the visual field tells us how far away it is; linear size perceived in the visual field is inversely proportional to distance. Furthermore, the perceived relative size of two or more objects of known physical size informs the viewer regarding their relative distance. Finally, lighting, shading and texturing all enhance the perception of depth within a scene. Indeed, all of these factors are used to great effect in conveying 3D information in 2D imagery and artwork.

Fig. 16.1 Depth cues including perspective, occlusion and shadows

Apart from the monocular and binocular depth cues presented here, a number of other factors are important for the creation of high-quality 2D as well as 3D display imagery, including persistence of vision, colour, etc., which are tackled in other texts [5, 6]. For colour perception, there are three types of cones within the eye whose peak sensitivities lie at different wavelengths, leading to the red, green and blue (RGB) colour model. Hence it is important to have laser and other light sources (e.g. LED) around these peak wavelengths for colour display systems.

16.2.1 Benefits of 3D Visualisation

A US Air Force Research Laboratory study discusses the relative benefits of 3D visualisation vs. 2D [7]. This is a comprehensive review that summarises the results of over 160 publications describing over 180 experiments spanning 51 years. The research covers human factors psychology, engineering, human–computer interaction, vision science, visualisation and medicine. The study concluded that 3D is overall 75% better than 2D for specific applications including spatial manipulation, finding, identifying or classifying objects [7]. The US Army Research Laboratory compared training with medical holograms against traditional 2D methods [8]. This showed significant improvement in retention and reduced cognitive load for recalling complex 3D anatomy.

These features are important for the interpretation of medical imaging data, where specific benefits of 3D visualisation include:

  • Learning anatomy: increase retention and recall by over 20% [8].

  • Diagnostics: 40% faster interpretation of CT/MRI scans [9].

  • The speed of surgical procedures is increased by 15% [10].

  • Improve accuracy of surgery by up to 20% (incisions, stitching, navigation) [10].

There are clear advantages to viewing 3D information using 3D techniques, as quantified by these studies.

16.3 Light Field Synthesis

The scientific literature suggests that the best means of 3D reconstruction is to recreate the light field of an object or a scene [11, 12]. This can be achieved by holographic and similar approaches such as integral imaging. The concept of a light field describes all the rays of light for a scene as an array of vectors comprising the colour intensity I(λ), direction (θ, φ) and timing (t) of each ray of light, which can be described by the plenoptic function, L:

$$ L=f\left(I\left(\lambda \right),\theta, \varphi, t\right) $$
(16.1)
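To make Eq. (16.1) concrete, a light field can be sampled on a discrete grid of wavelengths and directions. The following is a minimal sketch for a single emitting point at one instant; the `Ray` record, the `sample_light_field` helper and the `toy_radiance` emitter are hypothetical names introduced purely for illustration, and the cosine fall-off is an assumed toy model rather than anything taken from the chapter.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Ray:
    wavelength_nm: float  # lambda
    intensity: float      # I(lambda)
    theta: float          # polar angle in radians
    phi: float            # azimuthal angle in radians
    t: float              # time in seconds


def sample_light_field(radiance, wavelengths_nm, n_theta, n_phi, t=0.0):
    """Sample radiance(lambda, theta, phi, t) over a hemisphere of directions."""
    rays = []
    for lam in wavelengths_nm:
        for i in range(n_theta):
            # Sample at the centre of each polar-angle bin between 0 and 90 degrees.
            theta = (i + 0.5) * (math.pi / 2.0) / n_theta
            for j in range(n_phi):
                phi = j * 2.0 * math.pi / n_phi
                rays.append(Ray(lam, radiance(lam, theta, phi, t), theta, phi, t))
    return rays


def toy_radiance(lam, theta, phi, t):
    # Assumed toy emitter: brightness falls off with cos(theta) at every wavelength.
    return math.cos(theta)


rays = sample_light_field(toy_radiance, (450.0, 532.0, 640.0), n_theta=4, n_phi=8)
print(len(rays), "rays sampled")
```

Even this coarse grid for one point yields 3 × 4 × 8 rays, which hints at the data volumes involved once every point of a display surface must emit its own set of rays.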

16.3.1 Integral Imaging

Integral imaging was first proposed over 100 years ago by Lippmann [13], who won the 1908 Nobel Prize. Integral imaging is a method of 3D image formation based on ray optics, which integrates a set of rays emerging from 2D elemental images recorded from slightly different perspectives across the scene. The arrangement consists of a matrix of lenslets or pinholes located just in front of a similar arrangement of miniature 2D photos. Each tiny lens views the 3D object from a slightly different perspective than a neighbouring lens, resulting in simultaneous reconstruction from an array of discrete perspectives. This enables the viewer to perceive a 3D representation of the object, depicted in Fig. 16.2. Here, the tiny photos are replaced by a display panel, which can recreate the array of smaller images. The 3D image resolution is determined by the size of the lenslet matrix, whereas the angular resolution or parallax depends on the diameter and pitch of the lenslets as well as the pixel density of the display panel.

Fig. 16.2 Integral imaging with a lenslet array
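The trade-off between spatial and angular resolution in an integral imaging display can be sketched with simple arithmetic. The example below assumes a square lenslet grid aligned with the pixel grid and uses the commonly quoted paraxial viewing-angle estimate 2·arctan(p / 2g), where p is the lenslet pitch and g the gap between panel and lenslets. The function name `integral_imaging_budget` and all panel figures are hypothetical and chosen only for illustration.

```python
import math


def integral_imaging_budget(panel_px, pixel_pitch_mm, pixels_per_lenslet, gap_mm):
    """Estimate spatial and angular sampling of a lenslet-array display.

    Assumes each lenslet covers pixels_per_lenslet x pixels_per_lenslet pixels
    and that the viewing angle follows the paraxial estimate 2*atan(p / (2*g)).
    """
    width_px, height_px = panel_px
    lenslet_pitch_mm = pixels_per_lenslet * pixel_pitch_mm
    # One lenslet corresponds to one spatial sample of the 3D image.
    lenslets = (width_px // pixels_per_lenslet, height_px // pixels_per_lenslet)
    viewing_angle_deg = math.degrees(
        2.0 * math.atan(lenslet_pitch_mm / (2.0 * gap_mm))
    )
    return {
        "lenslets": lenslets,
        "views_per_axis": pixels_per_lenslet,
        "lenslet_pitch_mm": lenslet_pitch_mm,
        "viewing_angle_deg": viewing_angle_deg,
        "angular_step_deg": viewing_angle_deg / pixels_per_lenslet,
    }


# Hypothetical panel: 3840 x 2160 pixels at 0.05 mm pitch, 20 pixels per lenslet.
budget = integral_imaging_budget((3840, 2160), 0.05, 20, 3.0)
for key, value in budget.items():
    print(key, value)
```

Under these assumptions, every pixel spent on an additional view direction is a pixel no longer available for spatial detail, which is one reason why panel pixel density matters so much for this approach.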

Although the idea has been around for a long time, integral imaging has proven to be difficult to implement in practice. Capturing the 3D image and transferring it to a display panel is not trivial, involving lenslet arrays and camera sensors. Additional optical or image processing is required to convert between pseudoscopic and orthoscopic and virtual and real images. Pseudoscopic images are essentially inside-out (phase reversed) requiring conversion to orthoscopic viewpoint for presentation to the viewer. A virtual image is behind the screen, whereas a real image is in front of it, i.e. in mid-air. These issues have been researched extensively in an attempt to improve the performance of the technology using spatial and temporal multiplexing [14].

An increasingly popular approach is computational integral imaging that borrows techniques from 3D computer graphics. The advantage is that the object or scene can be generated completely synthetically from 3D model descriptions by the computer. The rendering of multiple 2D views of the 3D scene can be done using a ‘virtual’ camera that can pan around the object in any direction. These methods can leverage commodity graphics processing units (GPUs) to render the 3D views in hardware, described in Sect. 16.4.
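The 'virtual' camera idea can be sketched by generating the viewpoints from which the 2D views would be rendered. The following minimal example places cameras on a horizontal arc around an object at the origin, each looking at the object. The function name `camera_arc`, the arc geometry and the parameter values are assumptions made for illustration; the rendering step itself, which would normally run on the GPU, is not shown.

```python
import math


def camera_arc(n_views, radius, arc_deg):
    """Return (position, look_direction) pairs for cameras on a horizontal arc.

    The object is assumed to sit at the origin, the arc is centred on the
    +z axis and every camera looks straight at the origin.
    """
    views = []
    for k in range(n_views):
        # Spread the cameras evenly from -arc_deg/2 to +arc_deg/2.
        frac = 0.5 if n_views == 1 else k / (n_views - 1)
        angle = math.radians((frac - 0.5) * arc_deg)
        pos = (radius * math.sin(angle), 0.0, radius * math.cos(angle))
        look = tuple(-c / radius for c in pos)  # unit vector towards the origin
        views.append((pos, look))
    return views


views = camera_arc(n_views=5, radius=2.0, arc_deg=60.0)
for pos, look in views:
    print(tuple(round(c, 3) for c in pos), tuple(round(c, 3) for c in look))
```

Each returned pair would be handed to a renderer to produce one 2D perspective view; a full-parallax system would extend the same idea to a two-dimensional grid of viewpoints.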

16.3.2 Holography and Laser Interference

This section presents the basic principles and theoretical foundations of holograms and holography. The term hologram is derived from the Greek, meaning ‘whole image’. Holograms are able to encode both amplitude and phase information about the 3D object or scene. This information is encoded such that the light field can be reconstructed to give a true 3D representation corresponding to the original subject. This is in contrast with a photograph, which only stores amplitude information, thus yielding a 2D representation.

The foundations of holography were laid by Dennis Gabor, a Hungarian scientist working in the UK, in a patent [15] and a series of papers written between 1948 and 1951 that were aimed at microscopy [16–18]. These introduced the notion of storing 3D information as a diffractive interference pattern that can subsequently be reconstructed through illumination. Holography remained somewhat obscure owing to its dependence on special coherent light sources. However, things changed dramatically after the invention of the laser in 1960 [19], which generates coherent light, leading to a revival of the field. Leith and Upatnieks reached a key milestone with the off-axis transmission hologram [20]. At about the same time, Denisyuk pioneered the reflection hologram [21] viewable using ordinary light rather than lasers. Dennis Gabor was awarded the Physics Nobel Prize in 1971.

The majority of holographic applications today can be found in security and authentication for credit cards, banknotes, passports or product packaging. These simple holograms are mass-produced using industrial-scale machines similar to printing presses. This is a billion-dollar industry, but it is not currently aimed at general 3D imaging as such.

16.3.2.1 Transmission and Reflection Holograms

Static holograms are formed by recording interference patterns in a photosensitive holographic material. Holograms use spatial variations in the absorption and/or the optical thickness to alter the amplitude and phase of an incident wave front. The modulation is due to interference patterns derived from the interaction of two laser beams: a reference beam and a beam reflected off an object.

There are two main types of holograms: transmission and reflection. Transmission holograms are formed by exposing the holographic material from the same side, whereas reflection holograms expose it from opposite sides. Transmission holograms are the simplest form of hologram (Fig. 16.3): a collimated laser beam is split in two, with one path illuminating the object. The light scattered from the object interferes with the other path, the reference beam, and the resulting pattern exposes the photographic plate. Details of practical recording considerations for transmission and reflection holograms (not shown) can be found in [22].

Fig. 16.3 Recording a transmission hologram

Holographic recording media are usually based on organic photosensitive materials such as dichromated gelatine (DCG) or silver halide, similar to traditional photographic film [23]; modern materials are based on photopolymers with organic dyes [24]. DCG and silver halide require further chemical processing (bleaching and developing) after exposure, whereas the latest photopolymers use the light itself to develop the photosensitive material. Additionally, in some cases, bleaching requires illumination with UV radiation.

A holographic image is reconstructed by transmission or reflection of light through this recording medium. The replay of a transmission hologram is shown in Fig. 16.4, where the reference beam is replaced with a laser illumination beam.

Fig. 16.4 Reconstructing a transmission hologram

Transmission holograms behave rather like gratings, requiring coherent illumination through the hologram, based on the principles of Fresnel diffraction from Fourier optics theory [25], Fig. 16.5 (left). The average spatial frequency of the interference fringes (the reciprocal of the fringe spacing) within a transmission hologram is given by the grating equation:

$$ \Lambda =\frac{\sin \theta }{\lambda } $$
(16.2)

where θ is the angle between the object and the reference beam and λ is the wavelength of the recording laser. Reflection holograms can be regarded as tiny wavelength-selective micro-mirrors, requiring ordinary non-coherent illumination on the front surface, based on the principles of Bragg’s diffraction theory [26], Fig. 16.5 (right). The average spatial frequency of the Bragg grating structures within a reflection hologram is given by:

$$ \Lambda =\frac{2\;\sin \theta }{\lambda } $$
(16.3)

which is simply twice the spatial frequency of the transmission case. Reflection holograms therefore require higher-resolution holographic media for recording, and they are also highly angular and wavelength sensitive, while transmission holograms generally diffract many wavelengths. For replay and viewing the 3D image, transmission holograms require laser (coherent) illumination, whereas reflection holograms can use incoherent ordinary white light sources including LEDs.
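Equations (16.2) and (16.3) can be evaluated directly to obtain a feel for the recording resolution involved. The sketch below assumes a green recording laser at 532 nm and a beam angle of 30°; both values and the function names are illustrative assumptions, not figures from the chapter.

```python
import math


def transmission_lines_per_mm(wavelength_nm, theta_deg):
    """Eq. (16.2): spatial frequency sin(theta) / lambda, in lines per mm."""
    wavelength_mm = wavelength_nm * 1e-6
    return math.sin(math.radians(theta_deg)) / wavelength_mm


def reflection_lines_per_mm(wavelength_nm, theta_deg):
    """Eq. (16.3): spatial frequency 2 sin(theta) / lambda, in lines per mm."""
    return 2.0 * transmission_lines_per_mm(wavelength_nm, theta_deg)


# Assumed example: 532 nm recording laser, 30 degree beam angle.
t_freq = transmission_lines_per_mm(532.0, 30.0)
r_freq = reflection_lines_per_mm(532.0, 30.0)
print(f"transmission: {t_freq:.0f} lines/mm")
print(f"reflection:   {r_freq:.0f} lines/mm")
```

For these assumed values the fringe frequency is already close to a thousand lines per millimetre in the transmission case and double that in the reflection case; larger angles or shorter wavelengths push the figures higher, which illustrates why reflection holograms place greater demands on the recording medium.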

Fig. 16.5 Diffraction through transmission (left) and reflection (right) holograms

The main characteristics of transmission and reflection holograms are shown in the table below:

| Hologram type           | Transmission                                  | Reflection                 |
|-------------------------|-----------------------------------------------|----------------------------|
| Diffraction             | Fresnel                                       | Bragg                      |
| Recording lasers        | Same side                                     | Opposite sides             |
| Resolution (lines/mm)   | ~2000                                         | >5000                      |
| Transmission/reflection | Insensitive to λ and θ                        | Sensitive to λ and θ       |
| Illumination            | Laser (coherent)                              | Incoherent white light     |
| Applications            | Analogue mastering; holographic video display | Analogue/digital holograms |
