Optical Tracking Glossary
6DOF (Six Degrees Of Freedom)
A rigid body such as a drone or a person’s head can move in six independent ways: it can shift along three axes and rotate about three axes. It’s easiest to imagine this by picturing an airplane’s various movements. In addition to moving left/right, forward/backward, and up/down, it may also rotate to the left or right (“yaw”), tilt up and down (“pitch”), or tilt to one side or the other (“roll”). These are the six degrees of freedom; without accurately tracking all six, we cannot navigate a drone or present the correct VR image to a person wearing a headset as they move through a room.
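As a minimal sketch, the six values can be held in one record. The class name, field names, and units below are illustrative assumptions, not part of any particular tracking system.

```python
from dataclasses import dataclass, fields


@dataclass
class Pose6DOF:
    """Hypothetical container for one 6DOF measurement."""
    # Three translational degrees of freedom (metres, assumed unit).
    x: float      # left/right
    y: float      # forward/backward
    z: float      # up/down
    # Three rotational degrees of freedom (degrees, assumed unit).
    yaw: float    # rotate to the left or right
    pitch: float  # tilt up and down
    roll: float   # tilt to one side or the other


# Example: a headset 1.7 m above the floor, turned 30 degrees to one side.
pose = Pose6DOF(x=0.0, y=0.0, z=1.7, yaw=30.0, pitch=0.0, roll=0.0)
print(pose)
print("degrees of freedom:", len(fields(pose)))
```

A tracker that reports only some of these fields (for example, rotation without position) is not tracking all six degrees of freedom.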
Beacon
Picture a lighthouse at sea, sending out a steady, predictable signal to guide ships; it’s a beacon, constantly announcing its position to all around, and much easier to identify than natural landmarks. An optical tracking system that lacks dedicated beacons struggles to find useful references in the visual field; such systems come up short when road markings are faded, hidden, or damaged, or when rooms aren’t heavily decorated with complex patterns and sharply defined objects. Like a lighthouse, however, a beacon stands out and can be instantly identified, regardless of the background.
DRAM (Dynamic Random-Access Memory)
Computer systems have a hierarchy of memory, with small, fast, efficient, and relatively expensive ‘static’ memory (SRAM) at the heart of the system, and large, slow, less efficient, and cheaper DRAM further out. Because a single image is typically too large to hold in SRAM, optical positioning systems generally either compromise on resolution or pay the power and performance penalty of holding the captured images in DRAM.
DSP (Digital Signal Processor)
When identical mathematical operations have to be performed over a large set of data (for example, smoothing out the visual noise in a captured picture), a general-purpose CPU that processes one piece of data at a time is an inefficient choice. DSPs are specialized silicon designs optimized to perform mathematical operations on many pieces of data in parallel; however, they can be a costly addition to a system and difficult to program.
FPS (Frames per Second)
A moving image is captured as a series of individual pictures that can be played in sequence. Sometimes known as “refresh rate” or “frame rate”, this describes the number of individual pictures taken each second. Optical positioning systems generally run at 60 FPS or lower, so the interval between frames alone introduces about 16.7 milliseconds or more of latency into the MTP value.
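The latency figure follows directly from the frame rate: the time between frames is one second divided by the FPS. A short sketch of that arithmetic, with a hypothetical helper name:

```python
def frame_interval_ms(fps: float) -> float:
    """Return the time between consecutive frames, in milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# Lower frame rates mean a longer wait for the next picture.
for fps in (30, 60, 120):
    print(f"{fps} FPS -> {frame_interval_ms(fps):.1f} ms between frames")
```

This counts only the wait for the next frame; any processing time is added on top of it.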
GPU (Graphics Processing Unit)
Calculating the images for a computer or phone display is intensive and repetitive work. Similar to DSPs, GPUs are specialized silicon designs built to perform these operations efficiently. Many GPUs can also be used for related tasks such as image analysis. As with DSPs, they can be costly additions to a system and difficult to program.
LIDAR (LIght Detection And Ranging)
A range of technologies that determine the distance to an object by bouncing light off it and measuring the time taken for the light to travel to the object and return. Despite the advantages of this approach, LIDAR systems generally have limited resolution, and are often power-hungry.
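The time-of-flight relationship described above can be sketched as follows. The light covers the distance twice (out and back), so the round-trip time is halved. The function name is illustrative, and the sketch ignores real-world effects such as sensor timing error.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # speed of light in a vacuum


def tof_distance_m(round_trip_s: float) -> float:
    """Distance to an object from the round-trip travel time of light."""
    if round_trip_s < 0:
        raise ValueError("round-trip time cannot be negative")
    # Halve the round trip because the light travels out and back.
    return SPEED_OF_LIGHT_M_PER_S * round_trip_s / 2.0


# A round trip of 100 nanoseconds corresponds to roughly 15 metres.
print(f"{tof_distance_m(100e-9):.2f} m")
```

Because light travels about 30 centimetres per nanosecond, resolving small distance differences requires very fine timing.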
MEMS (Micro Electro-Mechanical Systems)
Lithographic techniques are used to make the tiny circuits on silicon chips by projecting a tiny image onto a special coating that hardens with light, and then chemically etching away material not protected by the coating. Similar techniques can be used to make mechanical structures such as tuning forks far smaller than could be made with traditional cutting processes. MEMS techniques are key to the tiny sensors that provide high-speed gyroscope and accelerometer readings to a 6DOF measurement system.
Motion Prediction
One of the easiest ways of mitigating latency in a tracking system is to assume that moving objects have continued moving (or accelerating) at the same rate as they did before the measurements were taken. The system can therefore predict, based on data that is tens of milliseconds out of date, where the object is now. This approach gives good results to a first approximation and can hide the real latency of the system, but it exacerbates the errors when sudden, unexpected movements happen, subtly contributing to VR nausea and adding delay to autonomous system response.
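A minimal one-dimensional sketch of this kind of prediction, assuming constant acceleration over the latency interval. The function name and the example numbers are illustrative only.

```python
def predict_position(position_m: float, velocity_m_s: float,
                     acceleration_m_s2: float, latency_s: float) -> float:
    """Extrapolate a stale position forward by the system latency,
    assuming the object kept the same velocity and acceleration."""
    return (position_m
            + velocity_m_s * latency_s
            + 0.5 * acceleration_m_s2 * latency_s ** 2)


# Last measurement (20 ms old): object at 0 m, moving at 1 m/s, no acceleration.
latency_s = 0.020
predicted = predict_position(0.0, 1.0, 0.0, latency_s)

# If the object really kept moving, the prediction matches its true position.
steady_truth = 1.0 * latency_s
# If it stopped abruptly right after the measurement, it is still at 0 m.
stopped_truth = 0.0

print(f"predicted:            {predicted:.3f} m")
print(f"error if steady:      {abs(predicted - steady_truth):.3f} m")
print(f"error if it stopped:  {abs(predicted - stopped_truth):.3f} m")
```

The second error line shows the cost of the assumption: when the motion changes suddenly, the prediction overshoots by the full extrapolated distance.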
MTP (Motion To Photon latency)
The key metric in the responsiveness of a Virtual Reality system, this is the time taken by the VR system to first respond to any movement of a headset on a person’s head, and then update the image (“photons”) being presented to that person by the headset’s display. Ideal values are below 10 milliseconds, but most systems have longer response times, which can cause nausea by disturbing the user’s VOR.
SLAM (Simultaneous Localization And Mapping)
When no map is available, a visual navigation system has to work out the position of the landmarks (such as beacons) around it, and also calculate its own position based on its observation of those landmarks as it moves. SLAM describes a broad range of techniques by which an accurate map is created and position data extrapolated by making predictions and adjusting the map based on observed errors in those predictions.
Structured Light
This is a technique for projecting a pattern onto an object, from a position separate from the observer, to assist in determining the distance to that object. It is particularly useful when measuring the distance to a featureless object such as a blank wall. As with LIDAR, these systems generally have limited resolution and may be power-hungry.
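One common way to turn the projected pattern into a distance is triangulation: because the projector and camera sit apart, a pattern feature appears shifted in the camera image by an amount that depends on depth. The sketch below assumes an idealized, rectified projector-camera pair; the function name, parameters, and numbers are illustrative, and real systems need calibration and pattern decoding first.

```python
def depth_from_shift_m(focal_length_px: float, baseline_m: float,
                       shift_px: float) -> float:
    """Depth by triangulation for an idealized projector-camera pair.

    focal_length_px: camera focal length in pixels (assumed known)
    baseline_m:      separation between projector and camera, in metres
    shift_px:        observed shift (disparity) of a pattern feature, in pixels
    """
    if shift_px <= 0:
        raise ValueError("shift must be positive; zero means infinitely far")
    return focal_length_px * baseline_m / shift_px


# Nearer surfaces shift the pattern more than distant ones.
for shift in (60.0, 30.0, 15.0):
    depth = depth_from_shift_m(600.0, 0.1, shift)
    print(f"shift {shift:4.0f} px -> depth {depth:.2f} m")
```

The shift shrinks as distance grows, which is one reason depth resolution degrades for far-away objects.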
VOR (Vestibulo-ocular Reflex)
This is a completely natural action of the human body, whereby when somebody turns their head, their eyes automatically move in the opposite direction to keep their gaze fixed. In a VR system, unless tracking is exceptionally accurate and MTP is below 10 milliseconds, the VOR system is “disturbed” by the eyes momentarily seeing the old image at the new gaze point while the system works to create the new image. This mismatch between movement and image can trigger powerful nausea sensations in users, much in the same way that carsickness occurs when a passenger tries to read while a vehicle is moving.