As some of you may already know, Tribe of Pan is currently developing The Choice, a VR documentary experience featuring volumetric interviews of people speaking about their personal experiences with abortion. As one could imagine, it is a very emotionally charged topic. In order for our interview subjects to be treated with respect and our viewers to be able to feel real empathy, we had to develop our own proprietary capture system, the S3DD. We designed this new system to address three key requirements.
1. Six Degrees of Freedom & Bi-Directional Presence
Immersive headset technology falls into two classes of devices: 3 degrees of freedom (3 dof), the ability to look in any direction, and 6 degrees of freedom (6 dof), the ability to also move in any direction. Being able to move freely, even just to lean forward, is what makes virtual reality so magical.
The movement of our body in real space, and seeing the virtual world move the same way, is what creates the sensation of presence that makes you truly believe you are in a different reality. This is also true of the objects and people you see in this virtual world. The animated GIF above shows an example of a 3 degrees of freedom video versus a 6 degrees of freedom one. In the 3 degrees of freedom example, when the user leans forward, the subject they are looking at stays fixed in their vision. This doesn't feel like a real person in front of you, but instead like a screen attached to your face.
The example below it is of a 6 dof volumetric video. When the user moves naturally, the virtual subject reacts correctly, as if they were a fixed object in the same space. This creates what we call Bi-Directional Presence, where you not only feel present in the virtual space, but also feel like the virtual subject is present there with you. This effect is important for the user to accept that what they are looking at isn't just a video illusion, but a distinct and different entity that they can relate to and empathize with.
This of course cannot be achieved with regular video. Even a flat polygon with a video playing on it would just feel like a cardboard cut-out talking to you. We also need to record the shape of our subjects with volumetric video.
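The difference between the two device classes can be sketched with a little geometry. This is a minimal illustration, not part of the S3DD system: the function names, the top-down 2D layout and the example distances are all assumptions made for this sketch. It computes the horizontal angle at which a fixed subject appears, once ignoring head translation (3 dof) and once using it (6 dof).

```python
import math

def apparent_angle_deg(head_x, head_z, subject_x, subject_z):
    """Horizontal angle (degrees) between the viewer's forward axis (+z)
    and the direction to the subject, seen from the given head position."""
    return math.degrees(math.atan2(subject_x - head_x, subject_z - head_z))

def view_3dof(head_x, head_z, subject_x, subject_z):
    # 3 dof: only rotation is tracked, so head translation is discarded
    # and the scene is always rendered from the origin.
    return apparent_angle_deg(0.0, 0.0, subject_x, subject_z)

def view_6dof(head_x, head_z, subject_x, subject_z):
    # 6 dof: the tracked head position is used, so the subject behaves
    # like an object fixed in the room.
    return apparent_angle_deg(head_x, head_z, subject_x, subject_z)

# Hypothetical scene: subject 2 m ahead and 0.5 m to the right.
subject = (0.5, 2.0)
upright = (0.0, 0.0)
leaning = (0.0, 0.5)  # viewer leans 0.5 m forward

for name, view in (("3 dof", view_3dof), ("6 dof", view_6dof)):
    before = view(*upright, *subject)
    after = view(*leaning, *subject)
    print(f"{name}: {before:.1f} deg upright, {after:.1f} deg leaning forward")
```

In the 3 dof case the angle does not change when the viewer leans in, which is the "screen attached to your face" effect. In the 6 dof case the subject shifts further off-axis as the viewer approaches, as a real person would.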
2. Be Compact & Not Overwhelming
Many volumetric video capture systems currently on the market require specialized recording studios with upwards of a hundred cameras to capture all sides of the subject. These individual images are then processed, often with photogrammetry, into a series of textured meshes. These facilities can cost thousands to tens of thousands of dollars to rent, and the person being recorded is enclosed on all sides.
This may be fine for experienced performers, but asking real people to be comfortable and act naturally in such an imposing situation can be nearly impossible. Our potential subjects are also scattered around North America, and there are a limited number of cities that even have such dedicated volumetric studios.
It was key that our system be compact enough that we can travel anywhere in the world with it and set up in environments like small portrait studios, conference rooms or even a bedroom (like in The Choice Kickstarter video). This allows us to be where our subject feels the most comfortable. We are focused entirely on what people are saying, and as such, the heavy infrastructure of multiple cameras to shoot the backs of people's heads isn't required.
3. Feel Alive & Not Uncanny or 'Waxy'
One shortcoming of even the most detailed and high resolution volumetric capture solutions is that the resulting virtual people feel uncanny, like they are made of wax or clay.
This is because, while they have a three dimensional shape, both of your eyes are seeing the same texture and color. Real things, especially faces, do not look like that at all when we see them with our natural vision.
Reflections in the skin, highlights in the eyes, strands of hair: all of these things look distinctly different to each eye, and these differences, or disparities, help tell our brain what we are looking at.
Flipping between the left and right images in this animated image shows how different surfaces appear differently to each eye. For example, the soft cotton shirt is nearly identical, while the reflections in the eyes or highlights in the hair are subtly different. This difference is important for human faces to look natural and alive. It's a similar reason why many movies converted from 2D to 3D can feel unnatural to look at, whereas properly filmed 3D cinematography, like in IMAX, can feel so life-like.
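Why a matte shirt looks the same to both eyes while a highlight does not can be sketched with a simple shading model. The example below uses the standard Lambert (diffuse) and Blinn-Phong (specular) formulas purely as an illustration; it is not how the S3DD works, and the eye separation, light direction and shininess value are assumptions chosen for the sketch.

```python
import math

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def diffuse(normal, light_dir):
    # Lambert term: depends only on the surface and the light,
    # so it is identical for both eyes.
    return max(0.0, dot(normal, light_dir))

def specular(point, normal, light_dir, eye, shininess=1000.0):
    # Blinn-Phong term: depends on the direction to the viewer,
    # so each eye sees a different highlight intensity.
    view_dir = normalize(tuple(e - p for e, p in zip(eye, point)))
    half = normalize(tuple(l + v for l, v in zip(light_dir, view_dir)))
    return max(0.0, dot(normal, half)) ** shininess

# Hypothetical setup: a surface point 1 m in front of the viewer,
# facing back toward them, with eyes 64 mm apart (assumed value).
point = (0.0, 0.0, 1.0)
normal = (0.0, 0.0, -1.0)
left_eye = (-0.032, 0.0, 0.0)
right_eye = (0.032, 0.0, 0.0)
# Light placed so its mirror reflection points exactly at the left eye.
light_dir = normalize((0.032, 0.0, -1.0))

matte = diffuse(normal, light_dir)
glossy_left = specular(point, normal, light_dir, left_eye)
glossy_right = specular(point, normal, light_dir, right_eye)

print(f"matte surface, either eye: {matte:.4f}")
print(f"highlight, left eye:  {glossy_left:.4f}")
print(f"highlight, right eye: {glossy_right:.4f}")
```

The diffuse value has no dependence on the eye position at all, while the specular value differs between the two eyes even though they are only a few centimeters apart. A single texture shared by both eyes cannot reproduce that difference.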
Drawing on our own background in 3D cinematography, we have developed a proprietary technology that combines the processes of volumetric video and stereographic cinematography to produce the most life-like feeling volumetric video capture possible today.