Many years ago, when broadband internet was still emerging, I spent an afternoon with a colleague in the company cafeteria trying to imagine a world with unlimited bandwidth and storage.
We imagined that distances would collapse once the location of data no longer mattered. Music and video would be instantly available, and you could call up anything you wanted to hear or see and jump to any point in a pre-recorded piece. Video conferencing would allow teams to work together regardless of location. You could build connectors between data and services, create new views, and from those gain new insights.
Om Malik once proposed that broadband would serve as the railroads of our time. In the same way that the rail system in Europe and the interstate highway system in the US mobilized industry and allowed remote communities to enjoy the output of industrialized centers, ubiquitous broadband would deliver the benefits of unlimited knowledge and global reach to everyone around the world.
At Facebook’s F8 developer conference we heard details of several projects that combine to bring internet to everyone around the world. These include Aquila, a drone that flies at 60,000 feet to extend connectivity to remote regions, and Terragraph and Project Aries, which teach telecom companies how to improve connectivity in crowded urban areas.
We also learned about projects that are being built to explore what can be done with this increased connectivity. The screenshot above is from a virtual reality demo in which we saw two people in different locations share an experience in a 360-degree virtual world, taking a selfie and sharing that “photo” to Mike’s Facebook wall.
While the demo above is fantastic and paints a picture of what a shared virtual space might look like, it requires significant hardware and bandwidth to make happen. As people at Facebook like to say, this journey is only 1% finished.
Oculus Research’s Yaser Sheikh gave a talk on Social Presence in Virtual Reality at the end of the Day 2 keynote (59 min. into the video above) that really brought everything together. Facebook needs better connectivity because it does not want to stop at two avatars playing around in a fixed-image 360 photo.
To create a rich interaction where emotion and empathy can take place, we need to see all the subtle nuances that are expressed in the twitch of a lip or the roll of an eye. This is the unwritten language that we all know, or what the anthropologist Edward Sapir called “an elaborate code.”
There is something visceral about interacting with someone in a shared space. Yaser talked about the experience of his children in Pittsburgh never really knowing his parents in India. To his kids, their grandparents are just “moving images trapped behind a computer screen.” That is not how to build a lifelong relationship. Social VR aims to enable living and growing connections that are not a struggle to maintain.
There are three challenges to gaining a computational understanding of Sapir’s elaborate code.
- Capture – we need the ultimate motion capture of the whole body, in real time and without being intrusive. CMU’s Panoptic Studio is the state of the art but is still much too intrusive.
- Display – we need to transmit signals and animate avatars convincingly. The eyes, mouth, and hair are particular challenges.
- Prediction – we need an understanding of “the vocabulary, the syntax, the morphology, and synchrony of social behavior” in order to write algorithms that help buffer social behaviors to overcome network latency (we all know how disruptive a bad connection can be to a video conference).
Facebook’s ambition is to reverse engineer this elaborate code. While digital video streams a live image captured by a camera, virtual reality will capture, store, and animate a digital representation of someone. Words spoken and gestures shown are broken apart and recombined.
Building a prediction algorithm that delivers convincing results requires it to continually anticipate the state of mind and intent of others. This is much more than transmitting a moving image as bits; it approaches storing a digital representation of what makes someone human. Building a library of all the possible human emotions and how to depict them is the ultimate moonshot, and an appropriate one for a social network whose goal is to connect everyone. Stage one is capture, and my sly take on the new Messaging Bot initiative is that all the conversations taking place on that platform are just step one in a big data harvesting program.
— ian kennedy (@iankennedy) April 14, 2016
Coming full circle, go back to that company cafeteria and imagine with me what the world would be like once Sapir’s elaborate code is cracked. When a digital avatar can be successfully animated, we face some interesting questions.
What royalties do you pay when a movie studio uses the digital representation of George Clooney instead of the actor himself?
Can you simulate a debate between a virtual Donald Trump and a virtual Abraham Lincoln? If so, is it fair game to write about it and quote what Lincoln said?
After Mark Zuckerberg is gone, will his employees consult his virtual avatar for management decisions? Are his avatar’s decisions contractually binding?
Will a digital representation of someone understand humor? Sarcasm? What about a parody of recent events? Will tears well up as it tries not to cry?
The Black Mirror episode “Be Right Back” explores what our relationship might be to a digital avatar (in this case, of a lost loved one) and is well worth a look if you haven’t seen it. While advances in technology can make the barriers of distance and time melt away so that we can keep relationships thriving, we must remember that the virtual world can never replace the real one and that there can never be a substitute for a face-to-face conversation.