Empowering Ecosystems With Spatial Computing
A brief prototyping journey that explores how spatial computing devices of tomorrow can interface with the computing platforms of today.
employer: n/a
clients: n/a
role: xr rapid prototyper
published: october, 2022
Background
"How can XR naturally empower our daily lives such that we hardly notice it exists?"
For almost three years now, ever since I accidentally asked it aloud to myself, this question has been tunneling deeper and deeper into my psyche. So deep, in fact, that I find myself internally repeating it on an almost daily basis, fueling exploratory branches of thought in the hope that one of them sparks a path to a powerful idea. At this point it has essentially shaped the way I approach new problems when prototyping in the XR space.
One of those branches of thought that this question has provoked revolves around current computing ecosystems and how AR & MR can act as an extension of these already widely adopted tools. Why only try to completely reinvent computing interfaces & platforms when we could also make the current ones even more powerful? Wouldn't this empower people and lead to more organic, frictionless adoption?
Now of course, trying to create a prototype that conceptualizes the MR interactions of tomorrow can seem daunting for a single person like myself, given that there's not yet any public consumer-facing MR hardware that can directly interface with mobile devices at such a core operating level (even the Quest Pro, at the time of writing). However, even the most recent consumer-facing XR hardware and software is still incredibly empowering if you mold it to work as you see fit.
For my latest personal prototype I’ve ended up using Oculus' Unreal Engine branch and Quest 2's hand-tracking 2.0 to create foundational systems that assist me in exploring passthrough MR concepts of the future. By virtually emulating existing touch-screen and computer hardware interfaces in VR, I have been able to start exploring the viability of a plethora of ideas surrounding the XR extension of modern mobile and desktop platforms. Of course current XR systems still have their limitations (as I'm sure a trained eye could spot in my video) but it's still far ahead of what anyone could have imagined not long ago, and it's empowering prototypers like myself to explore the previously unexplorable.
My prototype demo seen above is just the tip of a deep iceberg of ideas that have been churning internally for over a year to explore my formidable question. It's been liberating (albeit challenging) to spend the past few weeks finally starting to tackle a sliver of those ideas from the ground up when I've been able to find spare time.
I fully intend on continuing to explore these ideas and sharing them with fellow XR builders, in hopes that this kindles even further constructive conversations around what we're all really trying to build here, why we're doing it, and how it will ultimately change society for the better.
The Hands
One of the first foundations I needed to build was the hand-tracked interaction system. I wanted to create a far more dynamic grabbing system than the simple proximity, joint-distance, and pose-based approaches seen throughout many hand-tracked systems. Seeing as I wanted to create my initial prototype in a reasonable span of time (approximately three weeks, so that I could time it with Meta Connect 2022), I quickly ditched a physics-based approach and went with a completely dynamic capsule trace system. As incredibly rewarding and challenging as a physical hand system would be to build, it was simply out of scope for what I wanted to achieve.
To strike a balance, as seen in the video above, I'm running a capsule trace for every single joint transform of each hand every frame (or at fixed intervals, for optimization). I then use the hit structs from the resulting traces to determine grip viability by comparing the hit normals of opposing joints against the interacting object. If the system determines a viable grip, it reads pose joint transforms from the interacting object and sequentially interpolates each hand joint to its target transform on the object, resulting in a smooth grab. Not only does this system work for picking objects up, but I've also integrated a touch-screen system on top of it to reliably interact with UMG components in world space, as seen in the full-length demo video.
This is a more layman's overview of what's going on under the hood, but I think it helps reveal a little more of the technical workings without going overboard.
Here's a peek at one of the core functions that drives a portion of the capsule trace hand interaction system:
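If you'd rather read it than squint at a Blueprint graph, here's a rough Unreal C++ approximation of the grip-viability portion of that pass. The real system lives in Blueprint (and relies on the rotated capsule traces mentioned below), so every class, function, and parameter name here is illustrative rather than my actual code:

```cpp
#include "CoreMinimal.h"
#include "Engine/World.h"
#include "Engine/EngineTypes.h"

// Illustrative sketch: sweep a capsule at every hand joint, then call a grip viable
// when two joints touch the same actor from roughly opposing directions.
bool CheckGripViability(UWorld* World, const TArray<FTransform>& JointTransforms,
                        float JointRadius, float JointHalfHeight, AActor*& OutGrabbedActor)
{
    TArray<FHitResult> JointHits;
    FCollisionQueryParams Params;

    for (const FTransform& Joint : JointTransforms)
    {
        FHitResult Hit;
        const FVector Start = Joint.GetLocation();
        // Tiny sweep along the joint's forward axis; the capsule shape itself does the work.
        const FVector End = Start + Joint.GetRotation().GetForwardVector() * 0.1f;
        if (World->SweepSingleByChannel(Hit, Start, End, Joint.GetRotation(), ECC_PhysicsBody,
                                        FCollisionShape::MakeCapsule(JointRadius, JointHalfHeight),
                                        Params))
        {
            JointHits.Add(Hit);
        }
    }

    for (int32 i = 0; i < JointHits.Num(); ++i)
    {
        for (int32 j = i + 1; j < JointHits.Num(); ++j)
        {
            const bool bSameActor = JointHits[i].GetActor() == JointHits[j].GetActor();
            // Opposing impact normals (dot product near -1) imply the joints are pinching the object.
            const float Opposition = FVector::DotProduct(JointHits[i].ImpactNormal,
                                                         JointHits[j].ImpactNormal);
            if (bSameActor && Opposition < -0.5f)
            {
                OutGrabbedActor = JointHits[i].GetActor();
                return true;
            }
        }
    }
    return false;
}
```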
Two not-so-fun facts:
1. Capsule traces in Unreal don't support rotation out of the box, so thanks to Revilian's CapsuleTraceRotation plugin on the marketplace, I was able to get this approach to work.
2. Capsule traces in Unreal have a minimum draw radius/length for debugging, so when I was debugging my capsule traces I thought for the longest time that my fingertip capsule lengths were somehow being calculated incorrectly or that the traces were broken. Turns out the traces were only incorrect visually, not functionally; my fingertip trace lengths were correct the entire time!
The Glasses
I knew before I started working on this that I wanted an eye-catching concept for the fictional AR/MR glasses I would be using. I wanted to portray what the long-term future of MR might look like and just how convenient, streamlined, and natural the ecosystem could be. At the end of the day, my goal is to inspire and drive conversation around ideas and provide insight into what the future could hold for XR, not to abide by current hardware limitations.
Although I did end up modeling and texturing the glasses myself, I’d like to give credit to Taeyeon Kim for their concept of the glasses that I initially found on Design Boom. They may not be an accurate portrayal of modern capabilities, but damn… they’re sleek, and they fit the exact aesthetic that I had in mind.
I didn't want these glasses to be just a hollow centerpiece; I wanted them to have real functionality, something that grounded my concept of natural convenience and mobile device continuity. To bring this to fruition, I ended up creating a foundational Calibration and Bridging system that runs when the glasses are put on.
We've likely all seen AR concepts of the "future" that display a vast array of virtual objects overlaid on top of reality, centralizing the data around us into a reality-obscuring adventure, in what some would consider a dystopian, data-saturated, and distracting future. To stay faithful to my big question, I wanted to create a glasses "operating system" that brought focus to our soil-based world and to the objects the glasses interacted with, not to the glasses themselves. The glasses needed to become an afterthought after putting them on.
The Calibration stage seen in the video plays a subtle sound to cue the user that the glasses have activated, followed by a quick and relatively subtle scan of the environment. The sphere trace scanning system I created radially scans the virtual environment for compatible devices and hands, then initiates their Bridging process, essentially linking them to the glasses to unlock their underlying MR functionality. The Bridging visual cue in my prototype is just a simple mesh- & shader-based loading bar that hovers above compatible devices until Bridging is complete, emitting a subtle chirp when finished and closing back into the device.
Once Bridging is complete, the devices' MR capabilities are available and ready to use.
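For a rough idea of how that scan could be wired up, here's a hedged C++ sketch. It assumes a hypothetical IBridgeable interface on the phone, laptop, and hands, plus a handful of made-up member variables on the glasses actor; in the actual prototype this is all Blueprint and sphere traces rather than the overlap query shown here:

```cpp
#include "CoreMinimal.h"
#include "GameFramework/Actor.h"
#include "Kismet/KismetSystemLibrary.h"

// Illustrative members assumed on the glasses actor:
//   float ScanRadius, ScanSpeed, MaxScanRadius;
//   TArray<TEnumAsByte<EObjectTypeQuery>> DeviceObjectTypes;
//   TSet<AActor*> BridgedDevices;
void AARGlasses::TickCalibrationScan(float DeltaSeconds)
{
    // Grow the scan sphere outward from the glasses each frame.
    ScanRadius = FMath::Min(ScanRadius + ScanSpeed * DeltaSeconds, MaxScanRadius);

    // Radial overlap check standing in for the prototype's sphere traces.
    TArray<AActor*> Found;
    UKismetSystemLibrary::SphereOverlapActors(this, GetActorLocation(), ScanRadius,
                                              DeviceObjectTypes, nullptr,
                                              TArray<AActor*>(), Found);

    for (AActor* Actor : Found)
    {
        // Bridge anything that advertises MR capabilities and hasn't been bridged yet.
        if (Actor->Implements<UBridgeable>() && !BridgedDevices.Contains(Actor))
        {
            BridgedDevices.Add(Actor);
            IBridgeable::Execute_BeginBridging(Actor, this); // spawns the loading-bar cue, chirps on completion
        }
    }
}
```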
The Phone
Because the core ideas I wanted to explore revolve around interacting with a physical phone, I immediately knew I needed to create a virtual version of a mobile phone along with its accompanying core OS interactions to complement the newly unlocked XR interactions. This meant creating the OS widgets and their animations/interactions completely from the ground up in Unreal's UMG, working in tandem with my dynamic hand interaction system so the phone could be operated just as it would be in the physical world. This allowed me to better determine whether my interactions would actually be viable should they ever be built in a hypothetical future.
On top of this, I also wanted to supplement existing OS functionality with XR counterparts, the back-facing lock screen being one of those initial ideas, as seen below.
The initial thinking behind the back-facing lock screen came from the frustration of not wanting to physically pick up my phone to check for important notifications. Not out of laziness, but out of wanting to stay connected to my physical surroundings, my loved ones, or my work. Let's be honest, physically picking up our phones can easily lead us down a rabbit hole of doom scrolling a variety of apps out of pure habit rather than necessity.
Some might say that smartwatches fill this gap, which is a valid stance. From my experience, though, smartwatch notifications can easily become an intrusive distraction: unwanted haptic buzzes build a muscle-memory habit of lifting your wrist, pulling you out of your current focus. Not to mention that prolonged use can lead to phantom vibrations when you're not wearing the watch. As an interaction designer, I want to minimize prolonged physiological and psychological side effects as much as possible.
The lock screen I landed on is a 100% mesh- and shader-based approach, allowing it to feed into my theme of groundedness and connectivity with reality while also providing infinitely crisp readability for future-proofing purposes. While awake, the lock screen briefly displays only the date & time alongside the most recent apps with notifications. After six seconds it winds down, darkening the background and date-time and displaying only the number of unread notifications, then fades away completely when the phone is picked up.
The entire lock screen system was built with a modular mesh-based approach, with accompanying mesh text plus dynamic skeletal mesh morph and shader animations, all controlled programmatically in Blueprint. This not only allowed for an incredibly crisp spatial interface, but also leaves an expandable system should it need to be modified in the future.
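As a rough illustration of the wind-down behavior described above, here's a hedged C++ sketch of the timing and dimming flow. The prototype actually drives this with Blueprint, morph targets, and shader parameters, and the class, member, and helper names below (APhoneActor, LockScreenMID, SetDateTimeVisible, and so on) are purely hypothetical:

```cpp
#include "CoreMinimal.h"
#include "GameFramework/Actor.h"
#include "Materials/MaterialInstanceDynamic.h"

// Illustrative members assumed on the phone actor:
//   UMaterialInstanceDynamic* LockScreenMID;  FTimerHandle WindDownTimer;
//   TArray<FNotification> UnreadNotifications;  (FNotification is a made-up struct)
void APhoneActor::WakeLockScreen()
{
    LockScreenMID->SetScalarParameterValue(TEXT("Dim"), 0.f); // fully lit
    SetDateTimeVisible(true);

    // After six seconds, wind down to the dimmed "unread count only" state.
    GetWorldTimerManager().SetTimer(WindDownTimer, this, &APhoneActor::WindDownLockScreen, 6.f, false);
}

void APhoneActor::WindDownLockScreen()
{
    LockScreenMID->SetScalarParameterValue(TEXT("Dim"), 0.7f); // darken background & date-time
    SetDateTimeVisible(false);
    ShowUnreadCountOnly(UnreadNotifications.Num());
}

void APhoneActor::OnPickedUp()
{
    GetWorldTimerManager().ClearTimer(WindDownTimer);
    LockScreenMID->SetScalarParameterValue(TEXT("Dim"), 1.f); // fade the lock screen out completely
}
```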
Fun fact: All of the clocks that you can spot in my prototype actually work, including the mesh ones! Besides being a seemingly ridiculous attention to detail, this actually helped me keep track of time while I was iterating.
When it came time to create the phone's OS in UMG, I ended up painstakingly building a widget system around atomic design principles for some of the OS' core functionality, with accompanying animations. All of these widgets function just as their real-world counterparts would, requiring interaction with your fingers. The same goes for the gesture-based interactions found in the OS, such as unlocking and opening the App Switcher. None of these interactions are on rails; they're all programmatically controlled through quite a bit of complex transform math found both in the core OS widget and in my hand-tracking interaction system.
This same logic also applied to the Kraken Pro app seen in my demo video. I ended up creating the basic user flow for placing an order in the Kraken Pro app all in UMG, including all of the animations and gesture-based interactions found in the mobile app.
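Under the hood, all of these touch interactions essentially boil down to projecting the index fingertip onto the screen's widget and feeding UMG pointer events. Here's a hedged sketch of that idea using Unreal's UWidgetInteractionComponent; the class and member names are assumptions, not my actual setup:

```cpp
#include "CoreMinimal.h"
#include "InputCoreTypes.h"
#include "Components/WidgetComponent.h"
#include "Components/WidgetInteractionComponent.h"

// Illustrative members assumed on the hand pawn:
//   UWidgetInteractionComponent* FingerInteraction;  bool bPressing;  float PressDepthThreshold;
void AHandPawn::UpdateTouch(const FTransform& IndexTipTransform, UWidgetComponent* Screen)
{
    // Signed distance from the fingertip to the screen plane (negative = pushed into the screen).
    const FVector ToTip = IndexTipTransform.GetLocation() - Screen->GetComponentLocation();
    const float Depth = FVector::DotProduct(ToTip, Screen->GetForwardVector());

    // Keep the interaction component riding on the fingertip so UMG receives correct hover/drag positions.
    FingerInteraction->SetWorldLocation(IndexTipTransform.GetLocation());

    const bool bShouldPress = Depth < PressDepthThreshold; // e.g. roughly half a centimeter past the screen surface
    if (bShouldPress && !bPressing)
    {
        FingerInteraction->PressPointerKey(EKeys::LeftMouseButton);   // touch down
        bPressing = true;
    }
    else if (!bShouldPress && bPressing)
    {
        FingerInteraction->ReleasePointerKey(EKeys::LeftMouseButton); // touch up
        bPressing = false;
    }
}
```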
Spatial Apps
One of the more profound ideas that sparked this journey for me was that one could effortlessly drag apps out from their mobile phone and expand them as a more intuitive form of multi-tasking on the go. I ended up calling this feature Spatial Apps.
In my eyes this was one of the most exciting interactions, and I honestly couldn't wait to start building it out. Because of the aforementioned "emulation" of the mobile phone, it meant I was going to need to figure out how to create a fluid transition that takes a 2D object from a screen into spatial 3D. Out of all of the interactions throughout this prototype, this one definitely ended up being the most challenging, albeit the most rewarding. It involved quite a lot of transform conversion, UMG workarounds, strategic interpolation, real-time skeletal mesh morph animation, and further integration with my hand interaction system to make it work so smoothly.
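At its heart, the transition is a per-frame blend between the app's on-screen slot and the pinching hand. Here's a hedged C++ sketch of how that blend could be driven; the class, member, and parameter names are illustrative, and the prototype does the equivalent in Blueprint:

```cpp
#include "CoreMinimal.h"
#include "Components/WidgetComponent.h"

// Illustrative members assumed on the Spatial App actor:
//   UWidgetComponent* AppWidget;  float PullAlpha, MaxPullDistance, PullInterpSpeed;
void ASpatialApp::UpdatePullOut(const FTransform& ScreenSlotTransform,
                                const FTransform& PinchTransform, float DeltaSeconds)
{
    // How far the pinch has dragged the app away from its slot on the phone screen drives the blend.
    const float PullDistance = FVector::Dist(PinchTransform.GetLocation(), ScreenSlotTransform.GetLocation());
    const float TargetAlpha = FMath::GetMappedRangeValueClamped(
        FVector2D(0.f, MaxPullDistance), FVector2D(0.f, 1.f), PullDistance);
    PullAlpha = FMath::FInterpTo(PullAlpha, TargetAlpha, DeltaSeconds, PullInterpSpeed);

    // Blend position and rotation between the on-screen slot and the pinching hand.
    const FVector NewLocation = FMath::Lerp(ScreenSlotTransform.GetLocation(), PinchTransform.GetLocation(), PullAlpha);
    const FQuat   NewRotation = FQuat::Slerp(ScreenSlotTransform.GetRotation(), PinchTransform.GetRotation(), PullAlpha);
    AppWidget->SetWorldLocationAndRotation(NewLocation, NewRotation);

    // Cross-fade from the flat on-screen look to the full spatial layout as the app leaves the screen.
    AppWidget->GetUserWidgetObject()->SetRenderOpacity(FMath::Lerp(0.6f, 1.f, PullAlpha));
}
```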
The first time I got the app seamlessly flowing out of the screen and into "physical" space with a simple finger swipe, I let out an audible giggle. This interaction may be one of my all-time favorites as it just feels so intuitive and fun.
On top of this base interaction of pulling the app into 3D space, I also wanted the ability to expand the app into a wider format so that I could see more information at a glance. I can't tell you how many times I've abandoned my phone in favor of a desktop/laptop app just so I could better read a chart on TradingView to make a trading decision in Kraken, or compare longer-form notes without switching between apps, or simply see something in a slightly larger format, just to name a few scenarios off the top of my head. This interaction was also somewhat involved, as it meant I needed to create a dynamic version of an app in UMG (TradingView in this case) and then animate it in real time as my hand physically pinched and dragged it into its new form.
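The expand gesture itself can be reduced to mapping the pinch's drag distance onto a widen amount that drives both the frame mesh's morph target and the widget's draw size in lockstep. A hedged sketch, with the "Widen" morph target and the members below being assumptions rather than my actual names:

```cpp
#include "CoreMinimal.h"
#include "Components/SkeletalMeshComponent.h"
#include "Components/WidgetComponent.h"

// Illustrative members assumed on the Spatial App actor:
//   USkeletalMeshComponent* FrameMesh;  UWidgetComponent* AppWidget;
//   FVector ExpandGrabStartLocation;  float MaxExpandDrag, PhoneAspectWidth, WideAspectWidth;
void ASpatialApp::UpdateExpand(const FVector& PinchLocation)
{
    // Map how far the pinch has been dragged onto a 0-1 widen amount.
    const float DragDistance = FVector::Dist(PinchLocation, ExpandGrabStartLocation);
    const float Widen = FMath::GetMappedRangeValueClamped(
        FVector2D(0.f, MaxExpandDrag), FVector2D(0.f, 1.f), DragDistance);

    // Morph the frame mesh and stretch the UMG draw size in lockstep.
    FrameMesh->SetMorphTarget(TEXT("Widen"), Widen);
    const float NewWidth = FMath::Lerp(PhoneAspectWidth, WideAspectWidth, Widen);
    AppWidget->SetDrawSize(FVector2D(NewWidth, AppWidget->GetDrawSize().Y));
}
```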
With regards to future considerations, there are so many interesting avenues to take this Spatial App idea. Questions certainly start to swirl, such as:
- How many Spatial Apps can you spawn?
- Should there be a dedicated Spatial Slot for Apps rather than multiple to prevent unwanted/accidental clutter?
- If you can spawn multiple, but there's only a single slot, how do you manage multiple Spatial Apps?
- How do you interact with the Spatial Apps with your hands?
- Should you be able to interact with the Spatial Apps as a normal touch-based app or in a different pinch-to-select fashion?
- Should the Spatial Apps feature work with all apps or only supported ones?
This interaction has sparked so many questions for me beyond those above, and I'm really excited to explore some of my current ideas surrounding them.
The Laptop
Naturally, we can't talk about mobile platforms without mentioning one of the most widely adopted computing platforms: the laptop. This was unfortunately one of the portions of my prototype that I wasn't able to explore as deeply as I had originally hoped, but I still feel I've built a decent jumping-off point for prototyping further interaction ideas.
I think most people who have lived on their laptops for an extended period of time can relate to the occasional (or, like myself, frequent) desire to have multiple screens. I've even gone as far as looking into clunky fold-out monitor systems for laptops... needless to say, that was quickly abandoned once I realized the ensuing headaches.
Some people will praise various multi-tasking OS features and claim that you can do everything you need to do on a single display. I would kindly tell those people "no, you cannot." As someone who frequently works in Substance Painter, 3ds Max, Photoshop, Figma, Unreal Engine, Visual Studio, Brave, Slack, and more, often simultaneously, I can confirm that a single screen sucks for most creative workflows. If you work in design, 3D, game development, editing, or a similar creative field, chances are you echo similar sentiments. As you can imagine, the idea to create virtual spatial displays was a natural evolution in my quest to empower modern computing hardware with XR.
For the interaction found in the demo, I didn't go as far with full-blown device emulation as I did with the phone, even though it is definitely something I want to explore further (seeing as I have many ideas on how to intertwine the two). I ended up going with a relatively straightforward approach using UMG, global material parameter masking (to keep the virtual displays hidden), and runtime Blueprint animation & scripting. My main focus was the natural interaction of opening a laptop, unlocking it, and having it spread its wings in the form of a curved virtual display on each side of the primary laptop display, paired with a smooth animation of all of the open applications gliding from the single screen to their rightful places on the appropriate virtual displays.
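The masking-and-reveal part of that can be sketched as a single global scalar that the virtual display materials read, eased from 0 to 1 once the laptop unlocks. Here's a hedged C++ approximation; the collection, parameter, struct, and member names are assumptions, and the prototype drives this with runtime Blueprint animation instead:

```cpp
#include "CoreMinimal.h"
#include "GameFramework/Actor.h"
#include "Materials/MaterialParameterCollection.h"
#include "Materials/MaterialParameterCollectionInstance.h"

// Illustrative members assumed on the laptop actor:
//   bool bUnlocked;  float RevealAlpha, RevealSpeed;
//   UMaterialParameterCollection* DisplayRevealCollection;
//   TArray<FSpatialWindow> OpenWindows;  (FSpatialWindow is a made-up struct with a widget and two locations)
void ALaptopActor::TickDisplayReveal(float DeltaSeconds)
{
    if (!bUnlocked) return;

    // Ease a single global scalar toward 1; the virtual display materials read it to unmask themselves.
    RevealAlpha = FMath::FInterpTo(RevealAlpha, 1.f, DeltaSeconds, RevealSpeed);
    UMaterialParameterCollectionInstance* MPC = GetWorld()->GetParameterCollectionInstance(DisplayRevealCollection);
    MPC->SetScalarParameterValue(TEXT("DisplayReveal"), RevealAlpha);

    // The open application windows glide from the laptop screen to their side-display slots over the same alpha.
    for (FSpatialWindow& Window : OpenWindows)
    {
        Window.Widget->SetWorldLocation(
            FMath::Lerp(Window.LaptopScreenLocation, Window.SideDisplayLocation, RevealAlpha));
    }
}
```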
I'll admit, the hand interaction in this portion didn't meet my bar for quality when it came to grabbing and opening the laptop's display. Because the hand needs to attach itself to the laptop's hinged display, and the display's rotation needs to follow the raw hand-tracking data, it got relatively tricky to use my dynamic hand grabbing system here without additional refactoring that probably wouldn't have been worth implementing for such a short interaction in such a short amount of time. This explains the slight clipping seen when initially grabbing the laptop.
Aside from that and the hand-tracking hiccup (which caused the brief hand flicker), I think the rest of the interaction is executed pretty well, and it feels like the natural extension of a laptop monitor that I was hoping for. Just like with Spatial Apps, there are similar, if not more, avenues I could take this idea down. One of the big ones being: "How does this interface with the phone?" That question alone has taken my mind on quite a trip, as I hope it does yours.
Wrapping Up
First of all, I would like to thank you for making it this far into the ramblings of an XR prototyping madman. I glossed over quite a few technical details, and completely left out others, purely for the sake of time and complexity. Even though I spent a little under three weeks building out this project, the function libraries, interfaces, material systems, and various other scripting systems rapidly grew as I iterated on ideas and cleared unforeseen hurdles at the speed of thought. I've had a blast going on this brief, creatively technical, and challenging journey (as I always do). As I mentioned earlier, I fully intend on continuing these explorations, as I truly feel there is so much untapped potential here.
I'm always a proponent of trying to ask the right questions to solve problems. Questions are some of the most powerful tools we as humans have at our disposal to look inwards and unravel entirely new chain reactions of fresh thought. Asking the right question could mean the difference between giving up on something and championing it. So with that said, I'd like to leave you with a few questions, in hopes that they provide you with more than what you came to this page with.
What brought you to this page, and what made you keep reading to this very point?
If you're a builder in the spatial computing industry, why are you here and what kind of change do you want spatial computing to make in the world?
How can we as spatial computing interaction designers avoid creating interactions that leave negative, long-lasting psychological side effects on users?
What are some exciting ideas that you've thought of before but didn't execute on, and why?