Through iOS 16 APIs, Apple Lays the Foundation For Mixed Reality Development

Without saying that word, Apple is preparing developers to build apps for its much-awaited AR/VR device

Anupam Chugh
Better Programming


Source: Apple

This article was originally posted on my Substack, iOSDevie

When Apple’s WWDC 2022 keynote kicked off, the world was eagerly awaiting announcements about the much-talked-about mixed reality headset.

At the very least, developers expected a toolkit analogous to the DTK from the Apple silicon transition, especially after multiple rumors of a potential realityOS announcement. But yet another WWDC folded without any mention of Apple’s most ambitious project.

What’s even more puzzling? The RealityKit and SceneKit frameworks barely got any updates this year. Instead, we were greeted with M2-powered Macs, Stage Manager on iPadOS, a revamped iOS, and a significant upgrade to CarPlay.

The mixed reality headset, once expected to launch in 2020 and later pushed back to 2023, has now slipped further, with a 2024 release on the cards.

In Apple’s defense, the cautious progress is understandable. For a nascent product to garner mass adoption, it needs tighter integration within Apple’s ecosystem, and developers need to get excited about building for the metaverse.

Thanks to their strides in Continuity, Apple’s ecosystem is more unified today than ever before. I’d be remiss not to mention the new iOS 16 feature that lets you use your iPhone as a webcam for the Mac (it looks like a beta test of how the Apple headset might operate alongside the iPhone).

At the same time, despite no news of realityOS development, the iPhone maker has been making significant enhancements to its APIs and frameworks that prepare developers for a mixed reality future.

Let’s look at some of the new APIs announced during WWDC 2022. A few of them are well known and received plenty of limelight during the event. However, the role these APIs could play in AR/VR development wasn’t nearly as visible.

Live Text API and Upgraded PDFKit for Scanning Text from Media

With iOS 15, Apple introduced the Live Text feature to extract text from images. In iOS 16, they’ve taken things up a notch by releasing a Live Text API that grabs text from images and video frames. Shipped as part of the VisionKit framework, the DataScannerViewController class lets you configure various scanning parameters. Under the hood, the Live Text API uses VNRecognizeTextRequest to detect text.
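
Here’s a minimal sketch of the scanning flow, assuming iOS 16 and a device where DataScannerViewController reports support (the view controller and method names are placeholders apart from the VisionKit types themselves):

```swift
import UIKit
import VisionKit

// A bare-bones Live Text scanner: present the camera, scan for text,
// and react when the user taps a recognized item.
final class TextScannerViewController: UIViewController, DataScannerViewControllerDelegate {

    func presentScanner() {
        // Data scanning needs a supported device and camera permission.
        guard DataScannerViewController.isSupported,
              DataScannerViewController.isAvailable else { return }

        let scanner = DataScannerViewController(
            recognizedDataTypes: [.text()],   // .barcode() is also available
            qualityLevel: .balanced,
            recognizesMultipleItems: true,
            isHighlightingEnabled: true
        )
        scanner.delegate = self
        present(scanner, animated: true) {
            try? scanner.startScanning()
        }
    }

    // Called when the user taps a recognized item in the live camera feed.
    func dataScanner(_ dataScanner: DataScannerViewController, didTapOn item: RecognizedItem) {
        if case let .text(text) = item {
            print("Tapped text: \(text.transcript)")
        }
    }
}
```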

At first glance, the Live Text API seems like Google Lens on steroids. However, just think of the possibilities it’ll open up when Apple’s next big gadget sits in front of your eyes. For starters, imagine turning your head to quickly extract information with your gaze. Yup, that was already possible in iOS 15 through AirPods spatial awareness, whose head tracking leverages CMHeadphoneMotionManager. Now throw iOS 16’s new personalized spatial audio into the mix, and I can already see VR mechanics unfolding.
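
For context, here’s roughly what reading head orientation from AirPods looks like with CMHeadphoneMotionManager, as a minimal sketch independent of any headset-specific API:

```swift
import CoreMotion

// Stream head-motion updates from supported AirPods.
let headphoneManager = CMHeadphoneMotionManager()

if headphoneManager.isDeviceMotionAvailable {
    headphoneManager.startDeviceMotionUpdates(to: .main) { motion, _ in
        guard let motion else { return }
        // Yaw/pitch/roll of the listener's head: handy for gaze-style interactions.
        print("Yaw: \(motion.attitude.yaw), pitch: \(motion.attitude.pitch)")
    }
}
```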

Likewise, two enhancements in the PDFKit framework — the ability to parse text fields and convert document pages into images — will matter a lot in building a rich AR lens experience.
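
The sketch below illustrates both ideas using PDFKit calls that predate iOS 16 (widgetFieldType and thumbnail(of:for:)), so treat it as the general shape of the workflow rather than the exact new API surface:

```swift
import PDFKit
import UIKit

// Read the text-field annotations on a page, then render the page as an image.
func inspect(document: PDFDocument) {
    guard let page = document.page(at: 0) else { return }

    // 1. Walk the page's form annotations and pull out the text fields.
    let textFields = page.annotations.filter { $0.widgetFieldType == .text }
    for field in textFields {
        print("\(field.fieldName ?? "unnamed"): \(field.widgetStringValue ?? "")")
    }

    // 2. Rasterize the page into a UIImage for further (AR) processing.
    let image = page.thumbnail(of: CGSize(width: 1024, height: 1448), for: .mediaBox)
    print("Rendered page image of size \(image.size)")
}
```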

Source: WWDC 2022 video — PDFKit

To ensure Apple’s mixed reality device isn’t just a smartwatch on your face, it’s important to provide a toolset for interacting with text, images, and graphics.

With the introduction of two powerful image recognition APIs, I think the iPhone maker is on the right path. A path that’ll lead to AR/VR apps with rich interactive interfaces.

Dictation and Better Speech Recognition

Forget text and images for a moment: iOS 16 has also revamped the Dictation feature by letting users seamlessly switch between voice and touch.

Say you’re walking down a hall and want to quickly edit a text message on your phone. In iOS 16, you can simply use your voice to modify the text.

Want more? The Speech framework got a small enhancement: the ability to toggle automatic punctuation in SFSpeechRecognitionRequest through addsPunctuation. I’m optimistic this will give rise to richer communication apps, as it has already found its way into live captions for FaceTime calls.
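
Here’s a minimal sketch of the toggle in action, assuming speech-recognition permission has been granted and audio buffers are being fed to the request from an audio engine tap:

```swift
import Speech

let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US"))
let request = SFSpeechAudioBufferRecognitionRequest()
request.addsPunctuation = true   // new in iOS 16: punctuation is inserted automatically

// Keep a reference to the task if you need to cancel it later.
let task = recognizer?.recognitionTask(with: request) { result, _ in
    if let result {
        print(result.bestTranscription.formattedString)
    }
}
```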

From a mixed reality perspective, these are huge changes. Using voice to enter text would reduce our dependency on keyboards in the VR world. Apple’s also making it easy to integrate Siri into our apps using the new App Intents framework.
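
As a minimal sketch (the intent name, phrase, and behavior are hypothetical), exposing an action to Siri with App Intents looks roughly like this:

```swift
import AppIntents

// An app action that Siri and Shortcuts can invoke directly.
struct StartRoomScanIntent: AppIntent {
    static var title: LocalizedStringResource = "Start Room Scan"

    func perform() async throws -> some IntentResult {
        // Trigger whatever the app should do when Siri runs this intent.
        return .result()
    }
}

// Registers a spoken phrase so the intent works with Siri out of the box.
struct ScannerShortcuts: AppShortcutsProvider {
    static var appShortcuts: [AppShortcut] {
        AppShortcut(intent: StartRoomScanIntent(),
                    phrases: ["Start a scan in \(.applicationName)"])
    }
}
```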

Finer Control Over User Interface

In iOS 16, plenty of new UI controls were announced — predominantly in SwiftUI — the declarative framework that promises to be the one-stop solution for building apps across all Apple platforms.

Amid the bevy of SwiftUI features, the changes to the WidgetKit framework piqued my interest the most. With iOS 16, developers can now build widgets for the lock screen, and the same code can power complications on Apple Watch faces. We can also create Live Activities with WidgetKit to provide real-time updates to the user.
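
As a minimal sketch (the widget name and content are placeholders), here’s what opting a widget into the new lock-screen accessory families looks like:

```swift
import WidgetKit
import SwiftUI

struct GlanceEntry: TimelineEntry {
    let date: Date
}

struct GlanceProvider: TimelineProvider {
    func placeholder(in context: Context) -> GlanceEntry { GlanceEntry(date: .now) }
    func getSnapshot(in context: Context, completion: @escaping (GlanceEntry) -> Void) {
        completion(GlanceEntry(date: .now))
    }
    func getTimeline(in context: Context, completion: @escaping (Timeline<GlanceEntry>) -> Void) {
        completion(Timeline(entries: [GlanceEntry(date: .now)], policy: .atEnd))
    }
}

struct GlanceView: View {
    let entry: GlanceEntry
    var body: some View {
        Text(entry.date, style: .time)
    }
}

struct GlanceWidget: Widget {
    var body: some WidgetConfiguration {
        StaticConfiguration(kind: "GlanceWidget", provider: GlanceProvider()) { entry in
            GlanceView(entry: entry)
        }
        .configurationDisplayName("Glance")
        .description("Quick info, right on the lock screen.")
        // The accessory families are new in iOS 16 and shared with watch complications.
        .supportedFamilies([.systemSmall, .accessoryCircular, .accessoryRectangular, .accessoryInline])
    }
}
```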

I truly feel widgets are gonna be the future of apps, at least in the AR/VR space, as users will want to consume information and browse content without opening an app.

Alongside WidgetKit, we’ve got the new WeatherKit framework to help keep your apps and widgets updated with the latest weather information.
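
Here’s a minimal sketch of pulling current conditions with WeatherKit (it assumes the WeatherKit capability is enabled for the app; the coordinates are just an example):

```swift
import WeatherKit
import CoreLocation

// Fetch the current temperature for a location and format it for display.
func currentTemperature() async throws -> String {
    let location = CLLocation(latitude: 37.3349, longitude: -122.0090)
    let weather = try await WeatherService.shared.weather(for: location)
    return weather.currentWeather.temperature.formatted()
}
```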

Beyond the frameworks, SwiftUI also gifted us a slew of small controls, like Gauge, which integrates neatly with widgets. Then there’s SpatialTapGesture, a gesture for tracking the tap location within a SwiftUI view.
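
Here’s a small sketch combining the two (the labels and values are arbitrary):

```swift
import SwiftUI

struct ControlsDemo: View {
    @State private var progress = 0.4
    @State private var lastTap: CGPoint = .zero

    var body: some View {
        VStack(spacing: 24) {
            // A gauge styled like the new circular accessory widgets.
            Gauge(value: progress) {
                Text("Battery")
            } currentValueLabel: {
                Text(progress, format: .percent)
            }
            .gaugeStyle(.accessoryCircular)

            // SpatialTapGesture reports exactly where the view was tapped.
            Text("Last tap: \(Int(lastTap.x)), \(Int(lastTap.y))")
                .frame(maxWidth: .infinity, minHeight: 120)
                .background(.thinMaterial)
                .gesture(
                    SpatialTapGesture().onEnded { value in
                        lastTap = value.location
                    }
                )
        }
        .padding()
    }
}
```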

I particularly loved the ImageRenderer API, which lets you convert SwiftUI views into images. Coupled with the Transferable protocol, dragging and dropping media elements across apps is about to get a lot simpler, even more so now that we have a native share sheet control in SwiftUI. Here’s a look at how drag and drop makes it much easier to cut out subjects from photos and share them in other apps:

Source: Apple
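
Here’s a minimal sketch of the ImageRenderer plus share-sheet combo (the badge view is just a stand-in for whatever you want to render):

```swift
import SwiftUI
import UIKit

// A tiny view we'll render to an image and share.
struct BadgeView: View {
    var body: some View {
        Label("Hello, AR", systemImage: "arkit")
            .padding()
            .background(.blue, in: Capsule())
            .foregroundStyle(.white)
    }
}

struct ShareBadgeView: View {
    var body: some View {
        VStack(spacing: 16) {
            BadgeView()
            if let image = renderedBadge() {
                // Image conforms to Transferable, so ShareLink (and drag and drop) just work.
                ShareLink(item: Image(uiImage: image),
                          preview: SharePreview("Badge", image: Image(uiImage: image)))
            }
        }
    }

    @MainActor
    private func renderedBadge() -> UIImage? {
        let renderer = ImageRenderer(content: BadgeView())
        renderer.scale = 3   // render at 3x for crisp output
        return renderer.uiImage
    }
}
```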

To build interactive apps for a mixed reality headset, our ways of interacting with text, images, voice, graphics and other forms of media need to become more efficient.

I think Apple has taken noteworthy steps in that direction, not just through the above-mentioned UI and gesture controls but also through an upgraded SharePlay API, the new Shared with You framework, and the Collaboration API.

These all look like promising building blocks for Apple’s much-awaited headset.

RoomPlan API and Background Assets Framework

The Background Assets framework is another tool that hasn’t gotten much limelight yet. It was introduced to handle downloads of large files across different app states, but I think the possibilities extend beyond that utility.

By downloading 3D assets from the cloud, we can quickly build and ship augmented reality apps with much smaller bundle sizes.

Similarly, the RealityKit framework didn’t get any significant changes. But Apple quietly unveiled a new RoomPlan API.

Powered by ARKit 6 (which did get some notable improvements this year), the Swift-only API provides out-of-the-box support for scanning rooms and building 3D models out of them.

Now, one could deem the RoomPlan API a mere extension of the Object Capture API. But think of it in AR/VR terms: considering that Apple’s mixed reality headset is expected to have LiDAR sensors and multiple cameras, RoomPlan is going to be a game-changer for developers. Expect a lot of AR apps that let you reconstruct houses.
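
Here’s a minimal sketch of kicking off a scan with RoomPlan, assuming a LiDAR-equipped device and with error handling kept to a minimum:

```swift
import UIKit
import RoomPlan

final class RoomScanViewController: UIViewController, RoomCaptureViewDelegate {

    private var roomCaptureView: RoomCaptureView!

    override func viewDidLoad() {
        super.viewDidLoad()

        // RoomCaptureView drives the guided scanning UI.
        roomCaptureView = RoomCaptureView(frame: view.bounds)
        roomCaptureView.delegate = self
        view.addSubview(roomCaptureView)

        // Start scanning with the default configuration.
        roomCaptureView.captureSession.run(configuration: RoomCaptureSession.Configuration())
    }

    // Called once RoomPlan has processed the scan into a parametric CapturedRoom.
    func captureView(didPresent processedResult: CapturedRoom, error: Error?) {
        // Export the 3D model as USDZ for use in AR/VR scenes.
        let url = FileManager.default.temporaryDirectory.appendingPathComponent("Room.usdz")
        try? processedResult.export(to: url)
    }
}
```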

While those were the major APIs that I think will fit nicely into a mixed reality future, Spatial is another new framework worth mentioning: it enables working with 3D math primitives and might prove its mettle in dealing with graphics in virtual space.
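
A tiny sketch of those primitives (the values are arbitrary examples):

```swift
import Spatial

// Basic 3D building blocks from the new Spatial framework.
let origin = Point3D(x: 0, y: 0, z: 0)
let size = Size3D(width: 2, height: 1, depth: 3)
let box = Rect3D(origin: origin, size: size)

// A rotation of 90 degrees around the y-axis, ready to feed into a 3D transform.
let rotation = Rotation3D(angle: .degrees(90), axis: .y)

print(box.center, rotation)
```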

In the end, Apple didn’t utter a single word about its AR/VR headset plans, but the new APIs released this year will play a crucial role in putting the pieces together for metaverse development.

It’s important to prepare developers to build apps for the new ecosystem starting today. After all, for a product to achieve widespread adoption, there needs to be a mature ecosystem of apps, and that requires getting developers on board.
