WebXR Device API: Initialization

The WebXR Device API is a set of standards that enable immersive web experiences (AR, VR, MR) in the browser. Content is considered to be “immersive” if it produces visual, audio, haptic, or other sensory output that simulates or augments various aspects of the user’s environment. This API allows us to create 3D experiences with the help of a graphics rendering library, such as WebGL, and web development tools such as HTML, CSS, and JavaScript. End users interact with the XR experiences in browsers running on their devices. The WebXR Device API is available in many modern browsers, but support varies, so an application should check for it at runtime.

The illustration above shows the steps in an XR development cycle that uses the WebXR Device API.

Setting up WebXR

Let’s see how we can set up a simple WebXR program that calls the API functionalities in the correct order. The API’s core functionalities are to:

Detect if XR capabilities are available on the browser.
Query the XR device capabilities.
Poll the XR device and associated input device state.
Render scenes on the XR device at the appropriate frame rate.

We can check for WebXR Device API support from a simple HTML file by testing whether the browser’s window.navigator object exposes the xr property. We write the JavaScript code in a <script> tag in the HTML file.
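The support check can be sketched as follows. This is a minimal sketch rather than the lesson’s original listing: the helper name hasWebXR is an assumption made for illustration, and the code is meant to be placed inside the <script> tag of the HTML file.

```javascript
// Returns true when the given navigator-like object exposes the WebXR entry point.
// hasWebXR is an illustrative name, not part of the WebXR Device API.
function hasWebXR(nav) {
  return Boolean(nav) && 'xr' in nav;
}

// In a browser, pass the real navigator; in environments without one, report no support.
const nav = typeof navigator !== 'undefined' ? navigator : undefined;
console.log(hasWebXR(nav) ? 'WebXR is supported.' : 'WebXR is not supported.');
```

Taking the navigator object as a parameter keeps the check easy to exercise with a stand-in object outside a browser.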

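Once WebXR support is confirmed, we check which session modes are available. The XRSessionMode enum defines three modes: inline, immersive-vr, and immersive-ar. Let’s look at each of these modes: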

inline: This session is supported by any user agent with WebXR. It doesn’t require any special hardware. The output of the session is presented inline within the context of an element in a standard HTML document, rather than occupying the full visual space. It can be presented in mono (single output channel) or stereo (dual output channel) mode. Depending on the application, positional and rotational tracking is also available using input from a mouse, keyboard, or touch.
immersive-vr: This session is supported by immersive VR devices such as HTC Vive, Meta Quest, and Pico. As the name suggests, it produces an artificial environment that can’t be overlaid or integrated with the surrounding environment.
immersive-ar: This session is supported by immersive AR devices (e.g., smartphones). The output content is blended with the real-world environment.

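Now, let’s check whether the immersive-vr session mode is supported in our browser. The navigator.xr.isSessionSupported() method takes an XRSessionMode value and returns a promise that resolves to true if the mode is supported and false otherwise.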
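A session mode check along these lines might look like the sketch below. The helper name checkSessionSupport is hypothetical, and the xr parameter is assumed to be the navigator.xr object; only isSessionSupported() comes from the API itself.

```javascript
// Resolves to true if the XR system supports the given session mode, false otherwise.
// checkSessionSupport is a hypothetical helper; xr is expected to be navigator.xr.
async function checkSessionSupport(xr, mode) {
  if (!xr) return false; // WebXR itself is unavailable
  try {
    return await xr.isSessionSupported(mode);
  } catch (err) {
    // The promise can reject, for example when the page is not allowed to use XR
    console.warn(`Could not query support for ${mode}:`, err);
    return false;
  }
}

// Use the real XR system in a browser; fall back to undefined elsewhere.
const xr = typeof navigator !== 'undefined' ? navigator.xr : undefined;
checkSessionSupport(xr, 'immersive-vr').then((supported) => {
  console.log(`immersive-vr supported: ${supported}`);
});
```

On a desktop browser without a connected headset, the check will typically report that immersive-vr is unsupported, while inline remains available.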

To initiate a connection with the XR device, we ask the navigator.xr object (an instance of the XRSystem interface) to provide us with an XRSession object via the requestSession() method. Like the isSessionSupported() method, the requestSession() method takes an XRSessionMode value as an argument, and it returns a promise that resolves to the XRSession object. Typically, after confirming support for a particular session mode via the isSessionSupported() method, we invoke the requestSession() method with that same mode. Further interactions with the XR device are then conducted via this XRSession object; therefore, we store a reference to it in an accessible scope.
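The request flow described above can be sketched like this. The names startSession and xrSession are assumptions for illustration and are not taken from the lesson’s listing; the xr parameter is expected to be navigator.xr.

```javascript
let xrSession = null; // kept in an outer scope so later code can reach the session

// startSession is a hypothetical helper; xr is expected to be navigator.xr.
async function startSession(xr, mode) {
  const supported = await xr.isSessionSupported(mode);
  if (!supported) {
    throw new Error(`Session mode "${mode}" is not supported.`);
  }
  xrSession = await xr.requestSession(mode);
  // Clear the stored reference once the session ends
  xrSession.addEventListener('end', () => {
    xrSession = null;
  });
  return xrSession;
}

// Browser-only usage: an inline session can be requested directly.
if (typeof navigator !== 'undefined' && navigator.xr) {
  startSession(navigator.xr, 'inline')
    .then((session) => console.log(session))
    .catch((err) => console.error(err));
}
```

Note that browsers only grant immersive-vr and immersive-ar sessions in response to a user action, so for those modes the call belongs in an event handler such as a button click.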

Graphical Output

To display content on the end device, there are primarily two steps involved: rendering the content and updating the state on the end device. Let’s look at rendering first.

Rendering the content

The Canvas API enables the browser to draw graphics onto an HTML <canvas> element. It does so by utilizing a graphics library behind the scenes, such as WebGL or WebGPU. A <canvas> element is assigned a graphics library via the getContext() method, which also creates and returns a context object for the specified library. (A context object is a JavaScript object that stores the state variables needed to interact with a library or an API, in this case, a graphics library.) The getContext() method is called as canvas.getContext(contextType, contextAttributes).

Here, contextType is a string specifying the graphics library, and contextAttributes is an optional object specifying parameters for that library. For our purpose, we’ll use webgl (WebGL v1) as the contextType and { xrCompatible: true } as the contextAttributes, which configures WebGL to render content in a mode compatible with an XR device.
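As a minimal sketch, creating an XR-compatible WebGL context might look like the following. The helper name createXRContext is an assumption; the canvas argument is expected to be a <canvas> element.

```javascript
// createXRContext is a hypothetical helper; canvas is expected to be an HTMLCanvasElement.
function createXRContext(canvas) {
  // 'webgl' selects WebGL v1; xrCompatible asks for a context usable with an XR device
  const gl = canvas.getContext('webgl', { xrCompatible: true });
  if (!gl) {
    // getContext() returns null when the requested context type is unavailable
    throw new Error('WebGL is not available on this canvas.');
  }
  return gl;
}

// Browser-only usage: create a canvas and obtain its context.
if (typeof document !== 'undefined') {
  const canvas = document.createElement('canvas');
  const gl = createXRContext(canvas);
  console.log(gl);
}
```

Checking the return value matters because getContext() signals failure by returning null rather than throwing.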

Updating the state of the device

To hook our rendered content up to the end device, we need the XRWebGLLayer interface. This layer bridges an ongoing XRSession object with a graphics library’s context object (the WebGLRenderingContext interface in our case). The layer is then passed as the baseLayer property of the object given to the updateRenderState() method of the XRSession object, effectively making it the base layer (or the primary layer) for this session.
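The bridging step can be sketched as follows. The helper name attachBaseLayer and its LayerClass parameter are assumptions; the parameter defaults to the browser’s global XRWebGLLayer constructor and exists only so the sketch can be exercised outside a browser.

```javascript
// attachBaseLayer is a hypothetical helper.
// session: an XRSession; gl: a WebGL context created with { xrCompatible: true }.
// LayerClass defaults to the browser's global XRWebGLLayer constructor.
function attachBaseLayer(session, gl, LayerClass = globalThis.XRWebGLLayer) {
  // The layer ties the session to the WebGL context
  const layer = new LayerClass(session, gl);
  // Make it the session's base (primary) layer
  session.updateRenderState({ baseLayer: layer });
  return layer;
}
```

In a browser, a call such as attachBaseLayer(xrSession, gl) would run after both the session and the context have been created.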

The following code snippet sums up this process:

Code explanation:

Line 13: When the web page loads, we call the initializeXR() function.
Lines 16–38: In the initializeXR() function, we check if WebXR capabilities are supported in the browser. Then, we check whether the chosen session mode is supported. We have set the mode to inline because we’re using the browser. To give the user the ability to choose the XRSessionMode value, we could add a drop-down menu with the three options and, on line 17, read the value from the drop-down menu instead of using a hard-coded value. Additionally, instead of triggering the initializeXR() function in the window.onload event, we could call it when an HTML button is clicked.
Line 40: When we know that the session is supported, we call the beginXRSession() function to set up the XRSession object and create the XRWebGLLayer interface.
Line 61: We print the XRSession object in the browser’s console as well as add it to the web page via the document.write() method.
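The drop-down variation mentioned for lines 16–38 could be sketched roughly as below. This is not part of the lesson’s listing: the element ids mode-select and start-xr and the helper resolveSessionMode are assumptions made for illustration.

```javascript
const SESSION_MODES = ['inline', 'immersive-vr', 'immersive-ar'];

// Returns the chosen mode if it is a valid XRSessionMode value, otherwise 'inline'.
function resolveSessionMode(value) {
  return SESSION_MODES.includes(value) ? value : 'inline';
}

// Browser-only wiring; the ids "mode-select" and "start-xr" are assumed to exist in the page.
if (typeof document !== 'undefined') {
  const select = document.getElementById('mode-select');
  const button = document.getElementById('start-xr');
  if (select && button) {
    button.addEventListener('click', () => {
      const mode = resolveSessionMode(select.value);
      console.log(`Requested session mode: ${mode}`);
      // The session setup would start here, using mode instead of a hard-coded value.
    });
  }
}
```

Starting from a button click also satisfies the user-action requirement that browsers place on immersive session requests.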

Here’s the output we see in the browser’s console. Depending on the browser, we might see different properties on the XRSession object.
