Stylized image of raytraced spheres with floating equations

Hardware-Agnostic Accelerated Raytracing

Benson Haley

Introduction

This project explores implementing a raytracer that is not tied to acceleration structures from a specific vendor. Many modern raytracers are accelerated with NVIDIA's OptiX technology, which limits users to machines with an NVIDIA GPU. However, with general-purpose GPU programming models (CUDA, HIP, etc.) and cross-vendor device standards such as SYCL, it is possible to write hardware-agnostic code that takes advantage of whatever GPU is available, if one exists.

At the core of this project is SYCL, a standard developed by Khronos that defines a common framework for offloading repetitive operations to devices that support parallel computing. Many implementations of the SYCL standard exist, but this project uses AdaptiveCpp due to its open-source nature. There are many benefits to using SYCL over device-specific code:

  • Hardware Agnostic: the SYCL standard supports many varieties of GPUs, and other parallelism models as well
  • Future Compatible: new devices can become compliant with the SYCL standard
  • Modern C++: much of CUDA resembles pre-C++11 code, and using APIs like Vulkan or DirectX requires additional shading languages (GLSL, HLSL, etc.)
  • Open Source: AdaptiveCpp is an open-source implementation of the SYCL standard

Architecture

The raytracer follows a typical architecture: the camera launches a ray of inverse-light through each pixel of the film plane, and the ray bounces around the scene to determine the pixel's final color. Unlike OptiX, whose hardware-accelerated path is built around triangle meshes, this engine supports spheres and triangles by default. Users can also add new object types, or extend the sphere and triangle types with extra vertex information (color, UV coordinates, etc.).
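As an illustration of the per-ray work described above, the following is a minimal sketch of a ray-sphere intersection test. It is not taken from the project: the `Vec3`, `Ray`, `Sphere`, and `intersect` names are hypothetical, and the ray direction is assumed to be normalized.

```cpp
#include <cassert>
#include <cmath>
#include <optional>

// Hypothetical minimal vector type; the project itself uses Eigen for math.
struct Vec3 { float x, y, z; };

inline Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

struct Ray { Vec3 origin; Vec3 direction; };   // direction assumed to be unit length
struct Sphere { Vec3 center; float radius; };

// Solves |origin + t * direction - center|^2 = radius^2 for t.
// Returns the distance to the nearest hit in front of the ray origin, if any.
inline std::optional<float> intersect(const Ray& ray, const Sphere& sphere) {
    assert(sphere.radius > 0.0f);
    const Vec3 oc = ray.origin - sphere.center;
    const float b = dot(oc, ray.direction);
    const float c = dot(oc, oc) - sphere.radius * sphere.radius;
    const float discriminant = b * b - c;
    if (discriminant < 0.0f) return std::nullopt;   // ray misses the sphere

    const float root = std::sqrt(discriminant);
    const float t_near = -b - root;
    if (t_near > 0.0f) return t_near;               // hit from outside the sphere
    const float t_far = -b + root;
    if (t_far > 0.0f) return t_far;                 // ray starts inside the sphere
    return std::nullopt;                            // sphere is behind the ray
}
```

A renderer would run a test like this for each candidate object along a ray and keep the smallest returned distance as the visible surface.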

Raytracing illustration

The underlying architecture requires data to be parallelizable, meaning all object data must be stored in-place in buffers rather than scattered across the heap. User customization is performed through a callback interface and a shader interface, both implemented with C++ lambdas. Callback functions can be passed into the engine upon construction, and procedural textures can be added to objects by implementing shader lambdas, which accept an input info structure (containing object and collision data) and return a luminance value. For the engine to be usable on heterogeneous systems, exceptions cannot be used within the engine (although they do not need to be disabled for the remainder of a user's project), and dynamic polymorphism is disallowed because it relies on virtual tables. As a result, the object polymorphism facilities are implemented using variants and SFINAE.
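The combination of variant-based polymorphism and shader lambdas can be sketched roughly as follows. This is an illustration under stated assumptions rather than the engine's actual interface: `Object`, `ShaderInput`, `bounding_radius`, `shade`, and `distance_fade` are hypothetical names, and `std::variant` stands in for the device-compatible variant type the project takes from its CUDA library dependency.

```cpp
#include <cassert>
#include <variant>

// Hypothetical object types, stored by value so they can live in flat buffers.
struct Sphere { float radius; };
struct Triangle { float circumradius; };

// A closed set of alternatives replaces a virtual base class.
using Object = std::variant<Sphere, Triangle>;

// Helper that merges several lambdas into one overload set for std::visit.
template <class... Fs> struct Overloaded : Fs... { using Fs::operator()...; };
template <class... Fs> Overloaded(Fs...) -> Overloaded<Fs...>;

// Dispatch without a vtable: std::visit invokes the overload that matches
// the alternative currently held by the variant.
inline float bounding_radius(const Object& object) {
    return std::visit(Overloaded{
        [](const Sphere& s) { return s.radius; },
        [](const Triangle& t) { return t.circumradius; },
    }, object);
}

// Hypothetical input structure handed to a shader lambda.
struct ShaderInput {
    const Object* object;   // object that was hit
    float distance;         // distance from the ray origin to the hit point
};

// Accepts any callable taking a ShaderInput and returning a luminance value.
template <class Shader>
float shade(const ShaderInput& input, Shader&& shader) {
    assert(input.object != nullptr);
    return shader(input);
}

// Example procedural shader: luminance falls off with hit distance.
inline constexpr auto distance_fade = [](const ShaderInput& input) {
    return 1.0f / (1.0f + input.distance);
};
```

Because the shader is a template parameter rather than a virtual call, the compiler sees the concrete lambda type and can inline it.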

System

The core of the data parallelism in the system is in C++ standard allocators, extended to perform SYCL allocations, allowing STL vectors to be used on SYCL unified shared memory (USM) pointers. SYCL buffers and accessors are avoided due to their documented inefficiencies when compared to USM pointers.
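The allocator approach can be sketched as below. To keep the example compilable without a SYCL toolchain, `std::malloc` and `std::free` stand in for the USM calls; in a SYCL build the marked lines would use `sycl::malloc_shared<T>(n, queue)` and `sycl::free(p, queue)` with a queue held by the allocator. The `SharedAllocator` and `SharedVector` names are hypothetical, and aborting on allocation failure is this sketch's own choice for avoiding exceptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Minimal standard-conforming allocator. Host malloc/free are stand-ins for
// SYCL USM allocation so that the sketch runs on any C++17 compiler.
template <class T>
struct SharedAllocator {
    using value_type = T;

    SharedAllocator() noexcept = default;
    template <class U>
    SharedAllocator(const SharedAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        assert(n <= SIZE_MAX / sizeof(T));        // guard against size overflow
        // SYCL build (assumption): sycl::malloc_shared<T>(n, queue)
        void* p = std::malloc(n * sizeof(T));
        if (p == nullptr) std::abort();           // no exceptions in this sketch
        return static_cast<T*>(p);
    }

    void deallocate(T* p, std::size_t) noexcept {
        // SYCL build (assumption): sycl::free(p, queue)
        std::free(p);
    }
};

// Stateless allocators always compare equal.
template <class T, class U>
bool operator==(const SharedAllocator<T>&, const SharedAllocator<U>&) noexcept { return true; }
template <class T, class U>
bool operator!=(const SharedAllocator<T>&, const SharedAllocator<U>&) noexcept { return false; }

// A vector whose contiguous storage comes from the custom allocator.
template <class T>
using SharedVector = std::vector<T, SharedAllocator<T>>;
```

With a real USM allocation behind it, the pointer returned by `SharedVector<T>::data()` could be passed to a device kernel directly. SYCL 2020 also ships a ready-made `sycl::usm_allocator` class template that serves the same purpose.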

C++ code

Dependencies

  • C++20
  • Clang extended with AdaptiveCpp
  • GPU with compute support (for best performance)
  • NVIDIA CUDA Library (for device-compatible variant, tuple, optional, array, etc.)
  • Eigen (for efficient vector and matrix math)
  • Happly (to easily load triangle meshes from PLY files)

Results

A Whitted-esque scene with a moving sphere runs at ~144 frames per second (although the sample below runs at 30 FPS due to GIF compression). By default, the project streams its output over a WebSocket, which can further reduce the framerate depending on connectivity, so for the best results users should route frames to their own client-side output interface.

The raytracer in action

Future Work

In the future, this project could be extended to produce higher-fidelity images and shadows by using photon maps or path tracing. Additionally, the k-d tree implementation, which currently only runs on the CPU, could be rewritten to function on-device.

Download

The project code can be downloaded here.