GPU resources

This post is basically a dump of resources I’ve encountered while doing a deep dive into GPU programming. I welcome pull requests against the repo for other useful resources. Also feel free to ask questions in issues, particularly if the answer might be in the form of a patch to this post.

Understanding the hardware

Intel

Intel is one of the best GPU hardware platforms to understand because it’s documented and a lot of the work is open source.

Wikichip gen 9, gen 9.5, gen 11
Intel white paper on Gen9 compute
Programmer’s Reference Manual for Kaby Lake (Gen 9.5)

There’s also some academic literature:

Performance Characterisation and Simulation of Intel’s Integrated GPU Architecture

One of the funky things about Intel is the varying subgroup width: it can be SIMD8, SIMD16, or SIMD32. The width is mostly determined by a compiler heuristic, but the new VK_EXT_subgroup_size_control extension lets an application request a specific width.
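To make the effect of the varying width concrete, here is a small arithmetic sketch in Python. The workgroup size of 256 is an arbitrary assumption, and `subgroups_per_workgroup` is a hypothetical helper rather than part of any API. It only shows how the number of subgroups in a workgroup changes with the SIMD width the compiler happens to pick.

```python
def subgroups_per_workgroup(workgroup_size: int, subgroup_width: int) -> int:
    """Number of subgroups needed to cover one workgroup (rounded up)."""
    return -(-workgroup_size // subgroup_width)  # ceiling division


# 256 invocations per workgroup is an assumed, illustrative size.
for width in (8, 16, 32):
    print(f"SIMD{width}: {subgroups_per_workgroup(256, width)} subgroups")
```

Any data structure sized per subgroup (for example, one partial result per subgroup in shared memory) has to cope with all three cases unless the width is pinned.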

NVIDIA

There’s a lot of interest and activity around NVIDIA, but much of it is reverse engineering.

AMD

Understanding API capabilities

vulkan.gpuinfo.org - a detailed database of what extensions are available on what hardware/driver/platform combinations.
Metal Feature Set Tables has similar info for Metal.

Subgroups

Subgroup/warp/SIMD/shuffle operations are very fast, but less compatible (nonuniform shuffle is missing from HLSL/SM6), and you (mostly) don’t get to control the subgroup size, so portability is a lot harder.
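The portability problem can be sketched on the CPU. The Python below simulates a shuffle-down tree reduction across the lanes of one subgroup. The function names are hypothetical, and the out-of-range behavior (a lane keeps its own value) is an assumption of this sketch; real APIs differ on that point. The thing to notice is that the number of shuffle steps depends on the subgroup size, so code that hard-codes one size will not carry over to hardware that uses another.

```python
def shuffle_down(lanes, delta):
    """Lane i reads the value held by lane i + delta.

    Lanes that would read past the end keep their own value (an assumption
    of this sketch, not a statement about any particular GPU API).
    """
    n = len(lanes)
    return [lanes[i + delta] if i + delta < n else lanes[i] for i in range(n)]


def subgroup_reduce_add(lanes):
    """Tree reduction: returns (sum as seen by lane 0, number of shuffle steps).

    Assumes the subgroup size is a power of two.
    """
    width = len(lanes)
    assert width > 0 and width & (width - 1) == 0, "power-of-two width expected"
    vals = list(lanes)
    steps = 0
    delta = width // 2
    while delta >= 1:
        shifted = shuffle_down(vals, delta)
        vals = [a + b for a, b in zip(vals, shifted)]
        delta //= 2
        steps += 1
    return vals[0], steps


for width in (8, 16, 32):
    total, steps = subgroup_reduce_add(list(range(width)))
    print(f"width {width}: sum={total}, shuffle steps={steps}")
```

Only lane 0 is guaranteed to hold the full sum at the end; the other lanes hold partial or duplicated values, which is also typical of this pattern on real hardware.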

Languages

GLSL

glslang (https://github.com/KhronosGroup/glslang) - reference implementation of GLSL, with compilation to SPIR-V
shaderc - Google-maintained tools and libraries for shader compilation, built around glslang

HLSL

DirectX Shader Compiler (DXC) - produces both SPIR-V and DXIL.
Programming guide for HLSL
Shader Model 6

Metal Shading Language

Metal Shading Language Specification

OpenCL

clspv - compile OpenCL C (subset) to run on Vulkan compute shaders.
- To me, this is evidence that Vulkan will simply eat OpenCL’s lunch. This is still controversial, but Khronos people are insisting there’s an “OpenCL Next” roadmap.
OpenCL 3.0 was recently announced, and the plans for it do include clspv and related tools for running on Vulkan.

TensorFlow

MLIR

Exotic languages

Halide
Futhark
Co-dfns
Julia on GPU - layered on CUDA

SPIR-V

SPIRV-Cross - transpile SPIR-V into GLSL, HLSL, and Metal Shading Language
- This is an integral part of portability layers including MoltenVK and gfx-rs.

WebGPU

Building WebGPU with Rust - FOSDEM talk
wgpu - Rust WebGPU implementation
dawn - Google’s WebGPU implementation in C++
Work-in-progress specification
Get started with GPU Compute on the Web - Google (Chromium/Dawn)

WebGPU shader language

The discussion of shader language has been very contentious. Very recently, a proposal emerged for a textual language that is semantically equivalent to SPIR-V, and there seems to be agreement that this is the path forward.

There were two previous proposals: some profile of SPIR-V (a binary format), and Apple’s Web High Level Shading Language proposal, which evolved into Web Shading Language. Both of these had disadvantages that made them unacceptable to various people. It’s not possible to use SPIR-V directly, largely because it has undefined behavior and other unsafe stuff. The Google and Mozilla implementations addressed this by doing a rewrite pass. Conversely, Apple’s proposal met with considerable resistance because it didn’t deal with the diversity of GPU hardware in the field. There’s a lot of ecosystem work centered around Vulkan and SPIR-V, and leveraging that will help WebGPU considerably.
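To give a feel for what a sanitizing rewrite pass can look like, here is a toy sketch in Python. Everything in it is hypothetical: the tuple-based IR, the `sanitize` and `run` functions, and the choice of index clamping as the transformation. The text above does not say what the Google and Mozilla passes actually do, so treat this as one plausible kind of rewrite (making out-of-bounds buffer reads well defined), not a description of those implementations.

```python
def sanitize(program, buffer_len):
    """Rewrite pass: route every load index through a clamp to [0, buffer_len - 1]."""
    out = []
    for op in program:
        if op[0] == "load":
            _, dst, idx = op
            safe = idx + "_clamped"
            out.append(("clamp", safe, idx, 0, buffer_len - 1))
            out.append(("load", dst, safe))
        else:
            out.append(op)
    return out


def run(program, buffer, env):
    """Tiny interpreter for the toy IR; returns the final variable bindings."""
    env = dict(env)
    for op in program:
        if op[0] == "clamp":
            _, dst, src, lo, hi = op
            env[dst] = max(lo, min(env[src], hi))
        elif op[0] == "load":
            _, dst, idx = op
            # A too-large index raises IndexError here; that stands in for
            # the undefined behavior an unchecked shader load could have.
            env[dst] = buffer[env[idx]]
    return env


program = [("load", "x", "i")]
buffer = [10, 20, 30]
safe_program = sanitize(program, len(buffer))
print(run(safe_program, buffer, {"i": 7})["x"])  # index 7 is clamped to 2, prints 30
```

The original program is left untouched and the pass returns a new instruction list, which mirrors the idea of accepting an unsafe input format and emitting a safe one for the driver.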