2D Scenes and Geometry

Key Points:

Space in Two Dimensions

For this part of the course, we'll use a two-dimensional representation of space. Let's do a quick refresher on computational geometry.

Vectors

  • Points and vectors are technically different
    • But we'll treat them the same for now
  • A vector can be defined with components, e.g. \(x\) and \(y\)
    • Vectors can be added, subtracted, multiplied, and divided component-wise
    • Vectors can also be scaled by multiplying with a scalar, i.e. a number
  • Vectors are displacements between points
    • i.e., the difference between two points is a vector "explaining" how to get from one to the other
      • Its magnitude is the distance between the points
      • Its direction is the angle of the displacement from one point to the other
    • A point is like a vector starting from (0,0)
  • We compute a vector's magnitude using the Pythagorean theorem (\(\sqrt{x^2+y^2}\))
  • We compute a vector's direction using the arctangent (v.y.atan2(v.x)); these operations are sketched in code after this list
  • We can get the vector perpendicular (or normal) to a vector by swapping its components and negating \(x\)
  • We can also normalize a vector (this means something different!) by dividing its components by its magnitude
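
To make all of this concrete, here's a minimal sketch of these operations over plain [f32;2] values (the same bare-bones representation the later examples use); the helper names are just for illustration:

// Minimal 2D vector helpers over [f32;2]; the names are illustrative only.
fn add(a: [f32; 2], b: [f32; 2]) -> [f32; 2] { [a[0] + b[0], a[1] + b[1]] }
fn sub(a: [f32; 2], b: [f32; 2]) -> [f32; 2] { [a[0] - b[0], a[1] - b[1]] }
fn scale(v: [f32; 2], s: f32) -> [f32; 2] { [v[0] * s, v[1] * s] }
// Magnitude via the Pythagorean theorem.
fn magnitude(v: [f32; 2]) -> f32 { (v[0] * v[0] + v[1] * v[1]).sqrt() }
// Direction as an angle in radians, via the arctangent.
fn direction(v: [f32; 2]) -> f32 { v[1].atan2(v[0]) }
// Perpendicular: swap the components and negate the new x.
fn perp(v: [f32; 2]) -> [f32; 2] { [-v[1], v[0]] }
// Normalize: divide each component by the magnitude to get a unit-length vector.
fn normalize(v: [f32; 2]) -> [f32; 2] { scale(v, 1.0 / magnitude(v)) }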

Axes and coordinate systems

  • In 2D we work in two major axes: x (horizontal) and y (vertical).
    • In WGPU we use a space where upwards in y is positive, but this is an arbitrary choice. Downwards y positive is arguably more common in graphics and games.
  • Points are only defined relative to an origin or basis.
    • Usually we use \((0,0)\) — 0 in x and y.
    • Other choices of basis are possible: \((3,0)\) with respect to \((1,1)\) is \((4,1)\)
  • Translations (offset point \(p\) by vector \(v\)) are the only things we can do to points.
  • Vectors are also coordinate pairs, but they mean something different
    • A direction and a magnitude
    • Or equivalently, a magnitude in x and a magnitude in y
  • We can move a vector's end point to adjust its direction and magnitude (moving the tip of the arrow)
    • Stretching or squashing the arrow along its length is exactly what scaling by a scalar does.
  • If we have a point and a scalar (e.g., a circle), we can both translate (the point) and scale (the scalar)
  • If we have a point and a vector (e.g., an ellipse or a rectangle), we can both translate (the point) and scale (the vector)
  • When we have multiple coordinate spaces, we can convert between them using transformations by defining one in terms of the other
    • Look at a nearby wall with a window or other rectangle on it.
    • The bottom left corner of the wall anchors (or is the basis or origin of) one coordinate space.
    • The bottom left corner of the window (or whatever) anchors another.
    • If we measured from the right edge to the left edge of the window, we'd have a distance \(w\)
      • The scale of the two spaces is the same, so this distance means the same in both spaces
    • In the window's coordinate space, \((w,0)\) is the bottom right corner of the window
    • In the wall's coordinate space, \((w,0)\) is along the bottom edge of the wall and might not be anywhere near the window!
      • We would need to measure both the horizontal and vertical distance from the bottom left of the wall to the bottom right of the window to obtain that point.
      • Or, if we knew where the window's bottom left corner \((x,y)\) was with respect to the wall, we could just add that to \((w,0)\) to get \((x+w,y)\) (see the sketch after this list)
    • We'll talk a lot more about coordinate spaces in the 3D unit.
  • Last idea: It's convenient to have a tree or hierarchy of coordinate spaces, so we can have things like characters holding objects or riding vehicles, or characters made of multiple independently moving pieces that must stick together.
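
To make the wall-and-window example concrete, here's a tiny sketch of converting a point between those two spaces (assuming, as above, that the spaces differ only by a translation); window_origin_in_wall is a hypothetical name for the window's bottom left corner measured in wall coordinates:

// Convert a point from window space to wall space by adding the window's
// origin (its bottom left corner) expressed in wall coordinates.
fn window_to_wall(p_window: [f32; 2], window_origin_in_wall: [f32; 2]) -> [f32; 2] {
    [p_window[0] + window_origin_in_wall[0],
     p_window[1] + window_origin_in_wall[1]]
}
// e.g. the window's bottom right corner (w, 0) becomes (x + w, y) in wall space.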

Angles

  • In 2D, any rotation can be described in terms of a point and the amount of rotation about that point in the xy plane.
    • We'll usually represent this amount as a scalar, in radians (see the sketch after this list)
  • Examples:
    • a sprite could rotate around its center by \(\pi\), flipping upside-down
    • a sprite could rotate about the origin by \(\pi/2\), moving a quarter-circle around the origin
    • a sprite could orbit another sprite's center by \(2\pi/60\), completing one rotation per second at 60 fps
  • In 3D there are more planes in which rotation can happen, so it's a little trickier
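
Here's a small sketch of the first idea, rotating one point about another by an angle in radians; rotate_about is a hypothetical helper, not something the later examples use:

// Rotate point p about center c by theta radians (counterclockwise when +y is up):
// translate so c is at the origin, rotate, then translate back.
fn rotate_about(p: [f32; 2], c: [f32; 2], theta: f32) -> [f32; 2] {
    let (dx, dy) = (p[0] - c[0], p[1] - c[1]);
    let (sin, cos) = theta.sin_cos();
    [c[0] + dx * cos - dy * sin,
     c[1] + dx * sin + dy * cos]
}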

Data Structures

So far we're using arrays or tuples of floats to represent points, vectors, rectangles, and everything else. This is not the best in general (for example, we can't easily add or subtract two pairs of floats), but I'll continue to use it in the examples for now because it doesn't require extra explanation or API surface.

In your own projects, consider using a math crate like glam, nalgebra, ultraviolet, or maths-rs. I'll be using glam in later weeks but they're all good choices.
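
As a quick illustration of the kind of API these crates provide (using glam here; the exact method names vary a little between crates):

use glam::Vec2;

fn main() {
    let p = Vec2::new(3.0, 0.0);  // a point
    let v = Vec2::new(1.0, 1.0);  // a vector
    let q = p + v;                // translate the point by the vector
    let len = v.length();         // magnitude
    let dir = v.y.atan2(v.x);     // direction in radians
    let unit = v.normalize();     // unit-length vector in the same direction
    let perp = v.perp();          // perpendicular (rotated 90 degrees)
    println!("{q} {len} {dir} {unit} {perp}");
}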

Sprites on Screen

With the math in hand, we can start to make real games! One really useful building block for 2D games is the sprite: a bitmapped rectangle, often with transparency. In other words, a little picture we can move around!

In today's example, we'll show how to use buffers to transfer arbitrary data to the GPU. We'll use two buffers: one for information about the camera (we'll talk about that later) and one for an array of sprites.

This time around, a sprite will consist of a "to rectangle" (an area of the screen that the sprite should take up) and a "from rectangle" (an area of a texture that the sprite's visual data should be drawn from).

struct GPUSprite {
    screen_region: [f32;4],
    // Textures with a bunch of sprites are often called "sprite sheets"
    sheet_region: [f32;4]
}

Allowing these rectangles to be different sizes lets us scale our sprites up and down on screen; if we wanted to rotate sprites too, we'd need to add a number in radians to our struct (and possibly a point about which to do the rotation).
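
Purely as a sketch (the rest of this example doesn't use it), a rotating sprite's data might look something like this; the field names are hypothetical, and for real GPU use you'd also have to think about padding and alignment:

// Hypothetical extension of GPUSprite that can also rotate.
struct GPURotatedSprite {
    screen_region: [f32; 4],
    sheet_region: [f32; 4],
    rotation: f32,              // angle in radians
    rotation_center: [f32; 2],  // the point (in screen units) to rotate about
    _padding: f32,              // keep the struct a multiple of 16 bytes for the GPU
}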

We actually need to include a new crate this time, bytemuck, which will let us easily convert a slice of GPUSprite values into a slice of bytes for WGPU to send over in a buffer (so, cargo add bytemuck --features derive). The full definition of GPUSprite is therefore:

#[repr(C)]
#[derive(Clone, Copy, bytemuck::Zeroable, bytemuck::Pod)]
struct GPUSprite {
    screen_region: [f32;4],
    sheet_region: [f32;4]
}

#[repr(C)] tells the Rust compiler to lay this struct out in memory according to the C programming language's rules, and the bytemuck derives generate code that will let us convert a GPUSprite to and from raw bytes safely.

We'll define a vec of sprites in our declarations somewhere. Notice two things about these regions: (1) the screen region is defined in something that looks like pixel units; (2) the sheet region is defined in normalized (UV) coordinates. For our spritesheet we'll be using this 32x32 image, of which the bottom two 16x16 images are our sprites:

king.png
let mut sprites = vec![
    GPUSprite {
        screen_region: [32.0, 32.0, 64.0, 64.0],
        sheet_region: [0.0, 16.0/32.0, 16.0/32.0, 16.0/32.0],
    },
    GPUSprite {
        screen_region: [32.0, 128.0, 64.0, 64.0],
        sheet_region: [16.0/32.0, 16.0/32.0, 16.0/32.0, 16.0/32.0],
    },
    GPUSprite {
        screen_region: [128.0, 32.0, 64.0, 64.0],
        sheet_region: [0.0, 16.0/32.0, 16.0/32.0, 16.0/32.0],
    },
    GPUSprite {
        screen_region: [128.0, 128.0, 64.0, 64.0],
        sheet_region: [16.0/32.0, 16.0/32.0, 16.0/32.0, 16.0/32.0],
    },
];

Now we could write gameplay code that modifies screen_region to move our sprites around the screen, or creates new sprites or removes them during play!
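
For instance, nudging the first sprite to the right each frame could be as simple as this (dx is a hypothetical speed; we still have to re-upload the buffer afterwards, which we'll get to below):

// Somewhere in the per-frame update: move sprite 0 rightwards.
let dx = 2.0; // hypothetical speed, in the same units as screen_region
sprites[0].screen_region[0] += dx;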

Detour: Camera and Scrolling

Often, our game worlds are bigger than a single screen—or we might want to zoom in and out on the action. To achieve this aim, we can introduce the metaphor of a "camera" which moves around the world. This ends up introducing yet another new coordinate space. If you're keeping track, that means we now have at least this many spaces:

  • Viewport space (pixels in your window)
  • Framebuffer space (pixels in the GPU's framebuffer texture)
  • Clip space (which GPU vertices are in when they exit the vertex shader)
  • Normalized device coordinate space (clip space after the perspective divide, just before rasterization, more or less)
  • UV coordinate/texture space (in which texture sampling happens)
  • Camera space (within the current view, where is something?)
  • World space (within the whole simulated world, where is something?)

Phew! Anyway, it can be helpful to imagine the camera as a cut-out rectangle which is sliding over the world, so that as the camera moves rightwards in world space, everything else apparently moves leftwards relative to the camera (for example). Here's what a cut-out rectangle like that looks like in code:

#[repr(C)]
#[derive(Clone, Copy, bytemuck::Zeroable, bytemuck::Pod)]
struct GPUCamera {
    screen_pos: [f32;2],
    screen_size: [f32;2]
}

Back in our shader, this is used in just one place: translating the output vertices according to the screen position, then scaling to NDC space by dividing through the screen size (the full shader below also halves the size and subtracts 1 to land in WGPU's -1 to 1 NDC range).

(corner + vec4(which_vtx*size,0.,0.) - vec4(camera.screen_pos,0.,0.)) / vec4(camera.screen_size.x, camera.screen_size.y, 1.0, 1.0)

Who knew cameras were just math all along?
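
To see that it really is just math, here's the same mapping done on the CPU: a hypothetical helper that takes a world-space point to WGPU's -1..1 NDC range, mirroring what the full vertex shader in the next section does (including the halving and the final subtraction):

// Map a world-space point into NDC given the camera's position and size:
// translate by -screen_pos, divide by half the screen size (giving 0..2),
// then subtract 1 to land in -1..1.
fn world_to_ndc(p: [f32; 2], camera: &GPUCamera) -> [f32; 2] {
    [
        (p[0] - camera.screen_pos[0]) / (camera.screen_size[0] / 2.0) - 1.0,
        (p[1] - camera.screen_pos[1]) / (camera.screen_size[1] / 2.0) - 1.0,
    ]
}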

The Sprite Shader

Let's elide for now how exactly we get this buffer into the shader, and look at how our shader code will want to use these values:

// A square made of two rectangles. This makes our vertex shader
// code simpler since we can look up the corners by number.
var<private> VERTICES:array<vec2<f32>,6> = array<vec2<f32>,6>(
    // Bottom left, bottom right, top left; then top left, bottom right, top right.
    vec2<f32>(0., 0.),
    vec2<f32>(1., 0.),
    vec2<f32>(0., 1.),
    vec2<f32>(0., 1.),
    vec2<f32>(1., 0.),
    vec2<f32>(1., 1.)
);

// Our camera struct
struct Camera {
    screen_pos: vec2<f32>,
    screen_size: vec2<f32>
}

// GPUSprite, from before
struct GPUSprite {
    to_rect:vec4<f32>,
    from_rect:vec4<f32>
}

// One binding for the camera...
@group(0) @binding(0)
var<uniform> camera: Camera;
// And another for the sprite buffer
@group(0) @binding(1)
var<storage, read> sprites: array<GPUSprite>;

// Same as before
struct VertexOutput {
    @builtin(position) clip_position: vec4<f32>,
    @location(0) tex_coords: vec2<f32>,
}

@vertex
fn vs_main(@builtin(vertex_index) in_vertex_index: u32,
           // Which instance, i.e. which specific sprite are we drawing now?
           @builtin(instance_index) sprite_index:u32) -> VertexOutput {
    // The corner and size of the sprite in world space.
    // Which sprite? sprites[sprite_index]
    let corner:vec4<f32> = vec4(sprites[sprite_index].to_rect.xy,0.,1.);
    let size:vec2<f32> = sprites[sprite_index].to_rect.zw;
    // The corner and size of the texture area in UVs
    let tex_corner:vec2<f32> = sprites[sprite_index].from_rect.xy;
    let tex_size:vec2<f32> = sprites[sprite_index].from_rect.zw;
    // Which corner of the square we need to draw now (in_vertex_index is in 0..6)
    let which_vtx:vec2<f32> = VERTICES[in_vertex_index];
    // Which corner of the UV square we need to draw (UV coordinates are flipped in Y)
    let which_uv: vec2<f32> = vec2(VERTICES[in_vertex_index].x, 1.0 - VERTICES[in_vertex_index].y);
    return VertexOutput(
        // Offset corner by size * which_vtx to get the right corner, then do camera stuff. Dividing screen size by 2 and the last subtraction are to deal with the NDC coordinate space, which goes from -1 to 1 in WGPU.
        ((corner + vec4(which_vtx*size,0.,0.) - vec4(camera.screen_pos,0.,0.)) / vec4(camera.screen_size/2., 1.0, 1.0)) - vec4(1.0, 1.0, 0.0, 0.0),
        // Offset texture corner by tex_size * which_uv to get the right corner
        tex_corner + which_uv*tex_size
    );
}

// Now our fragment shader needs two "global" inputs to be bound:
// A texture...
@group(1) @binding(0)
var t_diffuse: texture_2d<f32>;
// And a sampler.
@group(1) @binding(1)
var s_diffuse: sampler;
// Both are in the same binding group here since they go together naturally.

// Our fragment shader takes an interpolated `VertexOutput` as input now
@fragment
fn fs_main(in:VertexOutput) -> @location(0) vec4<f32> {
    // And we use the tex coords from the vertex output to sample from the texture.
    let color:vec4<f32> = textureSample(t_diffuse, s_diffuse, in.tex_coords);
    // This is new: if the alpha value of the color is very low, don't draw any fragment here.
    // This is like "cutout" transparency.
    if color.w < 0.2 { discard; }
    return color;
}

The WGPU Code

Before we can tell the driver to send this delicious data over to the GPU, we have to bump up our required GPU limits a bit:

let (device, queue) = adapter
    .request_device(
        &wgpu::DeviceDescriptor {
            label: None,
            features: wgpu::Features::empty(),
            // Bump up the limits to require the availability of storage buffers.
            limits: wgpu::Limits::downlevel_defaults()
                .using_resolution(adapter.limits()),
        },
        None,
    )
    .await
    .expect("Failed to create device");

Then, like before, we need to define a bind group layout for our sprite data:

let sprite_bind_group_layout =
    device.create_bind_group_layout(&wgpu::BindGroupLayoutDescriptor {
        label: None,
        entries: &[
            // The camera binding
            wgpu::BindGroupLayoutEntry {
                // This matches the binding in the shader
                binding: 0,
                // Available in vertex shader
                visibility: wgpu::ShaderStages::VERTEX,
                // It's a buffer
                ty: wgpu::BindingType::Buffer {
                    // Specifically, a uniform buffer
                    ty: wgpu::BufferBindingType::Uniform,
                    has_dynamic_offset: false,
                    min_binding_size: None
                },
                // No count, not a buffer array binding
                count: None,
            },
            // The sprite buffer binding
            wgpu::BindGroupLayoutEntry {
                // This matches the binding in the shader
                binding: 1,
                // Available in vertex shader
                visibility: wgpu::ShaderStages::VERTEX,
                // It's a buffer
                ty: wgpu::BindingType::Buffer {
                    // Specifically, a storage buffer
                    ty: wgpu::BufferBindingType::Storage{read_only:true},
                    has_dynamic_offset: false,
                    min_binding_size: None
                },
                // No count, not a buffer array binding
                count: None,
            },
        ],
    });
let pipeline_layout = device.create_pipeline_layout(&wgpu::PipelineLayoutDescriptor {
    label: None,
    bind_group_layouts: &[&sprite_bind_group_layout, &texture_bind_group_layout],
    push_constant_ranges: &[],
});

It's a lot of text, but it's all pretty mechanical once you know what you're going for.

Let's see again how we make our camera and starting sprites:

let camera = GPUCamera {
    screen_pos: [0.0, 0.0],
    // Consider using config.width and config.height instead,
    // it's up to you whether you want the window size to change what's visible in the game
    // or scale it up and down
    screen_size: [1024.0, 768.0],
};
let sprites:Vec<GPUSprite> = vec![
    GPUSprite {
        screen_region: [32.0, 32.0, 64.0, 64.0],
        sheet_region: [0.0, 16.0/32.0, 16.0/32.0, 16.0/32.0],
    },
    GPUSprite {
        screen_region: [32.0, 128.0, 64.0, 64.0],
        sheet_region: [16.0/32.0, 16.0/32.0, 16.0/32.0, 16.0/32.0],
    },
    GPUSprite {
        screen_region: [128.0, 32.0, 64.0, 64.0],
        sheet_region: [0.0, 16.0/32.0, 16.0/32.0, 16.0/32.0],
    },
    GPUSprite {
        screen_region: [128.0, 128.0, 64.0, 64.0],
        sheet_region: [16.0/32.0, 16.0/32.0, 16.0/32.0, 16.0/32.0],
    },
];

That's all on the CPU. So now we need to create GPU-side buffers to hold their data:

let buffer_camera = device.create_buffer(&wgpu::BufferDescriptor{
    label: None,
    size: bytemuck::bytes_of(&camera).len() as u64,
    usage: wgpu::BufferUsages::UNIFORM | wgpu::BufferUsages::COPY_DST,
    mapped_at_creation: false
});
let buffer_sprite = device.create_buffer(&wgpu::BufferDescriptor{
    label: None,
    size: bytemuck::cast_slice::<_,u8>(&sprites).len() as u64,
    usage: wgpu::BufferUsages::STORAGE | wgpu::BufferUsages::COPY_DST,
    mapped_at_creation: false
});

Then we need to fill the buffers with the actual data:

queue.write_buffer(&buffer_camera, 0, bytemuck::bytes_of(&camera));
queue.write_buffer(&buffer_sprite, 0, bytemuck::cast_slice(&sprites));

And finally, we define a bind group that will actually pull these buffers into the shader:

let sprite_bind_group = device.create_bind_group(&wgpu::BindGroupDescriptor {
    label: None,
    layout: &sprite_bind_group_layout,
    entries: &[
        wgpu::BindGroupEntry {
            binding: 0,
            resource: buffer_camera.as_entire_binding()
        },
        wgpu::BindGroupEntry {
            binding: 1,
            resource: buffer_sprite.as_entire_binding()
        }
    ],
});

And of course later on, we need to actually use this bind group and draw our triangles. For this, we'll use a technique called "instanced rendering", where the same vertices are used on repeat for each instance of an object to be drawn. We still don't have any vertex buffer or index buffer, so it'll just be the vertex indices 0 through 5 on repeat, as many times as we have sprites:

let mut rpass = encoder.begin_render_pass(&wgpu::RenderPassDescriptor {
    label: None,
    color_attachments: &[Some(wgpu::RenderPassColorAttachment {
        view: &view,
        resolve_target: None,
        ops: wgpu::Operations {
            load: wgpu::LoadOp::Clear(wgpu::Color::BLACK),
            store: true,
        },
    })],
    depth_stencil_attachment: None,
});
rpass.set_pipeline(&render_pipeline);
rpass.set_bind_group(0, &sprite_bind_group, &[]);
rpass.set_bind_group(1, &texture_bind_group, &[]);
// draw two triangles per sprite, and sprites-many sprites.
// this uses instanced drawing, but it would also be okay
// to draw 6 * sprites.len() vertices and use modular arithmetic
// to figure out which sprite we're drawing, instead of the instance index.
rpass.draw(0..6, 0..(sprites.len() as u32));

Phew! The last thing to do is update those buffers when we need to. We'll just write out the entire buffer each frame, but it would be better to only modify the parts that change (I'll leave that as an exercise for the reader).

Event::MainEventsCleared => {
    // TODO: move sprites, maybe scroll camera
    // Then send the data to the GPU!
    queue.write_buffer(&buffer_camera, 0, bytemuck::bytes_of(&camera));
    queue.write_buffer(&buffer_sprite, 0, bytemuck::cast_slice(&sprites));

    // ...all the drawing stuff goes here...

    window.request_redraw();
    // Leave now_keys alone, but copy over all changed keys
    input.next_frame();
},
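
As a hint for that exercise: write_buffer takes a byte offset into the destination buffer, so if only one sprite changed, something like this sketch (assuming the buffer layout matches GPUSprite's in-memory layout, which #[repr(C)] gives us) re-sends just that one sprite:

// Upload a single changed sprite instead of the whole slice.
let i = 0; // hypothetical index of the sprite that changed
let offset = (i * std::mem::size_of::<GPUSprite>()) as u64;
queue.write_buffer(&buffer_sprite, offset, bytemuck::bytes_of(&sprites[i]));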

Lab: Fancier Sprites

Once again we're staring down four hundred lines of code, but at this point we have what we could charitably call a sprite renderer and we can get started making Real Games.

This lab is split into two categories: game programming and graphics programming. Pick and choose from the categories and show me what you've got. It might be a solid foundation for your first unit 2 game! Next time around we'll look at how to organize our code a little more cleanly and get more of a game/engine split going.

  • Game programming
    • Use your own spritesheet
    • Arrange a world out of big and small sprites
      • The rotation feature in the graphics programming area could be really useful
    • Have some sprites move randomly or according to some rules
    • Control two different sprites (with different keys?) and notice when they overlap
    • Make a game where you click on sprites to "destroy" them (whatever that means!)
    • Implement scrolling by moving the camera around; possibly implement zooming in and out too
      • Do you need to change GPUCamera and the shader to implement zooming? Why or why not?
    • Implement sprite animation by cycling a sprite between different sheet_regions from time to time
    • Write a convenience wrapper around Texture called SpriteSheet which maps sprite indices or names onto sheet_region values, to make updating the GPUSprites easier
    • Next to sprites, have a parallel Vec of Thing structs
      • Thing could have properties like health, speed, or other "gamey" things
      • As long as things[i] corresponds to sprites[i], it's easy to keep both in sync when you change one
      • Make the sprites' appearance or behavior depend on their corresponding Thing
  • Graphics programming
    • Add a rotation point and angle to GPUSprite, updating the vertex shader
    • Add a color-shifting color to GPUSprite, updating the vertex and fragment shaders
      • It's easiest to pass this color offset data through to the fragment shader from the vertex shader
    • Load multiple images into an array texture (modify load_texture to take &[impl AsRef<std::path::Path>] and use a larger number of depth_or_array_layers, updating the sampling in the fragment shader to match), so that you can have e.g. background objects and foreground sprites from different spritesheets.
    • Make it so that your sprites stay the same shape, no matter how the window is resized
      • Part zero: Figure out your desired screen size and compute its aspect ratio by dividing its width by height (4:3 and 16:9 are common; for example, 1024x768 is 4:3)
      • Approach one: Viewport manipulation
        • Part one: When the window size changes (and at launch), compute the part of the window you want to draw into
          • Hint: either the window is too wide or too tall, so compare the window's aspect ratio to your desired ratio.
            • If it's smaller than yours, that means the window is too narrow and you should only use some of the vertical space and all the horizontal space (but how much? Your aspect ratio will tell you!)
            • If it's bigger than yours, then the window is too wide and you should use all the vertical space and some of the horizontal
            • If it's the same, then you can pick either one
        • Part two: use the set_viewport method of RenderPass to prevent drawing outside of that rectangle (see the sketch after this list)
        • Drawback: If the window is very large or very high-resolution, you'll find that the game slows down as its size increases
      • Approach two: Render to texture
        • Part one: Instead of using the swapchain texture, draw your render pass into your own texture with the size that you selected earlier
        • Part two: Add a second render pass that draws a quad textured with your render target texture, appropriately placed within the viewport (as in approach 1)
        • Besides letting you control the resolution at which you render, this also lets you do post-processing effects on the rendered image! Cool!
        • Drawback: There might be funny interactions between your camera's screen size and your off-screen texture's size that could cause blurriness sometimes

Appendix: Making it work on WebGL2

WebGL2 doesn't support storage buffers, which makes everything a little less nice if we want to support browsers that don't implement the new WebGPU standard.

To support both platforms with storage buffers and platforms without, we can modify our code as follows.

First, near the top of the file we'll add a helpful constant to tell us whether we have storage buffers or not. We'll use a Cargo feature, but we could just as well use target_family = "wasm".

#[cfg(not(feature = "webgl"))]
const USE_STORAGE: bool = true;
#[cfg(feature = "webgl")]
const USE_STORAGE: bool = false;

If we use features we need to define them in Cargo.toml, and propagate the need for webgl to wgpu.

[features]
webgl = ["wgpu/webgl"]

And we need to pick our capabilities based on the constant above:

let (device, queue) = adapter
    .request_device(
        &wgpu::DeviceDescriptor {
            label: None,
            features: wgpu::Features::empty(),
            // NEW!
            limits: if USE_STORAGE {
                wgpu::Limits::downlevel_defaults()
            } else {
                wgpu::Limits::downlevel_webgl2_defaults()
            }
            .using_resolution(adapter.limits()),
        },
        None,
    )
    .await
    .expect("Failed to create device");
let supports_storage_resources = adapter
    .get_downlevel_capabilities()
    .flags
    .contains(wgpu::DownlevelFlags::VERTEX_STORAGE)
    && device.limits().max_storage_buffers_per_shader_stage > 0;

We could instead do all this with just runtime checks and not conditional compilation, but this way we can easily test out different implementations.

In our WebGL2 approach, we'll use vertex buffers, which I have so far avoided explaining. Vertex buffers are traditionally how vertex data is passed to the shader for each vertex being drawn. They also have a use in instanced rendering, where one per-vertex buffer is stepped through once per vertex, while another instance buffer is stepped through once per instance. We're already using the instance index for accessing our sprite data in the shader, so we can use an instance buffer instead of a storage buffer with minimal changes:

// Duplicative definition, but we need to ascribe attribute locations here:
struct InstanceInput {
    @location(0) to_rect: vec4<f32>,
    @location(1) from_rect: vec4<f32>,
};

@vertex
fn vs_vbuf_main(@builtin(vertex_index) in_vertex_index: u32, sprite_data:InstanceInput) -> VertexOutput {
    // We'll still just look up the vertex positions in those constant arrays
    let corner:vec4<f32> = vec4(sprite_data.to_rect.xy,0.,1.);
    let size:vec2<f32> = sprite_data.to_rect.zw;
    let tex_corner:vec2<f32> = sprite_data.from_rect.xy;
    let tex_size:vec2<f32> = sprite_data.from_rect.zw;
    let which_vtx:vec2<f32> = VERTICES[in_vertex_index];
    let which_uv: vec2<f32> = vec2(VERTICES[in_vertex_index].x, 1.0 - VERTICES[in_vertex_index].y);
    return VertexOutput(
        ((corner + vec4(which_vtx*size,0.,0.) - vec4(camera.screen_pos,0.,0.)) / vec4(camera.screen_size/2., 1.0, 1.0)) - vec4(1.0, 1.0, 0.0, 0.0),
        tex_corner + which_uv*tex_size
    );
}

Everywhere we set up a storage buffer for our sprite data, we now need to fall back to a vertex buffer when storage buffers aren't available. It's a little annoying because vertex buffers aren't bound in bind groups; instead, they are given explicitly in the vertex state of the render pipeline.

// ...
let sprite_bind_group_layout = if supports_storage_resources {
    // same as before
} else {
    // just the camera here
    device.create_bind_group_layout(&wgpu::BindGroupLayoutDescriptor {
        label: None,
        entries: &[camera_layout_entry],
    })
};
// ... sometimes it's a storage buffer, sometimes it's a vertex buffer
let buffer_sprite = device.create_buffer(&wgpu::BufferDescriptor {
    label: None,
    size: sprites.len() as u64 * std::mem::size_of::<GPUSprite>() as u64,
    usage: if supports_storage_resources {
        wgpu::BufferUsages::STORAGE
    } else {
        wgpu::BufferUsages::VERTEX
    } | wgpu::BufferUsages::COPY_DST,
    mapped_at_creation: false,
});
// and we have a different layout for the storage vs vertex buffer approaches
let sprite_bind_group = if supports_storage_resources {
    // same as before
} else {
    device.create_bind_group(&wgpu::BindGroupDescriptor {
        label: None,
        layout: &sprite_bind_group_layout,
        entries: &[wgpu::BindGroupEntry {
            binding: 0,
            resource: buffer_camera.as_entire_binding(),
        }],
    })
};

Finally, we are going to need different vertex shader entry points and buffer descriptions depending on whether storage buffers are available (here the storage-buffer vertex shader from earlier is assumed to be renamed vs_storage_main):

let render_pipeline = device.create_render_pipeline(&wgpu::RenderPipelineDescriptor {
    label: None,
    layout: Some(&pipeline_layout),
    vertex: wgpu::VertexState {
        module: &shader,
        entry_point: if supports_storage_resources {
            "vs_storage_main"
        } else {
            "vs_vbuf_main"
        },
        buffers: if supports_storage_resources {
            &[]
        } else {
            // one vertex buffer...
            &[wgpu::VertexBufferLayout {
                // where each element is sizeof(GPUSprite) long...
                array_stride: std::mem::size_of::<GPUSprite>() as u64,
                // stepped once per instance...
                step_mode: wgpu::VertexStepMode::Instance,
                attributes: &[
                    // Where the first 16 bytes are a float32x4...
                    wgpu::VertexAttribute {
                        format: wgpu::VertexFormat::Float32x4,
                        offset: 0,
                        // at attribute location 0
                        shader_location: 0,
                    },
                    wgpu::VertexAttribute {
                        // and the next 16 bytes...
                        offset: std::mem::size_of::<[f32; 4]>() as u64,
                        // are also a float32x4
                        format: wgpu::VertexFormat::Float32x4,
                        // at attribute location 1
                        shader_location: 1,
                    },
                ],
            }]
        },
    },
    //...
});

Finally, we need to bind the vertex buffer (since it doesn't live in a normal bind group) during rendering:

// ... same as before
rpass.set_pipeline(&render_pipeline);
if !supports_storage_resources {
    rpass.set_vertex_buffer(0, buffer_sprite.slice(..));
}
rpass.set_bind_group(0, &sprite_bind_group, &[]);
rpass.set_bind_group(1, &texture_bind_group, &[]);
// ... same as before

It's a little bit tedious but it does work! And now one program should run fine on top of either the latest and greatest Vulkan or the oldest and most basic WebGL2 implementation.

Thanks to April U. in 181G Fall '23 for the discussion around instance-rate vertex buffers vis-a-vis storage buffers!