Perfect is the enemy of the good. This is a lesson I personally struggle with, and it's the battle I've been fighting this week.
This week's goal was to get game-related pixels on the screen. No design work, no concept art, no writing lore. Get something actually working.
The First Screenshot
Here's the result:
This terrain is actually a simplified version of something I had working at the end of last year. But I was ruthless with myself this time. I ripped out everything that made it complicated:
- The original version had a more complex texture varying with height and coastline, plus noise, to try and make it look pretty. All gone. I can make it look pretty when there's something more than just hills and water.
- The vertex shader did a bunch of processing at vertices in the centre of each tile to interpolate the slopes correctly. In the new version, the central vertices are generated on the CPU, which is much simpler and much easier to reason about.
- The entire world was dynamically mapped onto the surface of a paraboloid centered on the camera, to give the illusion of a spherical planet with a curved horizon. It looked really cool, but it also complicated mouse picking like nobody's business. It's gone.
Without that stuff it's not only simpler, it's better code, too. My goal is to rid myself of the distractions, and find firm ground to build a game on.
I couldn't resist putting some texture on it, though - completely flat colours are just too ugly!
Edge Cases are Not Worth It
A heightmap is a big 2D array, like an image. Several things I need to do involve looking at each sample's immediate neighbours:
- Calculating normals.
- Interpolating between samples for the vertex in the centre of each tile.
- Filtering the height samples to smooth the terrain.
Finding the neighbours of a sample in a regular grid is easy. Until you get to the samples at the very edge, which are missing one or more of their neighbours. You have two options:
- Put conditionals in your inner loop to check for the edges. Makes my optimization muscle itch - branching should be avoided in tight loops like this.
- Write nine versions of your compute kernel - the normal one, one for each of four edges, and one for each of four corner samples. Yuck.
But of course there is also a third option:
- Allocate an array slightly larger than you need, providing a one-sample border around the edges. Place default - or wrapped - values into these border samples.
You can then run the kernel on all the inner samples without worrying about the edges, and they'll pick up the border values automatically. This greatly simplifies your code at the cost of a very small amount of memory.
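As a rough sketch of the border trick in C: the `PaddedMap` type and the `pm_*` helpers are names made up for illustration, and float samples with wrapped border values are assumptions. The kernel at the end reads both horizontal neighbours of every inner sample without a single edge check.

```c
#include <stdlib.h>

/* Hypothetical heightmap with a one-sample border: storage is (w + 2) x (h + 2). */
typedef struct {
    int w, h;     /* size of the inner (real) region */
    float *data;  /* (w + 2) * (h + 2) samples, row-major */
} PaddedMap;

/* Address sample (x, y); valid for x in [-1, w] and y in [-1, h]. */
static float *pm_at(const PaddedMap *m, int x, int y) {
    return &m->data[(size_t)(y + 1) * (size_t)(m->w + 2) + (size_t)(x + 1)];
}

/* Allocate a zeroed map; the caller frees m.data. */
static PaddedMap pm_create(int w, int h) {
    PaddedMap m = { w, h, calloc((size_t)(w + 2) * (size_t)(h + 2), sizeof(float)) };
    return m;
}

/* Copy wrapped values into the border so the map tiles seamlessly. */
static void pm_fill_border_wrapped(PaddedMap *m) {
    /* Left and right border columns first. */
    for (int y = 0; y < m->h; y++) {
        *pm_at(m, -1, y) = *pm_at(m, m->w - 1, y);
        *pm_at(m, m->w, y) = *pm_at(m, 0, y);
    }
    /* Then top and bottom rows across the full padded width,
       which also fills the four corners. */
    for (int x = -1; x <= m->w; x++) {
        *pm_at(m, x, -1) = *pm_at(m, x, m->h - 1);
        *pm_at(m, x, m->h) = *pm_at(m, x, 0);
    }
}

/* Example kernel: central-difference slope in x, with no edge checks. */
static void pm_slope_x(const PaddedMap *m, float *out) {
    for (int y = 0; y < m->h; y++)
        for (int x = 0; x < m->w; x++)
            out[y * m->w + x] = 0.5f * (*pm_at(m, x + 1, y) - *pm_at(m, x - 1, y));
}
```

The border has to be refreshed whenever the inner samples change, which is the price of keeping the inner loop branch-free.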
The Furrow Filter
The hills are generated using the standard diamond square algorithm, then smoothed out a little bit using a filter that averages samples together.
But when I looked at the terrain closely, I discovered that something was causing strange ripples or furrows across the landscape:
I couldn't work out the cause of this effect. Was it a precision issue - the height of each sample is stored as a single byte? Was it a bug in the interpolation that generates the central vertices?
I investigated. I expanded the precision to floats. I painted the central vertices a different colour. I ruled out both possibilities - the furrows persisted, and not just in the central vertices.
The aliasing was actually down to the construction of my smoothing filter. My original filter only averaged adjacent samples along the horizontal and vertical axes. For a proper box filter, you actually need all of the adjacent samples, including the diagonals.
If I ever need a filter that produces furrows, I know how to build one...
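To make the difference concrete, here is a minimal C sketch of the two kernels. The function names are hypothetical, the input is assumed to carry the one-sample border described earlier, and including the centre sample in both averages is an assumption of this sketch.

```c
/* src is padded: (w + 2) x (h + 2) samples, row-major. dst is w x h. */

/* Axis-only average: centre plus left, right, up and down.
   This is the shape of filter that produced the furrows. */
static void smooth_cross(const float *src, float *dst, int w, int h) {
    int stride = w + 2;
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            const float *c = &src[(y + 1) * stride + (x + 1)];
            dst[y * w + x] = (c[0] + c[-1] + c[1] + c[-stride] + c[stride]) / 5.0f;
        }
    }
}

/* 3x3 box filter: all nine samples, diagonals included. */
static void smooth_box(const float *src, float *dst, int w, int h) {
    int stride = w + 2;
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            const float *c = &src[(y + 1) * stride + (x + 1)];
            float sum = 0.0f;
            for (int dy = -1; dy <= 1; dy++)
                for (int dx = -1; dx <= 1; dx++)
                    sum += c[dy * stride + dx];
            dst[y * w + x] = sum / 9.0f;
        }
    }
}
```

The axis-only kernel never sees the diagonal samples at all, so any height that lives only on the diagonals passes through it untouched.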
texelFetch() Does Not Wrap
The heightmap is a texture, and each sample a pixel. The terrain mesh doesn't have heights baked in. It's actually a flat grid, and each vertex is given a height in the vertex shader by sampling the texture.
This way we can use a single, fairly small terrain mesh to render any section of the full terrain data.
Reading the OpenGL specification carefully, I found that texelFetch() - which I'm using to read the height sample from an integer texture - doesn't use the sampler's wrapping or edge clamping modes. Its result is actually undefined outside the bounds of the texture.
So you have to be careful to do wrapping/clamping yourself!
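The index arithmetic is small. Here it is sketched in C rather than GLSL, with hypothetical helper names; the same logic would be applied to each texel coordinate before the fetch.

```c
/* Wrap i into [0, n), handling negative indices.
   C's % can return a negative remainder, so it needs correcting. */
static int wrap_index(int i, int n) {
    int r = i % n;
    return r < 0 ? r + n : r;
}

/* Clamp i into [0, n - 1], like clamp-to-edge. */
static int clamp_index(int i, int n) {
    if (i < 0) return 0;
    if (i >= n) return n - 1;
    return i;
}
```

For example, with a texture 8 texels wide, `wrap_index(-1, 8)` gives 7 and `clamp_index(12, 8)` gives 7.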
More Engine Work
I didn't escape low-level stuff completely this week. Moving the camera requires information from the scroll wheel, and I wanted the ability to drag the camera around without actually moving the mouse pointer.
Pointer lock is one of those things that seems like it should be a single function call, but actually requires a little bit of hackery on most platforms. The way toolkits end up doing it is:
- Hide the cursor.
- Warp the pointer to the centre of your window.
- Whenever the mouse moves, calculate the delta between its current and previous position. Report the delta to the game or application.
- If the pointer moves too far from the centre, warp it back. Ignore the resulting mouse movement when calculating deltas.
- When you unlock the pointer, warp it back to its original position and unhide the cursor.
macOS actually has a handy function which removes the need for all that, but - at least as far as I could determine - X11 needs the full incantation.
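The delta and re-centring steps of the full incantation can be sketched as a small, platform-independent state machine in C. All the names here are hypothetical, and the sketch assumes the warp itself shows up as a motion event at exactly the centre position. Hiding the cursor and performing the warp are left to the caller.

```c
#include <stdlib.h>

/* State for the warp-and-delta scheme. Positions are window coordinates. */
typedef struct {
    int cx, cy;          /* window centre, where the pointer is parked */
    int last_x, last_y;  /* position seen in the previous motion event */
    int max_drift;       /* re-centre once the pointer strays this far */
    int warp_pending;    /* a warp was requested; its event has not arrived yet */
} PointerLock;

typedef struct {
    int dx, dy;  /* relative motion to report to the game */
    int warp;    /* non-zero: caller should warp the pointer to (cx, cy) now */
} LockedMotion;

/* Call after hiding the cursor and warping the pointer to the centre. */
static PointerLock pointer_lock_begin(int cx, int cy, int max_drift) {
    PointerLock lk = { cx, cy, cx, cy, max_drift, 0 };
    return lk;
}

/* Feed every absolute motion event through this while the lock is held. */
static LockedMotion pointer_lock_motion(PointerLock *lk, int x, int y) {
    LockedMotion out = { 0, 0, 0 };
    if (lk->warp_pending && x == lk->cx && y == lk->cy) {
        /* This motion was caused by our own warp: report no delta. */
        lk->warp_pending = 0;
    } else {
        out.dx = x - lk->last_x;
        out.dy = y - lk->last_y;
        if (!lk->warp_pending &&
            (abs(x - lk->cx) > lk->max_drift || abs(y - lk->cy) > lk->max_drift)) {
            /* Strayed too far: ask the caller to warp back to the centre. */
            lk->warp_pending = 1;
            out.warp = 1;
        }
    }
    lk->last_x = x;
    lk->last_y = y;
    return out;
}
```

Motion events that were already in flight when the warp was requested still produce correct deltas, because each one is measured against the previous real position until the warp's own event arrives.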
xcb and poll()/select()
Something that was bugging me about my X11 backend was a strange effect that occurred when resizing the window.
I should receive ConfigureNotify events from the server as the window resizes. But if my game was busy rendering frames while the user was dragging, I didn't receive any events. Until the end of the resize, at which point I received a whole batch of events all at once.
Was the X server somehow aware that I was busy drawing? Was there something I was missing about Linux's implementation of poll() which meant that I was missing wakeup notifications on the X socket?
The engine's event loop does this:
- Wait for incoming data on either the X socket or an internal pipe, using poll().
- If there's incoming data from the X server, read and process all X messages by calling xcb_poll_for_event() in a loop.
- If there's incoming data on the pipe, we've sent ourselves a message - probably a vsync notification telling us it's time for the next frame.
- If we got a vsync notification, process a frame.
- Call xcb_flush() to ensure queued outgoing messages are sent to the X server, then loop back to wait for more incoming data.
But I was missing something about xcb. xcb_poll_for_event() is not the only xcb function which can end up reading from the X socket. There is an internal queue of incoming packets inside the xcb library, and events can be read from the socket and placed on the queue by essentially any xcb function.
I was missing ConfigureNotify events because:
- My code runs its xcb_poll_for_event() loop. After this, there are no more queued events inside xcb, and no more data is available on the socket.
- ConfigureNotify arrives on the socket before my code calls poll().
- xcb_flush() - or another xcb function - causes xcb to read the ConfigureNotify from the socket and place it on the internal queue.
- My code calls poll(), but even though there is an unprocessed event in the queue, there is no longer any incoming data on the socket.
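The sequence above can be reproduced with a toy model in C. This is not the real xcb API: `FakeConn` and the `fake_*` functions are stand-ins invented for illustration, each mimicking only the behaviour described in this post. The model compares the failing order with one that drains the queue last.

```c
/* Toy model of an xcb-style connection. NOT the real xcb API. */
typedef struct {
    int on_socket;  /* events sent by the server, not yet read */
    int in_queue;   /* events read from the socket, not yet processed */
    int handled;    /* events our loop has processed */
} FakeConn;

/* Stand-in for xcb_flush(): as a side effect it may pull
   incoming data off the socket into the internal queue. */
static void fake_flush(FakeConn *c) {
    c->in_queue += c->on_socket;
    c->on_socket = 0;
}

/* Stand-in for xcb_poll_for_event(): reads the socket, then
   processes one queued event. Returns 0 when nothing is left. */
static int fake_poll_for_event(FakeConn *c) {
    c->in_queue += c->on_socket;
    c->on_socket = 0;
    if (c->in_queue == 0) return 0;
    c->in_queue--;
    c->handled++;
    return 1;
}

/* Stand-in for poll() on the X socket: it only sees unread socket data. */
static int fake_socket_readable(const FakeConn *c) {
    return c->on_socket > 0;
}

/* Failing order: drain, event arrives, flush, wait.
   Returns how many events sit in the queue while we would block. */
static int stranded_drain_then_flush(void) {
    FakeConn c = { 0, 0, 0 };
    while (fake_poll_for_event(&c)) {}
    c.on_socket++;  /* ConfigureNotify arrives */
    fake_flush(&c);
    return fake_socket_readable(&c) ? 0 : c.in_queue;
}

/* Reordered: event arrives, flush, drain, wait. */
static int stranded_flush_then_drain(void) {
    FakeConn c = { 0, 0, 0 };
    c.on_socket++;  /* ConfigureNotify arrives */
    fake_flush(&c);
    while (fake_poll_for_event(&c)) {}
    return fake_socket_readable(&c) ? 0 : c.in_queue;
}
```

In the reordered version, an event that arrives after the drain is still sitting on the socket when the wait begins, so the wait wakes up for it.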
So if we don't want to miss messages, it's only actually safe to wait when we know that all messages that xcb has read from the socket have been processed - i.e. directly after the xcb_poll_for_event() loop.
My empty landscape needs two things:
- People and other animals.
That's this week's goal. Surely I can manage at least one, if not both?!