Fomenko

Big Winter Cleanup

By Alexandre Chêne | March 01, 2026

These last few days I got distracted with rendering stuff. The renderer had become overly complex, and now that I’m back to working on the project by myself, it was time to think about how to make the code as simple as possible and get a sense that everything is under control.

Meanwhile, for my current client, I’m dealing with d3d11 and I kinda like the API. It’s been a while since I played with a graphics API that has an immediate-mode feel and is structured as a giant state machine. So I decided to slowly shape my renderer along the same lines. The first thing was the PSO (pipeline state object). That part is pretty simple: I have a global graphics context with everything in it, like which vertex and pixel shaders we want to use, the color formats, blend modes, depth format, and everything else. I have a bunch of functions like set_depth_state() or set_color_format() which mutate that state. And at the very end, some code transforms this state into a key used to look up an existing PSO in a table of cached ones.
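To make the idea concrete, here is a minimal sketch of that flow in C++: setters mutate a global state, and the PSO is resolved from a cache at the last moment. Only set_depth_state() and set_color_format() are named in the post; the enums, the shader ids, the FNV-1a hash and the Pso stand-in are assumptions for illustration, not the actual renderer code.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <unordered_map>

// Hypothetical enums; the real formats and modes are not shown in the post.
enum class Color_Format : uint32_t { RGBA8, RGBA16F };
enum class Depth_Format : uint32_t { None, D32F };
enum class Blend_Mode   : uint32_t { Opaque, Alpha };

// Everything that influences pipeline creation. Only 32-bit fields are used,
// so the struct has no padding bytes and can be hashed as raw memory.
struct Pipeline_State {
    uint32_t     vertex_shader;  // shader ids (assumed representation)
    uint32_t     pixel_shader;
    Color_Format color_format;
    Depth_Format depth_format;
    Blend_Mode   blend_mode;
};
static_assert(sizeof(Pipeline_State) == 5 * sizeof(uint32_t), "no padding expected");

struct Pso { uint32_t id; };  // stand-in for a real backend pipeline object

struct Graphics_Context {
    Pipeline_State state;
    std::unordered_map<uint64_t, Pso> pso_cache;
    uint32_t psos_created;
};

static Graphics_Context g_ctx = {};  // the global context, zero-initialized

// State-machine style setters: they only mutate the global state.
void set_shaders(uint32_t vs, uint32_t ps) {
    g_ctx.state.vertex_shader = vs;
    g_ctx.state.pixel_shader  = ps;
}
void set_color_format(Color_Format f) { g_ctx.state.color_format = f; }
void set_depth_state(Depth_Format f)  { g_ctx.state.depth_format = f; }
void set_blend_mode(Blend_Mode m)     { g_ctx.state.blend_mode = m; }

// FNV-1a over the raw bytes of the state (one possible choice of key).
uint64_t hash_state(const Pipeline_State& s) {
    unsigned char bytes[sizeof(Pipeline_State)];
    std::memcpy(bytes, &s, sizeof(s));
    uint64_t h = 14695981039346656037ull;
    for (unsigned char b : bytes) {
        h ^= b;
        h *= 1099511628211ull;
    }
    return h;
}

// Called right before a draw: turn the current state into a key and
// reuse a cached PSO when one exists. Hash collisions are ignored here.
const Pso& get_or_create_pso() {
    assert(g_ctx.state.vertex_shader != 0 && "shaders must be set before drawing");
    const uint64_t key = hash_state(g_ctx.state);
    auto it = g_ctx.pso_cache.find(key);
    if (it == g_ctx.pso_cache.end()) {
        // A real backend would create the pipeline object here.
        Pso created = { g_ctx.psos_created++ };
        it = g_ctx.pso_cache.emplace(key, created).first;
    }
    return it->second;
}
```

With this shape, drawing code never touches PSOs directly: it sets state, draws, and the cache decides whether anything new has to be created.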

Next, I realized that we had many different shaders serving the same purpose. For example, our depth pre-pass shader was spread across 3 different shader files: one for opaque materials, one for terrain materials and one for foliage materials. I decided to go with a single shader with a “big” switch in it. It’s dead simple, and it’s not a problem because we’re 100% sure all threads in a warp go into the same branch, so they skip the others at no extra cost. Again, simplicity is key.

Slang is used for shaders, and it has worked pretty well so far. The only thing, though, is that we were translating Slang shaders to SPIR-V, then SPIR-V to Metal/HLSL/GLSL. I decided to get rid of the SPIR-V part and compile directly to Metal/HLSL. The generated target code is much clearer, and it removes an extra step.

About sending CPU data to the shaders: I removed a lot of the shenanigans we did to automatically pad our data according to the layout we were targeting, either std140 or std430. I got rid of all that, and instead manually pad the couple of structs we have. I wrote some strict rules, following std140 plus some constraints, like no 3x3 matrices, no 3-float vectors, etc. Ah! And I must add, I got rid of OpenGL completely as well. So we don’t have to take care of std430 anymore, which is pretty nice, because structs padded this way also match what d3d11, and later d3d12, expect for constant buffers!
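As an illustration of those manual padding rules, here is a small C++ sketch of a CPU-side struct laid out to match std140, with compile-time checks on the offsets. The Frame_Constants struct, its fields and write_constants() are hypothetical examples, not structs from the actual codebase.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct Vec4 { float x, y, z, w; };
struct Mat4 { Vec4 columns[4]; };

// Hypothetical per-frame constants following the rules from the post:
// only 4-float vectors and 4x4 matrices, scalars grouped together,
// and explicit padding up to a multiple of 16 bytes.
struct alignas(16) Frame_Constants {
    Mat4  view_proj;        // offset 0, 64 bytes
    Vec4  camera_position;  // offset 64, xyz used, w unused (no 3-float vectors)
    Vec4  sun_direction;    // offset 80, same trick
    float time;             // offset 96
    float exposure;         // offset 100
    float _pad0;            // offset 104, manual padding
    float _pad1;            // offset 108, manual padding
};

// If someone reorders or adds a field, these fail at compile time
// instead of silently corrupting shader data.
static_assert(offsetof(Frame_Constants, camera_position) == 64, "layout mismatch");
static_assert(offsetof(Frame_Constants, sun_direction) == 80, "layout mismatch");
static_assert(offsetof(Frame_Constants, time) == 96, "layout mismatch");
static_assert(offsetof(Frame_Constants, exposure) == 100, "layout mismatch");
static_assert(sizeof(Frame_Constants) == 112, "size must match the shader side");
static_assert(sizeof(Frame_Constants) % 16 == 0, "size must be a multiple of 16");

// Copies the constants into a mapped GPU buffer (here just a byte pointer).
void write_constants(void* dst, size_t dst_size, const Frame_Constants& c) {
    assert(dst_size >= sizeof(Frame_Constants));
    (void)dst_size;
    std::memcpy(dst, &c, sizeof(c));
}
```

The padding is declared as two separate floats rather than a float array on purpose: in std140 every array element takes a full 16 bytes, so mirroring a float array on the shader side would change the layout.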

Our asset system was a mess. We were using polymorphism to declare one struct for all our asset types, then compile-time code to do all the necessary branching. I removed that, because we only have 3-4 types of assets: materials, textures, anims and meshes. Right now, it’s a simple hash table per asset type. To access an item, we hash the asset’s filepath and that’s it. In the codebase we use Texture_Handle or Mesh_Handle, which contain a string name and its corresponding hash, computed only once. Later on, I’m going to convert the hash tables to arrays and use known indices instead, because indexing into an array is even cheaper than a hash table lookup.
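Here is a rough C++ sketch of what one of those per-type tables could look like. Texture_Handle is the name used in the post; hash_path(), Texture_Table, the Texture payload and the file paths are made up for the example, and FNV-1a is just one possible hash.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>

// FNV-1a over the file path (an assumed choice of hash function).
uint64_t hash_path(const std::string& path) {
    uint64_t h = 14695981039346656037ull;
    for (unsigned char c : path) {
        h ^= c;
        h *= 1099511628211ull;
    }
    return h;
}

struct Texture { int width; int height; };  // placeholder payload

// The handle stores the name and its hash; the hash is computed once,
// when the handle is constructed.
struct Texture_Handle {
    std::string name;
    uint64_t    hash;
    explicit Texture_Handle(std::string path)
        : name(std::move(path)), hash(hash_path(name)) {}
};

// One plain hash table per asset type, keyed by the precomputed hash.
struct Texture_Table {
    std::unordered_map<uint64_t, Texture> items;

    void add(const Texture_Handle& h, Texture t) {
        const bool inserted = items.emplace(h.hash, t).second;
        assert(inserted && "duplicate path or hash collision");
        (void)inserted;
    }

    // Returns nullptr when the asset is not loaded.
    Texture* find(const Texture_Handle& h) {
        auto it = items.find(h.hash);
        return it == items.end() ? nullptr : &it->second;
    }
};
```

A Mesh_Handle and a mesh table would follow the same pattern with a different payload, which is the point: a few small, boring tables instead of one polymorphic asset struct.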

Finally, I reviewed all textures and materials, and decided to put some constraints in place: all material textures now share the same 1024 x 1024 size and pixel format. At runtime, I upload all those textures into one giant texture array on the GPU, and use an index to retrieve the one we want. Because meshes carry all the texture information they need through their instance data, I can batch even more meshes together, and it feels like I just bind a single texture. Pretty neat!
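The CPU side of that scheme might look roughly like the C++ sketch below: each texture gets a layer index in the array, instances store those indices, and batching only has to group by mesh. All type names, field names and the Texture_Array stand-in are assumptions for illustration; the actual GPU upload is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

constexpr uint32_t TEXTURE_SIZE = 1024;  // shared size from the post

// CPU-side stand-in for the GPU texture array: it only hands out layer indices.
struct Texture_Array {
    uint32_t layer_count = 0;

    uint32_t add_layer(uint32_t width, uint32_t height) {
        // Every layer must have the same size (and, in real code, pixel format).
        assert(width == TEXTURE_SIZE && height == TEXTURE_SIZE);
        (void)width; (void)height;
        // A real backend would upload the pixels into slice layer_count here.
        return layer_count++;
    }
};

// Per-instance data: the texture choice travels with the instance,
// so it is no longer part of the draw call state.
struct Instance_Data {
    float    transform[16];
    uint32_t albedo_index;  // layer in the texture array
    uint32_t normal_index;  // layer in the texture array
    uint32_t _pad[2];       // keeps the struct a multiple of 16 bytes
};

struct Mesh_Instance {
    uint32_t      mesh_id;
    Instance_Data data;
};

// Instances are grouped by mesh only. Two instances of the same mesh with
// different textures still end up in the same batch.
std::map<uint32_t, std::vector<Instance_Data>>
build_batches(const std::vector<Mesh_Instance>& instances) {
    std::map<uint32_t, std::vector<Instance_Data>> batches;
    for (const Mesh_Instance& mi : instances) {
        batches[mi.mesh_id].push_back(mi.data);
    }
    return batches;
}
```

Each entry of the returned map would become one instanced draw, with the texture array bound once for all of them.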

–––
Don't hesitate to reach out on bluesky or via twitter.