Now that we’re all on the same page (or at least within a few pages of each other), let’s see what Direct3D 12 brings to the table of low-level APIs. Well, as best we can anyway: as we said on the previous page, we won’t be seeing DX12 emerge until late next year (sometime around Holiday 2015 is the best date we’ve been given thus far). Microsoft aren’t being too open-mouthed regarding the details of DX12. When I asked about unannounced features, my probing questions were met with responses lifted from their developer blogs (can’t blame me for trying, right?).
On the surface, few would deny that Microsoft’s new love child is just a little bit similar to AMD’s own Mantle technology. I did probe AMD regarding this, and they assured me during our interview that DX12 is Microsoft’s own creation. They didn’t hesitate, however, to mention that Mantle had inspired some of the decisions. Regardless, when two companies set out to accomplish similar goals, similar results are inevitable (particularly since both APIs are still in development).
Command Lists and Command Buffers in D3D 12
In this case, the primary goal is to reduce CPU overhead and improve multi-threading on PC. There’s little doubt that the largest source of these overhead woes is the Command Buffers and Command Lists, which the CPU assembles for the GPU to process. DX12 is going to overhaul this totally by handing the task to developers, who will create their own command lists. This means a developer can more easily take advantage of multi-core CPUs and spread the workload over the CPU cores. If you’re hoping for a sneak peek at what this looks like, you’ll be disappointed: right now we have neither a preview nor any comments on how it will function.
To provide some clarity on what Command Buffers and Lists are, here’s a brief overview. GPUs are “in-order” processors: they work through commands in the order they arrive, so the data must come in the right order and must not be corrupted along the way. Because GPUs are a shared resource (in other words, multiple applications use this hardware), it’s important that each application has its own private bit of memory. Through the Command Buffer you tell the GPU to fetch data and then to process it.
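Since Microsoft haven’t shown any real code yet, here is a rough sketch of the idea in plain C++. It is a toy model, not the D3D12 API: `CommandList`, `record_chunk` and `build_frame` are names I’ve made up for illustration. Each CPU thread records into its own private list, and the lists are then handed over in a fixed order so the in-order “GPU” always sees the same sequence.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a command list: an ordered series of commands.
using CommandList = std::vector<std::string>;

// Each worker records into its own private list, so no locking is needed.
CommandList record_chunk(std::size_t first_object, std::size_t count) {
    CommandList list;
    for (std::size_t i = first_object; i < first_object + count; ++i) {
        list.push_back("draw object " + std::to_string(i));
    }
    return list;
}

// Record on `workers` threads, then submit the lists in a fixed order.
// Recording happens in parallel; submission order stays deterministic.
std::vector<std::string> build_frame(std::size_t workers,
                                     std::size_t objects_per_worker) {
    assert(workers > 0);
    std::vector<CommandList> lists(workers);
    std::vector<std::thread> threads;
    for (std::size_t w = 0; w < workers; ++w) {
        // Every thread writes to a different element of `lists`.
        threads.emplace_back([&lists, w, objects_per_worker] {
            lists[w] = record_chunk(w * objects_per_worker, objects_per_worker);
        });
    }
    for (auto& t : threads) t.join();

    std::vector<std::string> queue;  // what the GPU would consume, in order
    for (const auto& list : lists) {
        queue.insert(queue.end(), list.begin(), list.end());
    }
    return queue;
}
```

Calling `build_frame(4, 3)` records twelve draws across four threads. The point of the design is that the expensive part (recording) spreads over the cores, while the cheap part (submission) stays serial and ordered.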
Microsoft are reporting that they will be introducing bundles, which are pretty much what you’d expect given the name. They are ‘bundles’ of Command Buffers which are reusable. Bundles save CPU time because a whole bundle can be executed in one go, rather than the CPU pushing out a bunch of separate commands. That’s not to say that a single bundle (let’s call it bundle A) always has to contain exactly the same commands (for the sake of simplicity, let’s say the commands are 1+1, 2+2 and 3+3). There can be some minor variance, so bundle A could issue 1+1, 2+2 and then 3+3 the first time, but the second time, if desired, the last command could be altered to, say, 3-3. Microsoft used the example of a character being drawn twice: once with one set of textures, and again with a different set.
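To make the character example a little more concrete, here is a minimal sketch of how a reusable bundle might behave. Again this is a toy model in plain C++ and not Microsoft’s API: `Bundle`, `execute_bundle` and the texture names are all hypothetical. The commands are recorded once, and the part that varies (the texture) is supplied each time the bundle is executed.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model: a bundle is a command sequence recorded once.
// The texture is not baked in; it is supplied at execution time.
struct Bundle {
    std::vector<std::string> commands;
};

// Executing a bundle appends its recorded commands to the frame's command
// stream, after binding whichever texture the caller chose.
void execute_bundle(const Bundle& bundle, const std::string& texture,
                    std::vector<std::string>& stream) {
    assert(!bundle.commands.empty());
    stream.push_back("bind texture " + texture);
    for (const auto& cmd : bundle.commands) stream.push_back(cmd);
}

// Draw the same character twice with two different texture sets.
std::vector<std::string> draw_character_twice() {
    Bundle character;  // recorded once, reused below
    character.commands = {"set character mesh", "draw"};

    std::vector<std::string> stream;
    execute_bundle(character, "armour_red", stream);
    execute_bundle(character, "armour_blue", stream);
    return stream;
}
```

The resulting stream holds six commands, but only two of them (the texture binds) differ between the two draws. Everything else is replayed from the bundle.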
Pipeline State Objects
Another key feature will be Direct3D 12’s Pipeline State Object (PSO). Previous iterations of D3D let the application change individual pieces of pipeline state independently of one another. Most modern GPUs simply don’t operate this way. The driver cannot resolve the final hardware state until “draw time”, which increases CPU overhead and vastly reduces the number of draw calls that fit into a given frame of animation. A PSO groups that state into a single object built ahead of time, so the driver no longer has to work out the hardware state at every draw.
Once again, the most important thing to remember is that there are thousands of draw calls per frame of animation, and considering that many gamers target at least 30 frames per second (with PC gamers really wanting to play at 60), this is vital for nudging up frame rates. Each and every object in the game’s world must have its textures and effects calculated. A draw call is basically the process of asking for an object to be drawn on screen.
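Here’s a small sketch of why that matters. It is a deliberately simplified model with made-up names (`PipelineDesc`, `Driver`, `create_pso`), not the real D3D12 interface, and the “hardware state” is just a number. What it shows is the difference in how often the expensive resolve step runs.

```cpp
#include <cassert>
#include <string>

// Hypothetical, much-simplified pipeline description.
struct PipelineDesc {
    std::string vertex_shader;
    std::string pixel_shader;
    bool blending = false;
    bool depth_test = true;
};

// Stand-in for the driver work of turning a description into hardware
// state. The counter records how many times that work was done.
struct Driver {
    int resolves = 0;
    int resolve(const PipelineDesc& d) {
        ++resolves;
        // Fake "hardware state": fold the settings into a single number.
        return static_cast<int>(d.vertex_shader.size() + d.pixel_shader.size()) * 4
               + (d.blending ? 2 : 0) + (d.depth_test ? 1 : 0);
    }
};

// Old style: the state is only final at draw time, so resolve on every draw.
int draw_without_pso(Driver& drv, const PipelineDesc& d, int draws) {
    assert(draws >= 0);
    int hw = 0;
    for (int i = 0; i < draws; ++i) hw = drv.resolve(d);
    return hw;
}

// PSO style: resolve once up front, then every draw reuses the result.
struct PipelineStateObject {
    int hardware_state;
};

PipelineStateObject create_pso(Driver& drv, const PipelineDesc& d) {
    return PipelineStateObject{drv.resolve(d)};
}

int draw_with_pso(const PipelineStateObject& pso, int draws) {
    assert(draws >= 0);
    int hw = 0;
    for (int i = 0; i < draws; ++i) hw = pso.hardware_state;
    return hw;
}
```

For 1,000 draws with the same settings, the first path resolves the state 1,000 times and the PSO path resolves it once, yet both end up with identical hardware state.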
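To put a rough number on it, here is a back-of-the-envelope helper. The figure of 5,000 draw calls used below is my own assumption standing in for “thousands”, and the function name is made up for illustration.

```cpp
#include <cassert>

// CPU time available per draw call, in microseconds, if the entire frame
// were spent submitting draws. Real games also run game logic, physics and
// so on, so treat this as an upper bound.
double microseconds_per_draw(double frames_per_second, int draw_calls) {
    assert(frames_per_second > 0.0 && draw_calls > 0);
    return 1000000.0 / frames_per_second / draw_calls;
}
```

With 5,000 draw calls, `microseconds_per_draw(30.0, 5000)` gives roughly 6.7 microseconds per call and `microseconds_per_draw(60.0, 5000)` roughly 3.3. That is not a lot of room for the driver to be working out state on every draw.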
DX12 Descriptor Heaps and Tables
From Microsoft’s own notes, it would appear that we’ll be seeing the introduction of bindless resources in DX12. They’re not calling them “bindless”, but with these changes D3D 12 effectively gains the ability to perform bindless operations. This will both expand the resources available to shader programs and allow outright dynamic indexing of resources.
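A loose way to picture this, going by the heading’s terms: the heap is one big array of resource descriptors, and a table is a window into that array. The sketch below is a toy model in plain C++ with hypothetical names (`DescriptorHeap`, `DescriptorTable`, `sample`), and textures are reduced to strings. The “shader” picks its resource by an index computed at run time rather than reading from a fixed bind slot.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of a descriptor heap: one big array of resource
// descriptors (here just texture names).
using DescriptorHeap = std::vector<std::string>;

// A descriptor table is a window (offset + count) into the heap.
struct DescriptorTable {
    std::size_t offset;
    std::size_t count;
};

// Dynamic indexing: the resource is chosen by a run-time index into the
// table, not by a slot fixed when the draw was set up.
const std::string& sample(const DescriptorHeap& heap,
                          const DescriptorTable& table,
                          std::size_t material_index) {
    assert(table.offset + table.count <= heap.size());
    assert(material_index < table.count);
    return heap[table.offset + material_index];
}
```

Given a heap of `{"ui_font", "grass", "rock", "sand", "water"}` and a terrain table of `{1, 3}`, index 0 resolves to `"grass"` and index 2 to `"sand"`. Switching materials becomes a change of index rather than a rebind.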
Conservative Rasterization
The purpose of conservative rasterization is to help with object culling and collision / hit detection. Nvidia have already published a large technical description of this in their GPU Gems series. “The solution to both of these problems can be seen as a modification of the polygon before the rasterization process. Overestimated conservative rasterization can be seen as the image-processing operation dilation of the polygon by the pixel cell. Similarly, underestimated conservative rasterization is the erosion of the polygon by the pixel cell. Therefore, we transform the rectangle-polygon overlap test of conservative rasterization into a point-in-region test.”
It’s worth noting a few other shiny things too, including compressed resources. JPEG was specifically mentioned, along with ASTC-LDR. ASTC is already available for OpenGL, so its inclusion isn’t all that thought-provoking. JPEG, however, did raise my eyebrow a little. JPEG compression isn’t a fixed ratio (anyone with a lick of Photoshop experience will tell you that). With GPUs loving predictable data so much, this is certainly a strange decision. We likely won’t see ‘current’ GPU architectures (such as GCN 1.1 or Kepler) take advantage of this, but Microsoft clearly have a little knowledge of what GPU vendors are working on. As much as I hate to say it, all we can do is watch this space for more information.
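The quote is easier to digest with a small example. The sketch below is my own illustration of the dilation / erosion idea for a single triangle, not code from Nvidia or Microsoft. It assumes unit-sized pixels and a counter-clockwise triangle, and names such as `covers` and `corner_slack` are made up. Each edge test is done at the pixel centre and then loosened (overestimate) or tightened (underestimate) by the furthest a pixel corner can sit from the centre.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Point { double x, y; };

// Edge function for the directed edge a->b evaluated at p. It is >= 0 when p
// lies on the inner side of a counter-clockwise triangle's edge.
double edge(Point a, Point b, Point p) {
    return (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
}

// Largest change in the edge function between a unit pixel's centre and any
// of its corners (each corner is 0.5 away from the centre in x and in y).
double corner_slack(Point a, Point b) {
    return 0.5 * (std::fabs(b.x - a.x) + std::fabs(b.y - a.y));
}

enum class Mode { Standard, Overestimate, Underestimate };

// Is the unit pixel at integer position (px, py) covered by the
// counter-clockwise triangle v0 v1 v2 under the given mode?
bool covers(Point v0, Point v1, Point v2, int px, int py, Mode mode) {
    assert(edge(v0, v1, v2) > 0.0);     // triangle must be counter-clockwise
    const Point c{px + 0.5, py + 0.5};  // pixel centre
    const Point v[3] = {v0, v1, v2};
    for (int i = 0; i < 3; ++i) {
        const Point a = v[i];
        const Point b = v[(i + 1) % 3];
        double bias = 0.0;
        if (mode == Mode::Overestimate) bias = corner_slack(a, b);    // dilate
        if (mode == Mode::Underestimate) bias = -corner_slack(a, b);  // erode
        if (edge(a, b, c) + bias < 0.0) return false;
    }
    if (mode == Mode::Overestimate) {
        // Dilated edges can overshoot near sharp corners, so also require
        // the pixel to touch the triangle's bounding box.
        const double min_x = std::min({v0.x, v1.x, v2.x});
        const double max_x = std::max({v0.x, v1.x, v2.x});
        const double min_y = std::min({v0.y, v1.y, v2.y});
        const double max_y = std::max({v0.y, v1.y, v2.y});
        if (px + 1 < min_x || px > max_x || py + 1 < min_y || py > max_y) {
            return false;
        }
    }
    return true;
}
```

Take the triangle (0, 0), (4.5, 0), (0, 4.5). Pixel (2, 2) is only clipped by the long edge, so its centre is outside: standard rasterization skips it, while the overestimated mode includes it. Pixel (1, 2) has its centre inside but one corner poking out, so the standard mode includes it and the underestimated mode rejects it. That is exactly the behaviour you want for culling and hit detection, where missing a partially covered pixel means missing an object.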