Improvements in performance alone aren’t enough from one generation to another – they need to be combined with jumps in efficiency too. Space on the die is always at a premium, and even with leaps in process technology, you can only do so much with a chip. The larger the chip, the more expensive it is to produce, and naturally other issues crop up too, such as the number of defective dies and actually cooling the darn thing.
Some of these problems can, naturally, be resolved – and we’ll go over some of those approaches in a different video – but, according to the rumors at least, Nvidia’s Ampere architecture will likely be the last produced by team green without some type of MCM design.
But yes, efficiency is really the art of optimizing not just the hardware, but the software too. There have been a ton of advancements in this field, including culling and hidden surface removal (which aim to only draw ‘stuff’ that would actually be visible on screen), data and color compression tech for shunting stuff around the GPU, and so on. But one particular piece of technology has really been pushed recently, and that’s VRS, or Variable Rate Shading.
Variable Rate Shading is one of those tech terms which… pretty accurately describes the underlying technology. It ‘shades’ (i.e., the process of the GPU calculating the color of the ‘stuff’ it draws) objects at varying rates, depending on several criteria – but all with the same underlying goal: reducing the workload on the GPU, so that what’s important gets the needed detail, while other areas are drawn at a reduced quality in a way which shouldn’t really impact the scene.
Think of it this way – in an FPS title, are you more concerned by things in the far distant background which are possibly in motion and blurry anyway (such as, say, cars or whatever), or are you instead concerned by the big monster that is trying to eat your face (and it’s a big monster, so it’s blocking most of the view into the distance anyway)?
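To make that concrete, here’s a minimal sketch of what per-draw VRS looks like through the DirectX 12 Ultimate API (which we’ll get to shortly). This is illustrative only – DrawMonster() and DrawBackground() are hypothetical stand-ins for a real engine’s draw calls:

```cpp
// Minimal sketch: per-draw (Tier 1) Variable Rate Shading in D3D12.
// `cmdList` must be an ID3D12GraphicsCommandList5 on VRS-capable hardware.
#include <d3d12.h>

void DrawMonster(ID3D12GraphicsCommandList5* cmdList);    // hypothetical
void DrawBackground(ID3D12GraphicsCommandList5* cmdList); // hypothetical

void RecordScene(ID3D12GraphicsCommandList5* cmdList)
{
    // Full 1x1 rate: one pixel shader invocation per pixel for the
    // monster that's filling most of the screen.
    cmdList->RSSetShadingRate(D3D12_SHADING_RATE_1X1, nullptr);
    DrawMonster(cmdList);

    // Coarse 2x2 rate: one invocation now covers a 2x2 pixel block,
    // roughly quartering shading work for blurry, distant scenery.
    // (4x4 also exists, behind an extra hardware capability flag.)
    cmdList->RSSetShadingRate(D3D12_SHADING_RATE_2X2, nullptr);
    DrawBackground(cmdList);
}
```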
To clear things up – this technology isn’t the same as, say, DLSS from Nvidia, which renders the whole scene at a lower resolution internally, then uses AI to upsample it. With DLSS 2, Nvidia ‘trains’ a neural network against high-resolution reference images, comparing lower-resolution versions upsampled in real time against this higher-resolution content. The neural network generates results hundreds, if not thousands, of times before eventually coming up with output which is close to (if not, in some cases, better than) a native image. This trained network is then able to run on your home GPU (running trained data / learnt behavior is known as inference), and it arrives as a small update bundled in with your GeForce graphics card drivers. I’ve gone over this previously in another video, which I’ll link below.
The next-generation consoles are pursuing upsampling technology too, though it’s code which runs instead on the CUs of the GPU, and from what we understand this will also be applicable to the next generation of AMD GPUs – though how its final form differs from Nvidia’s isn’t clear at the time I’m putting this together.
Either way, Nvidia first introduced VRS in a consumer-grade card with their Turing architecture, with the feature known as NAS (Nvidia Adaptive Shading). We saw Wolfenstein: Youngblood use both this feature and DLSS 2 (we’ll go over some results soon). However, this technology isn’t limited to Nvidia’s Turing (and future) architectures, and has been baked into DirectX 12 Ultimate. Note – DX12 isn’t the only API with access to this tech, either.
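To give a feel for how this surfaces to developers – and this is a sketch, not production code – a DX12 title can query which VRS tier the GPU exposes like so (the logging here is purely illustrative):

```cpp
// Sketch: querying Variable Rate Shading support through DX12 Ultimate.
// Assumes `device` is a valid ID3D12Device.
#include <d3d12.h>
#include <cstdio>

void QueryVrsSupport(ID3D12Device* device)
{
    D3D12_FEATURE_DATA_D3D12_OPTIONS6 options = {};
    if (SUCCEEDED(device->CheckFeatureSupport(
            D3D12_FEATURE_D3D12_OPTIONS6, &options, sizeof(options))))
    {
        switch (options.VariableShadingRateTier)
        {
        case D3D12_VARIABLE_SHADING_RATE_TIER_1:
            std::printf("Tier 1: per-draw shading rates\n");
            break;
        case D3D12_VARIABLE_SHADING_RATE_TIER_2:
            // Tier 2 adds screen-space shading rate images and
            // per-primitive rates, applied per tile of pixels.
            std::printf("Tier 2: tile size %ux%u\n",
                        options.ShadingRateImageTileSize,
                        options.ShadingRateImageTileSize);
            break;
        default:
            std::printf("VRS not supported\n");
            break;
        }
    }
}
```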
Industry standards are incredibly important, and other APIs can support VRS too. For example, the Khronos Group supports Variable Rate Shading, as well as other crucial DX12 U technology such as Mesh Shaders (which we’ll discuss further in another video). PlayStation also does NOT use DX12 (obviously), with Sony having designed its own PlayStation Shader Language (PSSL) for the PS4 – and from what I understand, the next-generation PlayStation takes a similar approach here.
The PS4 uses FreeBSD for its operating system, with two APIs – GNM and GNMX. The latter is the higher-level API, with more work and management (for example, memory management) taken care of by the API itself, and much of the inner workings of the GPU abstracted away. GNM is the lower-level variant, designed to let you code more directly to the hardware. Which route you choose is ‘down to you’ as a developer – one is easier to work with but has lower potential performance – and developers going for simpler games (such as 2D retro platformers, or even 3D experiences not needing to max out everything on the console) are happily able to choose GNMX.
From my digging, the PlayStation 5 continues to use FreeBSD, heavily customized as before. There are, however, still many of the same fundamentals in place, so developers should be up to speed reasonably quickly. As for Microsoft? I’m hearing good things about what they’ve done with their OS too, and of course it’s essentially built on a version of the Windows core. MS has baked DX12 U into the console, and from what I know, developers are pleased with the access they’ve got to the hardware. It seems the company has taken some large steps forward from the days of the original Xbox One. It also goes without saying that there’s the ability to code to the metal (with DX12 U) or take a higher-level approach here too, and just like PlayStation, it’s going to be down to the developer to choose which approach is best.
Either way, on top of VRS being supported by Turing, Intel now supports it (with Gen 11 and later of their iGPUs, and eventually their discrete GPUs too), and AMD supports it with RDNA 2 and later.
So – how much performance can you wrangle using Variable Rate Shading then? Well, as always, there’s no single figure. The PC implementation of VRS has several settings (which you can push towards quality or performance), and naturally, the harder you push performance, the higher the frame rate will crank up. With that said, FPS isn’t going to double here.
VRS works by targeting parts of the image which are blurred – say by depth of field or some type of motion blur – or which are in the background, or flagged in some other way that means the quality loss won’t really be perceptible – particularly if you’re in the middle of racing around a track, or trying not to get shot in the face.
Long story short – all of the pixels (and objects) in a scene aren’t equal in value, and therefore intelligently selecting which parts of the image get the most love is vital for the best performance, but also for visual quality. You don’t want to put as much work into shading a road sign far off in the distance (which is heavily blurred anyway) as you do into, say, the road right in front of you.
Microsoft describes VRS pretty well: “For each pixel in a screen, shaders are called to calculate the color this pixel should be. Shading rate refers to the resolution at which these shaders are called (which is different from the overall screen resolution). A higher shading rate means more visual fidelity, but more GPU cost; a lower shading rate means the opposite: lower visual fidelity that comes at a lower GPU cost.”
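To put some rough, illustrative numbers on that quote (this is back-of-the-envelope arithmetic, not a measured result): a 4K screen has around 8.3 million pixels, so shading everything at the full 1x1 rate means roughly 8.3 million pixel shader invocations per pass. If half the screen can drop to a 2x2 rate – one invocation covering four pixels – that half needs only about 1 million invocations, bringing the total to roughly 5.2 million, or around a 37% cut in shading work.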
With older hardware running a traditional pipeline, this cannot be done – the scene is essentially all shaded at the same rate, wasting performance resources which could instead go towards a better frame rate, or towards higher visual quality on the elements that matter.
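On Tier 2 hardware, that per-region selection is expressed as a small screen-space ‘shading rate image’ – one byte per tile of pixels – which the GPU consults as it rasterizes. Here’s a minimal sketch of building one on the CPU, with a deliberately crude, hypothetical importance test (a real engine would derive this from motion vectors, depth of field, and so on):

```cpp
// Sketch: building a Tier 2 shading rate image on the CPU.
// One R8_UINT texel per screen tile (the tile size comes from the
// ShadingRateImageTileSize cap); each texel holds a D3D12_SHADING_RATE.
#include <d3d12.h>
#include <cstdint>
#include <vector>

// Crude, hypothetical importance test: treat the outer band of the
// screen as low priority. A real engine would use blur/motion data.
static bool TileIsLowPriority(uint32_t x, uint32_t y,
                              uint32_t tilesX, uint32_t tilesY)
{
    return x < tilesX / 4 || x >= 3 * tilesX / 4 ||
           y < tilesY / 4 || y >= 3 * tilesY / 4;
}

std::vector<uint8_t> BuildShadingRateImage(uint32_t width, uint32_t height,
                                           uint32_t tileSize)
{
    const uint32_t tilesX = (width + tileSize - 1) / tileSize;
    const uint32_t tilesY = (height + tileSize - 1) / tileSize;
    std::vector<uint8_t> rates(tilesX * tilesY);

    for (uint32_t y = 0; y < tilesY; ++y)
        for (uint32_t x = 0; x < tilesX; ++x)
            rates[y * tilesX + x] = static_cast<uint8_t>(
                TileIsLowPriority(x, y, tilesX, tilesY)
                    ? D3D12_SHADING_RATE_2X2   // coarse: 1 invocation / 4 px
                    : D3D12_SHADING_RATE_1X1); // full detail

    // This buffer is then uploaded into a DXGI_FORMAT_R8_UINT texture and
    // bound via ID3D12GraphicsCommandList5::RSSetShadingRateImage().
    return rates;
}
```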
So then, now we know ‘what’ VRS is – what about visual quality? Is there much of a loss with VRS enabled?
Let’s start out with 3DMark first, since it has a built-in tool to test performance. There’s also a very handy feature with this benchmark: you can switch VRS on and off, flick between the different quality settings, and also hold the demo on a specific frame.
Well, what’s the answer then – how does VRS look? There’s definitely a loss of detail when VRS is enabled, even at the higher quality settings, but it’s much more noticeable on a static frame, where your eyes have the chance to process the drop in quality. We’re looking at VRS Tier 2 here, and when the highest performance mode is enabled, the drop in detail does start to take away from the image. Of course, this type of aggressive optimization isn’t something you would likely aim for in a AAA game – at least not for foreground objects which aren’t smothered in motion blur. But still, it does serve as a rather interesting proof of concept for how scalable VRS is.
How about performance with VRS in 3DMark? We’ll be using an RTX 2080 Ti and a Ryzen 7 3700X for our testing. I’d also like to spend a moment to thank everyone on Patreon for contributing, as your funds have directly helped us purchase this hardware – which means your support is what makes these videos possible. If you’d like to contribute, you can find our link in the description; alternatively, you can use our Amazon affiliate links – or even just subscribe and share the video, as it helps us out more than you could know.
Well, we see a 40 to 60 percent improvement with the built-in benchmark, and in raw frames per second the numbers are pretty huge, because the scenes aren’t super complex. I will say that with both Tier 1 and Tier 2 VRS, higher resolutions offer a bigger performance bump, which isn’t surprising: obviously there are more pixels to work on, and we’re also less likely to hit other bottlenecks in the system, such as the CPU. As you can see here, the scaling is very high – but once again, this is definitely a best-case scenario for a VRS-style workload.
Let’s switch to Gears Tactics, a new entry in the Gears universe. It launched recently with VRS support, and once again we test the game using a Ryzen 7 3700X and an RTX 2080 Ti – but we also throw an RTX 2060 Super into the mix. I’d like to thank MSI for chipping in here with the RTX 2060 Super, so go to their UK social media and give them thanks for me 😛
I’ve decided to give the GPUs some punishment for this test, running at 1080P, 1440P, and also 4K. The exception is the 2080 Ti, which I didn’t test at 1080P because the card is so fast there’s not much point. I also pushed all the settings to their highest (ULTRA), and made use of both the Planar and Glossy graphics options, which (in theory) means the GPUs get a serious workout.
Gears has three settings for VRS – disabled, on, and maximum performance. Once again, turning the setting to maximum performance is a lot more aggressive in cutting visual quality compared to just ‘on’, but depending on your hardware, it might be worth the trade-off.
As for performance, the 2060S gains about ten percent with VRS ‘on’ at 1440P, sitting you well within the 60 FPS comfort zone on average. At 1440P the 2080 Ti doesn’t really need VRS on, unless you really want high frame rates for a high refresh rate display. 4K can’t quite manage to hit 60 FPS, though bear in mind a few frames were CPU bound on the 3700X, and the 2080 Ti was also running at stock, so I’m sure people who go in and crank power limits and clocks would hit 60 with VRS set to performance. I decided not to do this though, as I wanted to represent an out-of-the-box experience here.
Finally – we have Wolfenstein: Youngblood, which also ships with DLSS 2 technology. I’ve tested DLSS 2 in a previous video, but long story short, DLSS 2 offers a pretty substantial performance advantage over VRS. You also cannot have both DLSS and VRS enabled, so for those wondering – at least in their current implementations, VRS (or Nvidia Adaptive Shading) and DLSS don’t stack.
I ran the built-in benchmark here, as the results are consistent. Also, as full disclosure, I took these results on a different system: an Intel i9-9900K and an ASRock motherboard (provided by the respective companies). We have the same GPUs, but Zotac has also provided a 2070 Super for the purposes of ray tracing analysis (coming soon) and the DLSS coverage (which, as I mentioned, is up).
So – visually, VRS looks really good in this game. A key benefit for Wolfenstein is that you’re running around a lot and it’s a fast-paced game, meaning you frequently don’t have time to really analyze background elements – especially as they’re blurred anyway. So visually, at least in my opinion, the difference isn’t perceptible during actual gameplay. Frame-rate wise, the benchmark isn’t the most punishing – but it does offer a rather nice improvement with VRS enabled.
There are several different presets for VRS here – and as usual, the higher you crank the settings, the more performance goes up. You can also manually tweak things should you so wish.
Closing things out – VRS also has a cousin of sorts (or actually very similar tech) known as Foveated Rendering. I’ve talked about this a lot previously, so I don’t want to focus too much on it (if you already know what Foveated Rendering is, that focus part was a joke).
Foveated Rendering is a technology designed for Virtual Reality, created to improve the performance of an application. VR apps are far more sensitive to latency and low frame rates – to keep things simple, you can feel like puking if frame rates are too low and jerky. Foveated Rendering tracks the fovea of your eye to work out what’s in focus and what’s in your peripheral vision. Or, to put it another way – it uses eye tracking to see what your eye is focused on, renders that part of the screen at a higher shading rate, and reduces the shading rate on the stuff in your peripheral vision.
Objects in peripheral vision are seen in less detail in real life too – feel free to check this yourself. Hold a phone in one hand, with a book to the left (about an equal distance from your face, but slightly apart). Focus on one of them (it doesn’t matter which), then, while your focus is locked, try to get your brain to make out content on the other. You’ll notice it’s extremely hard to do.
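Tying this back to the VRS plumbing from earlier – foveated rendering can reuse the same shading-rate-image idea, just swapping the importance test for ‘distance from the gaze point’. Here’s a sketch, where the eye-tracker sample and the distance thresholds are hypothetical values, not any vendor’s actual implementation:

```cpp
// Sketch: picking a shading rate per tile from gaze eccentricity.
// (gazeX, gazeY) is a hypothetical eye-tracker sample in tile coordinates;
// the distance cutoffs below are illustrative, not tuned values.
#include <d3d12.h>
#include <cmath>
#include <cstdint>

D3D12_SHADING_RATE RateForTile(uint32_t tileX, uint32_t tileY,
                               float gazeX, float gazeY)
{
    const float dist = std::hypot(tileX - gazeX, tileY - gazeY);

    if (dist < 8.0f)  return D3D12_SHADING_RATE_1X1; // fovea: full detail
    if (dist < 20.0f) return D3D12_SHADING_RATE_2X2; // near periphery
    return D3D12_SHADING_RATE_4X4;                   // far periphery
    // Note: 4x4 sits behind the AdditionalShadingRatesSupported cap.
}
```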
Foveated Rendering has been spotted in various Sony patents before (along with perhaps one of the most disturbing patent images ever… maybe I’ve just played too much Dead Space). FR isn’t unique to Sony, of course – Nvidia has been discussing it extensively too, as have many other companies, including Intel.
The bottom line here – it’s not just about the amount of performance your card can bring to the table, it’s also about putting that performance to maximum use. VRS in games will likely evolve over the next few years, and given that technology such as AI upsampling is also going to be so important, it’ll be very interesting to see how all of this develops.