Last year, AMD debuted its highly anticipated RDNA architecture, with the highest-end SKU being the Radeon RX 5700 XT, powered by the Navi 10 silicon. This GPU was impressive in its own right, with a die measuring a mere 251mm² and built on TSMC’s 7nm process. RDNA represented a significant design shift for AMD, fundamentally changing how the GPU functions, and as a result AMD claims up to a 50 percent improvement in performance for the same power consumption.
But despite all of this, the efficient nature of Nvidia’s architecture has largely kept GeForce ahead of AMD in gaming (we’ll stick to gaming for this discussion; I may discuss the data center later, as my source has asked me not to disclose some details for now). Nvidia learned these lessons well, and if you were a gamer back when Nvidia launched the GTX 480 and the other graphics cards in the Fermi lineup, you will remember this hasn’t always been the case.
Owners of the GeForce GTX 480 found out that yes, the card was fast. But that speed came with less than ideal noise levels, thanks to high power consumption and heat generation. And yet, you will recall that despite keeping to the same process size (40nm), Nvidia managed to turn Fermi around with a refresh.
The GeForce GTX 580 was an improvement over its predecessor in just about every single way, despite having a largely similar architecture. For those curious about Nvidia’s approach, AnandTech did a great breakdown of the differences. In short, Nvidia tweaked the actual design of the chip to improve performance in some situations and, yes, up went clock speeds.
But they also went back to the drawing board and started to tinker with the GPU at a transistor level, optimizing it by using fewer ‘leaky’ transistors and focusing on monitoring the GPU’s power consumption and temperatures, as well as beefing up the cooling solution on the card.
Although this GPU launched back in November of 2010, Nvidia has continued to evolve this approach across subsequent generations of GeForce cards and, despite the odd miss here or there, its architectures are in general extremely efficient. Fast forward to Pascal. One of the most famous slides for the Pascal architecture was “crafted for speed”.
Sure, Nvidia did a plethora of tinkering under the hood for Pascal, such as better data compression and pre-emption, and I’m not dismissing this. But the pure clock speeds of Pascal are what we’re largely focused on here. The GTX 980 and its 2048 Maxwell CUDA cores ran at 1217MHz, built on the 28nm process from TSMC.
With the GTX 1080, Nvidia cranked this up to 2560 Pascal CUDA cores boosting up to 1733MHz (and it could often go much higher in practice), albeit on the 16nm FinFET process from TSMC. Nvidia optimized their chips yet again, drastically improving the performance of the GPU and reworking the design of the chip. As a result: “Pascal – Crafted for Speed”.
Sure – the shrink of the process certainly was incredibly important in this achievement (indeed – you can see this in the slide above). But, a good portion of Nvidia’s accomplishments were also down to optimization at the silicon level.
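To put the Maxwell-to-Pascal numbers quoted above into perspective, here is a rough back-of-the-envelope sketch. It uses only the core counts and clocks mentioned in this article and assumes the usual rule of thumb that each CUDA core can retire one fused multiply-add (two floating-point operations) per clock. The function names are my own, and this is theoretical peak throughput, not measured gaming performance.

```python
# Back-of-the-envelope comparison using only the figures quoted in the article.
# Assumption: each CUDA core retires one fused multiply-add (2 FLOPs) per clock.
FLOPS_PER_CORE_PER_CLOCK = 2


def peak_tflops(cores: int, clock_mhz: float) -> float:
    """Theoretical peak FP32 throughput in TFLOPS."""
    return cores * clock_mhz * 1e6 * FLOPS_PER_CORE_PER_CLOCK / 1e12


def percent_gain(old: float, new: float) -> float:
    """Relative increase from old to new, in percent."""
    return (new / old - 1.0) * 100.0


# Core counts and clocks exactly as quoted in the text.
maxwell = {"cores": 2048, "clock_mhz": 1217}
pascal = {"cores": 2560, "clock_mhz": 1733}

core_gain = percent_gain(maxwell["cores"], pascal["cores"])
clock_gain = percent_gain(maxwell["clock_mhz"], pascal["clock_mhz"])
tflops_gain = percent_gain(peak_tflops(**maxwell), peak_tflops(**pascal))

print(f"Cores:     +{core_gain:.1f}%")
print(f"Clock:     +{clock_gain:.1f}%")
print(f"Peak FP32: +{tflops_gain:.1f}%")
```

On these figures, the clock speed increase (roughly 42 percent) contributes more to the theoretical uplift than the extra cores (25 percent), which is the point of the “crafted for speed” tagline.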
So what does any of this have to do with AMD? Well, according to one of my sources, the optimization work of Suzanne Plummer and her team over at the Radeon Technologies Group is now paying dividends, and the second generation of RDNA is a drastic improvement over the first because of it.
Back in 2018, Suzanne Plummer provided a few hints at this: “A lot of what we did in Zen was trying to push well beyond what we thought we could do,” says Plummer, “and I think that is something we’re trying to do in the graphics space as well to make a bigger leap forward.
“We’ve pulled in some of the expertise from the microprocessor cores team into the graphics team, kinda helping with our methodology, and improving our frequency and our performance and power. And just taking the best that we have already developed in-house and trying to make sure that we’re using the same improvements across the company.”
According to one of my sources, this optimization-focused design philosophy is now deeply ingrained in upcoming Radeon graphics products. Suzanne and her team essentially moved into RTG just after the completion of the Zen 1 design and have been hard at work tinkering with the silicon to optimize it.
Now, remember that designing a chip isn’t an overnight process, and depending upon the complexities and challenges of the design, 2 to 4 years isn’t an uncommon time period. So we’re now seeing her work begin to take shape.
Now, the first generation of RDNA did enjoy some of those performance improvements and optimizations – there’s an interesting slide which dates back to the launch of Navi 10. And you can see how AMD is claiming that we can see numerous enhancements thanks to the lessons learnt in Zen.
One thing my source can’t tell me yet is exactly what has changed in the optimization strategy from the first generation of RDNA to RDNA 2.
My source tells me that Renoir and its GPU are a hint of what we can expect: the designers took an established GPU architecture (Vega) but improved it significantly over Picasso. Robert Hallock from AMD recently pointed out how different the Vega CU is in Renoir, with a 59% performance uplift per CU versus the Vega CUs inside Picasso.
Just as Nvidia has a team dedicated to optimizing the silicon to ensure the best performance at a specific power consumption, AMD now has a similar team too. This isn’t something you can necessarily heavily automate, and it can be quite grueling work, but it will be of critical importance in facing Nvidia.
My source isn’t sure how much of the performance uplift from RDNA 2 comes from circuit optimization and how much comes from architecture. We do know that RDNA 2 features Hybrid Ray Tracing (and indeed, AMD itself has confirmed Ray Tracing will be in the next-generation cards).
We’ve also seen a hint of what these GPUs will be capable of – with a benchmark easily outperforming the RTX 2080 Ti. The heavily overclocked RTX 2080 Ti is beaten by about 17 percent, but ‘stock’ cards are being thrashed by up to 30 percent by AMD’s high-end Navi silicon, despite it being an Engineering Sample.
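As a quick sanity check on those two percentages, the sketch below works out what they imply about the overclocked RTX 2080 Ti relative to a stock card. It assumes both figures come from the same benchmark and takes the “up to 30 percent” figure at face value; the variable names are mine, and the benchmark itself is unconfirmed.

```python
# Figures as reported in the article; the benchmark itself is unconfirmed.
lead_over_overclocked = 0.17  # Navi sample vs. heavily overclocked RTX 2080 Ti
lead_over_stock = 0.30        # Navi sample vs. stock RTX 2080 Ti ("up to")

# Normalise a stock RTX 2080 Ti to a score of 1.0.
navi_score = 1.0 + lead_over_stock
overclocked_score = navi_score / (1.0 + lead_over_overclocked)

# How much faster the overclocked card would have to be than a stock one.
implied_overclock_gain = (overclocked_score - 1.0) * 100.0

print(f"Implied overclocked vs. stock RTX 2080 Ti: +{implied_overclock_gain:.1f}%")
```

In other words, the two figures are consistent with the overclocked card being roughly 11 percent faster than a stock one, which is a plausible gap for a heavy overclock.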
From what I am able to piece together now from multiple sources, the second generation of RDNA will feature improvements in performance because of the architecture, will have considerably more Compute Units compared to RDNA 1, and will also likely scale much better with clock frequency.
From what another source told me, we will likely not see these cards launch until Summer (although no specific date was provided), and they will be the RX 6000 series. This gives AMD a bit more time to bring up the drivers and the design, and who knows just what state the silicon we saw being benchmarked was in.
This would also tie into what we seem to be learning about the PlayStation 5 and its clocks. I’ve personally seen internal testing documentation of the PlayStation 5 running at 2GHz for ‘native mode’ (i.e., for PS5 software). Given that the power consumption of a 40 CU RX 5700 XT hits the low 200W mark when clocked to 2GHz, you can see how important this optimization will be.
Long story short – the next-generation RDNA from AMD (and other future Navi cards with the potential exception of say Navi 12) will all benefit from drastically improved efficiency – and with any luck, we will also see higher clock frequencies on the GPU and lower heat output too.
I recently covered that AMD is internally working on RDNA 3 already, which comes as little surprise given that they seem to be taking so many design influences from their CPUs. Multiple teams work on Zen cores and we know AMD is working on at the very least Zen 5 right now, using a leapfrog design approach. So team A is finishing Zen 3 right now but will be providing information to teams B and C who’re designing Zen 4 and 5.
This is drastically simplified of course, but long story short – it means that every generation improves over the previous one and continues to build on the lessons learned and difficulties encountered by the other teams.
So then – Navi 21 and Navi 23 (or as the source who tells me the cards will launch in Summer calls them, the Radeon RX 6000 series), will be the cards which benefit from the lessons of the first generation of Navi. They will also sport the knowledge learned from helping both Sony and Microsoft design their silicon too.
With reports that ‘Big Navi’ from AMD is over 500mm² and massively faster than the RX 5700 XT, it’s an exciting time to be a gamer. Will it be enough to take on the might of Nvidia’s RTX 30 series when they launch? We can only wait and see.
With news and confirmation from Lisa Su herself that we’ll see the high-performance Navi parts this year, it will be extremely interesting to watch.