On March 2nd of next year, it will have been three years since the first-gen Ryzen processors launched, and with that launch AMD significantly disrupted what consumers expect from mainstream processors. It’s easy to focus on the core count, which doubled that of Intel’s then-flagship offerings, but honestly, AMD brought a lot more to the table than extra threads.
The most obvious thing is competition, and competition is always a good thing. Frankly, I hope Intel can get back on track quickly, as that only benefits us as customers (more on that later on). The other big thing AMD provided is a commitment to a platform in the shape of AM4. If you own an X370 motherboard with a Ryzen 7 1700X, for example, you can quite happily drop in a Ryzen 7 2700X or a Ryzen 7 3700X (in other words, staying on eight cores), or go for gold and plop in the 16-core monstrosity known as the Ryzen 9 3950X.
AMD’s Ryzen 3000 series is very impressive, showing awesome performance.
This article isn’t meant to be a history lesson (that’s for another time), but I did want to set the stage a little. Now, in the closing weeks of 2019, we’re already seeing a lot of rumors about AMD’s Zen 3 architecture and the processor series built on it, such as Ryzen 4000. Our mission here is to dig into what’s been rumored, what AMD has officially announced and, of course, our own exclusive leaks, to figure out what we can expect from AMD’s next-gen architecture. We’ll focus primarily on the Ryzen 4000 series, though we will also touch on the server CPUs known as Milan, the next-generation Epyc processors that will replace the current Rome CPUs.
Let’s start with AMD’s own information on Zen 3, since it’s official. The first big piece was the now-infamous leak from a video that was accidentally made public on YouTube. It detailed some of the Zen 3 enhancements and was aimed squarely at folks who work for big data corporations. There are two highly interesting slides I’d like to bring to your attention, the first of which is a roadmap.
Looking at the roadmap below, you’ll notice “64/2x”, which refers to the number of physical cores on the processor and the threads per core, so we know that SMT-2 is confirmed. The roadmap also lists support for PCIe Gen 3 and Gen 4 and confirms that the power envelope is unchanged.
The next slide covers the architecture itself, providing a very high-level overview of a Zen 3 chiplet. Unfortunately, there’s not much we can learn from it beyond the Level 3 cache. You’ll notice a rather significant difference between Zen 2 and Zen 3 here, with what looks like the whole Level 3 cache being unified into one contiguous 32MB chunk. This raises a lot of questions about what’s happening with the chiplet and CCXs, as each Zen 2 chiplet is essentially made of two CCXs (each CCX contains 4 cores and 16MB of Level 3 cache).
You will also notice it reads “32+ MB L3”, which is potentially a hint that AMD could nudge the amount of Level 3 cache up. Don’t forget that the L3 cache takes up a significant amount of physical space on the current Zen 2 chiplets, with the L3 area being larger than the CPU cores themselves (we’ll talk more about that later).
AMD’s own employees have also stated that Zen 3’s performance will be in line with a new architecture rather than a small enhancement. TheStreet spoke with AMD’s Forrest Norrod and reported the following: “When asked about what kind of performance gain Milan’s CPU core microarchitecture, which is known as Zen 3, will deliver relative to the Zen 2 microarchitecture that Rome relies on in terms of instructions processed per CPU clock cycle (IPC), Norrod observed that – unlike Zen 2, which was more of an evolution of the Zen microarchitecture that powers first-gen Epyc CPUs – Zen 3 will be based on a completely new architecture.
“Norrod did qualify his remarks by pointing out that Zen 2 delivered a bigger IPC gain than what’s normal for an evolutionary upgrade – AMD has said it’s about 15% on average – since it implemented some ideas that AMD originally had for Zen but had to leave on the cutting board. However, he also asserted that Zen 3 will deliver performance gains ‘right in line with what you would expect from an entirely new architecture.’”
My sources and Other Leaks
I’ll say right off the bat that, as with all leaks, some or all of this information could turn out to be inaccurate. What’s very interesting to me, though, is that several sources have told me things which closely match what other, totally unrelated sources told me in private.
First of all, I was told that there is no change in core counts for either Epyc (Milan) or Ryzen 4000. Assuming the same naming scheme holds true for Ryzen 4000, we should see a Ryzen 9 4950X sporting 16 cores and 32 threads thanks to SMT. Several sources have told me this, and apparently it comes down to a number of factors. One is that, quite honestly, AMD doesn’t feel it needs to bump up core counts to counter Comet Lake or Rocket Lake (I’ll get to that more soon). Another is platform limitations: you need to physically feed the cores with data, so memory bandwidth is a big limitation, and potentially TDP and power too (though I was mostly told about bandwidth). This is said to be a big factor for Epyc as well.
One way around that with Epyc would be to add in HBM2 or something similar, and there have been some leaks claiming that Milan will feature up to 15 dies per chip, but from everything I’ve been told this isn’t the case.
Also, a new roadmap leaked which basically backs up most of the things in the innovator highlights roadmap above, but also mentions that Milan is an MCM with 9 dies. 8 of these are CPU chiplets (8 cores per chiplet for the 64 cores), and the ninth is the IO die. This same roadmap also seemingly provides further confirmation that a CCX is now 8 CPU cores, not 4.
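The die math in that roadmap is easy to sanity-check. Here is a minimal sketch using only the counts quoted above; the variable names are mine, not anything from AMD.

```python
# Counts as reported in the leaked roadmap discussed above.
CPU_CHIPLETS = 8        # compute dies per Milan package
CORES_PER_CHIPLET = 8   # one 8-core CCX per chiplet, per the leak
IO_DIES = 1             # the ninth die
SMT_WAYS = 2            # SMT-2: two threads per core

total_dies = CPU_CHIPLETS + IO_DIES
total_cores = CPU_CHIPLETS * CORES_PER_CHIPLET
total_threads = total_cores * SMT_WAYS

print(f"{total_dies} dies, {total_cores} cores, {total_threads} threads")
```

That lines up with the 64/2x figure on the official roadmap: 64 cores, two threads each.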
Regardless though, at least for Ryzen 4000, it’s almost certain that we will NOT see an increase in the number of cores, and AMD will essentially stick with the same formula they’ve got now.
So for example, a 4950X will be 16 cores, a 4900X will be 12 cores… and so on down the product stack.
To add to this, there’s a fascinating article on Tom’s Hardware in which Mark Papermaster goes into detail on several things, including core counts for future processors: “I don’t see in the mainstream space any imminent barrier, and here’s why: It’s just a catch-up time for software to leverage the multi-core approach,” Papermaster said. “But we’re over that hurdle, now more and more applications can take advantage of multi-core and multi-threading.[…]”
“In the near term, I don’t see a saturation point for cores. You have to be very thoughtful when you add cores because you don’t want to add it before the application can take advantage of it. As long as you keep that balance, I think we’ll continue to see that trend.”
Indeed, another source told me that while the core counts for Ryzen 4000 and its accompanying X670 platform will remain static (for the above reasons), Ryzen 5000 and Zen 4 based CPUs are a totally different matter. They will support new technology such as DDR5 on the AM5 socket (yes, that’s what I was told the new socket is called… hardly a surprise, but still), and will feature more processor cores, provided AMD believes that’s necessary to fight off Intel.
This is definitely going to be a marketing decision, similar to how the Ryzen 9 3950X was handled. I was told early on with my Ryzen 3000 leaks that one issue AMD had was how to market the 16-core CPU without totally destroying the need for the third-gen Threadrippers.
So what about clock speeds then?
TSMC will be providing AMD with its 7nm+ process, which is similar to 7nm but enhanced with EUV. The basic gist is that we can expect about a 10 – 20 percent improvement across a variety of criteria, including density and power consumption. So, that means 5GHz on all 16 cores, finally, right? The dream is alive! Well, probably not really.
There have been several reports that Zen 3 ES (Engineering Sample) chips are hitting about 100 – 200MHz higher clock speeds than the equivalent Zen 2 silicon, and several of my sources have said much the same as Zoo over at ChipHell. Indeed, a new source (who is very well connected) recently confirmed what my other sources had told me: about 100 – 200MHz over Zen 2. But this source also added some much-needed clarity.
This source told me that the ES running about 100 – 200MHz higher was actually Milan (the server processors) and NOT the chiplets designed for Ryzen 4000. If you’re wondering why this is important, it’s because of the frequency curve. The gist here is that the closer you get to the outer limits of what the silicon can run at, the more voltage you need to hold a stable frequency. I want to keep this relatively simple for this article as we have a lot more to get through, but it’s a lot easier to nudge clocks up near the middle or upper-middle of the curve than at its edge.
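To put some shape on that idea, here is a rough sketch. It assumes the common rule of thumb that dynamic power scales with voltage squared times frequency, and it uses a completely made-up voltage curve (hypothetical_voltage is my invention, not measured AMD data), so treat the output as an illustration of the trend only.

```python
def hypothetical_voltage(freq_ghz: float) -> float:
    """Invented voltage/frequency curve: gentle in the middle, steep near the top."""
    return 0.75 + 0.0015 * freq_ghz ** 4


def relative_power(freq_ghz: float) -> float:
    """Dynamic power is roughly proportional to V^2 * f (constants dropped)."""
    return hypothetical_voltage(freq_ghz) ** 2 * freq_ghz


def power_cost_percent(start_ghz: float, bump_ghz: float) -> float:
    """Extra dynamic power, in percent, needed to raise the clock by bump_ghz."""
    ratio = relative_power(start_ghz + bump_ghz) / relative_power(start_ghz)
    return (ratio - 1) * 100


# The same 200MHz bump at server-like and desktop-like clocks.
server_cost = power_cost_percent(3.0, 0.2)
desktop_cost = power_cost_percent(4.5, 0.2)

print(f"+200MHz from 3.0GHz: ~{server_cost:.0f}% more power")
print(f"+200MHz from 4.5GHz: ~{desktop_cost:.0f}% more power")
```

With this invented curve, the same 200MHz costs noticeably more power near the top of the range than in the middle. Real silicon will behave differently in the details, but the direction is the point.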
With Rome, the 64-core 7H12 has a base clock of 2.6GHz and a max boost of 3.3GHz. The 7302, meanwhile, is a 16-core SKU with a base of 3GHz that boosts up to 3.3GHz. Of course, these SKUs run at much lower clock speeds than a Ryzen 9 3950X, which has a 3.5GHz base clock and a 4.7GHz boost clock.
So, Milan is receiving about a 100 – 200MHz increase in clock speed, but Ryzen likely won’t get that. I was told it’s not impossible to see a clock speed bump for Ryzen 4000, but we’re likely going to see SKUs stay at roughly the same clock frequency, or maybe 50 – 100MHz faster than their Ryzen 3000 counterparts.
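To see what those megahertz figures mean in relative terms, here is a quick sketch using the boost clocks quoted above. Applying the rumored gains to the boost clock is my assumption; the leaks don’t say which clock the numbers refer to.

```python
def uplift_percent(base_ghz: float, gain_mhz: float) -> float:
    """Relative clock gain, in percent, from adding gain_mhz to base_ghz."""
    return gain_mhz / (base_ghz * 1000) * 100


EPYC_BOOST_GHZ = 3.3    # 7H12 / 7302 max boost, quoted above
RYZEN_BOOST_GHZ = 4.7   # Ryzen 9 3950X boost, quoted above

# Rumored gains: 100-200MHz for Milan, 50-100MHz at best for Ryzen 4000.
milan_range = [uplift_percent(EPYC_BOOST_GHZ, mhz) for mhz in (100, 200)]
ryzen_range = [uplift_percent(RYZEN_BOOST_GHZ, mhz) for mhz in (50, 100)]

print(f"Milan: {milan_range[0]:.1f}% to {milan_range[1]:.1f}%")
print(f"Ryzen: {ryzen_range[0]:.1f}% to {ryzen_range[1]:.1f}%")
```

Under that assumption, Milan gains roughly 3 to 6 percent from clocks, while Ryzen 4000 gains roughly 1 to 2 percent.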
What about IPC?
While 5GHz is important because it’s a great PR number (it sounds good to say 5GHz 16-core, let’s just be honest), IPC is just as important as the clock speed. Instructions Per Clock is very difficult to measure because different applications stress different parts of the processor. One application might be heavy on integer instructions, while another relies heavily on AVX instructions or basic floating-point. Some are less latency sensitive, others really love lots of processor cores, and so on.
With all this said, several sources have given me information regarding IPC. The first I heard about this was in early October of this year, when a source backed up the information on the Chinese forum ChipHell, confirmed 100 – 200MHz higher clock speeds, and also told me that the average IPC gains for Zen 3 were “greater than 8 percent”.
All went quiet until BitsnChips provided me insight that the Zen 3 performance increase was closer to 15 – 17 percent on average, and also that the Floating Point performance was significantly faster (I’ll get more into that soon). But then, most recently two new sources provided clarification to the IPC gains.
The basic message is that the integer performance of Zen 3 was on average about 10 – 12 percent faster than Zen 2, but FP performance was up to 50 percent faster. On the ‘average’ workload though, this would translate into about a 17 percent increase in performance compared to Zen 2. In theory, this would mean that if you compare Zen 3 to the original Zen architecture, you would see about a 30 – 40 percent IPC gain, which is, quite frankly, incredibly impressive.
Unfortunately, exactly what has changed in the Zen 3 processors is still a bit of a mystery. According to BitsnChips, we’re looking at a 40 percent increase in Level 1 cache bandwidth on Zen 3. He also told me we’d possibly see Level 3 cache size increase by ‘at least’ 50 percent. Another source has told me they’ve heard similar, and I was also told that Zen 3 is significantly improved in terms of latency across the chip, possibly with improvements to the IO die as well, but unfortunately I don’t have solid info on what those are.
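Those numbers can be checked against each other with some quick arithmetic. This is a rough sketch that treats the ‘average’ as a simple linear blend of the integer and FP gains, which is a simplification on my part, and it chains the roughly 15 percent Zen 2 figure quoted earlier with the rumored 17 percent for Zen 3.

```python
INT_GAIN = 0.11    # midpoint of the rumored 10-12 percent integer gain
FP_GAIN = 0.50     # rumored upper bound for floating point
AVG_GAIN = 0.17    # rumored average gain over Zen 2
ZEN2_GAIN = 0.15   # AMD's stated average IPC gain for Zen 2

# Solve INT_GAIN * (1 - w) + FP_GAIN * w = AVG_GAIN for the FP share w.
fp_weight = (AVG_GAIN - INT_GAIN) / (FP_GAIN - INT_GAIN)

# Generational gains multiply rather than add.
zen_to_zen3 = (1 + ZEN2_GAIN) * (1 + AVG_GAIN)

print(f"Implied FP share of the average workload: {fp_weight * 100:.0f}%")
print(f"Compounded gain over original Zen: {(zen_to_zen3 - 1) * 100:.0f}%")
```

So a workload mix that is about 15 percent floating point gets you to the 17 percent average, and compounding the two generations gives roughly 35 percent over the original Zen, which sits inside the 30 – 40 percent range.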
So here’s a summary of Zen 3:
Zen 3 is backward compatible with the current platforms, so it still uses DDR4 / PCIe 4 and retains the same core counts as the previous generation
IPC gains are 10 – 12 percent for Int and 50 percent for FP, average IPC is 17 percent.
Clock speeds – likely 50 – 100 MHz for Ryzen at best, though Epyc (Milan) might be about 200MHz faster.
Potentially higher sustained boost clocks when more cores are under load
Level 3 cache is unified for Zen 3 CCX, and potentially much larger (50%?). L1 cache is 40 percent greater bandwidth compared to Zen 2.
TDP is similar to the current generation of products.
Retains SMT-2 support.
General architectural improvements (many haven’t been leaked or disclosed yet)
256-bit AVX support, and not AVX-512.
Platform is X670 (details of the platform are sketchy right now)
Release date is Q4 2020
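Putting the IPC and clock speed rumors from the summary together gives a rough idea of the single-threaded uplift. This is a first-order sketch that assumes performance scales with IPC times frequency and uses the Ryzen 9 3950X boost clock as the baseline. Both are my assumptions, and real workloads also depend on memory, cache and boost behavior.

```python
IPC_GAIN = 0.17            # rumored average IPC uplift over Zen 2
BASELINE_BOOST_GHZ = 4.7   # Ryzen 9 3950X boost clock, used as the baseline


def estimated_uplift_percent(clock_bump_mhz: float) -> float:
    """First-order estimate: performance ~ IPC x frequency."""
    baseline_mhz = BASELINE_BOOST_GHZ * 1000
    clock_ratio = (baseline_mhz + clock_bump_mhz) / baseline_mhz
    return ((1 + IPC_GAIN) * clock_ratio - 1) * 100


# No bump, then the rumored 50MHz and 100MHz best cases for Ryzen 4000.
for bump in (0, 50, 100):
    print(f"+{bump}MHz: ~{estimated_uplift_percent(bump):.1f}% faster")
```

On these assumptions the estimate lands at roughly 17 to 19 percent, with nearly all of it coming from IPC rather than clock speed.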
The Ryzen 5000 series can certainly increase core count (as we discussed above), and will also sport shiny new technology such as DDR5. But one big rumor that’s persisted since even before Zen 2 launched is that SMT-4 will be included in a future Zen architecture. Indeed, there were several reports that it was a shoo-in for Zen 3, and of course we now know this just isn’t the case. So, does that mean it’s never going to happen then?
Mark Papermaster (in the same interview with Tom’s Hardware) did speak about SMT-4 briefly, but his answers were pretty vague.
“In general, you have to look at simultaneous multi-threading (SMT): There are applications that can benefit from it, and there are applications that can’t. Just look at the PC space today, many people actually don’t enable SMT, many people do. SMT4, clearly there are some workloads that benefit from it, but there are many others that it wouldn’t even be deployed. It’s been around in the industry for a while, so it’s not a new technology concept at all. It’s been deployed in servers; certain server vendors have had this for some time, really it’s just a matter of when certain workloads can take advantage of it.”
As for what my sources have told me – I’ve had two sources confirm AMD is looking ‘into’ SMT-4, but I haven’t been given any specific information about AMD’s conclusions or which (if any) version of Zen will include it. SMT-4 isn’t an automatic win for all applications (as Papermaster pointed out above), and the architecture itself would need sufficient amounts of cache and other resources to make it viable.
For gamers, and the average home user, SMT-4 isn’t really going to be of any benefit at all. I suspect games will become more CPU dependent in a few years (because of the next-generation consoles; keep an eye out for an upcoming analysis on them). But in the here and now, as we have shown in our recent core count and API testing article and video, most AAA games don’t do an amazing job of leveraging a ton of threads, even with Vulkan or DX12.
Indeed, going by the comments from AMD and from my sources, if we do see it, I would say at best it’ll be for Threadripper, though more realistically it’ll remain in the realms of high-end workstations and servers with specific Epyc chips. It would be yet another dimension along which AMD can differentiate its Epyc and Threadripper product stacks. This isn’t to say it’ll happen, and it certainly isn’t confirmed for Zen 3 or Zen 4 from what I know.