There are rumors floating about that AMD are planning to release an APU containing no fewer than 16 Zen CPU cores, 16 GB of HBM and a built-in Greenland GCN-based GPU.
This supposedly leaked slide focuses on the highest-end part, which is targeted at both servers and High Performance PC systems – think render farms or scientific computing. If it’s true, it highlights interesting possibilities for both Zen CPUs and the future of AMD’s APUs.
Let’s focus on just the CPU side (Zen) of the supposed APU for a moment. We certainly don’t have a great number of details on the Zen architecture, but from what we know (from a combination of official murmurings, leaks and rumors) the CPUs are a nice step up in performance from AMD’s current x86 architecture, and with any luck will be more competitive with Intel, particularly on a performance basis.
Zen isn’t launching until next year (2016), so many details are a bit sketchy, but it seems pretty certain that AMD will produce the CPU on a 14nm process, and that it will naturally feature full DDR4 support and a power rating of up to 95W.
The biggest departure of the Zen architecture appears to be how it handles multi-threading, switching from Clustered Multithreading to Simultaneous Multithreading – much closer to how Intel does things.
The current CPU range from AMD, Bulldozer, uses CMT (Clustered Multithreading), and at the time AMD’s thought process was that it would provide two integer pipelines but reduce both die space and power consumption. Unfortunately, for a number of reasons, things didn’t work out quite how AMD had planned, and instead this reduced performance by up to about 20% compared to a standard configuration. According to the slide, 16 Zen cores can run 32 threads (so that’s two threads per core).
Judging by the slide, AMD are going the modular route with Zen, with each Zen module comprising four CPU cores, 2MB of level 2 cache and a smattering of L3 cache too. This is reminiscent of how, say, the GCN architecture works, with each GPU containing several Compute Units (say 20, or 32 units), and if you were to break down each CU, you’d find 64 shaders (ALUs), which are grouped into 4 SIMDs along with caches and other components.
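To make the GCN comparison concrete, here is a minimal sketch of the shader arithmetic described above. The 64 ALUs and 4 SIMDs per CU come from the text; the function name and the 20 and 32 CU examples are only illustrative and say nothing about the rumored APU's actual GPU configuration.

```python
# Figures from the paragraph above: each GCN Compute Unit (CU) holds
# 64 shaders (ALUs), grouped into 4 SIMDs.
ALUS_PER_CU = 64
SIMDS_PER_CU = 4

# Lanes per SIMD follows directly from the two figures above.
LANES_PER_SIMD = ALUS_PER_CU // SIMDS_PER_CU


def gcn_shader_count(compute_units: int) -> int:
    """Total shader (ALU) count for a GPU with the given number of CUs."""
    return compute_units * ALUS_PER_CU


# Illustrative CU counts only (the "say 20, or 32 units" from the text).
for cus in (20, 32):
    print(f"{cus} CUs -> {gcn_shader_count(cus)} shaders")
print(f"Each SIMD is {LANES_PER_SIMD} lanes wide")
```

The same "small block, repeated many times" idea is what the slide suggests for Zen, just with four-core modules instead of CUs.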
With this particular Zen configuration, you’re looking at 512KB of L2 cache per core (8MB total) and an additional 32MB of level 3 cache which is shared between the CPU cores.
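As a quick sanity check on those numbers, the sketch below derives the module count, thread count and cache totals from the per-module figures on the rumored slide. The class and field names are hypothetical, and every value is simply the rumor as reported, not a confirmed specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RumoredZenConfig:
    """Hypothetical container for the figures quoted from the leaked slide."""
    cores: int = 16
    cores_per_module: int = 4
    l2_per_module_kb: int = 2048  # 2MB of L2 per four-core module
    l3_total_mb: int = 32         # shared L3
    threads_per_core: int = 2     # SMT: two threads per core

    @property
    def modules(self) -> int:
        return self.cores // self.cores_per_module

    @property
    def threads(self) -> int:
        return self.cores * self.threads_per_core

    @property
    def l2_per_core_kb(self) -> int:
        return self.l2_per_module_kb // self.cores_per_module

    @property
    def l2_total_mb(self) -> int:
        return self.modules * self.l2_per_module_kb // 1024


cfg = RumoredZenConfig()
print(f"{cfg.modules} modules, {cfg.threads} threads")
print(f"L2: {cfg.l2_per_core_kb}KB per core, {cfg.l2_total_mb}MB total")
print(f"L3: {cfg.l3_total_mb}MB shared")
```

The per-module and per-core figures agree with each other: four modules of 2MB each give the same 8MB as sixteen cores at 512KB each.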
Taking a broader view of the rest of the APU, it will feature four-channel DDR4 support and PCIe 3.0, and the GPU is paired with High Bandwidth Memory. This HBM provides a rather blistering 512GB/s of bandwidth.
Supposedly, the CPU cores speak to the GPU using a newfangled communications channel known as Coherent Fabric, which removes the latency associated with PCIe. Basically, the APU is going to be pushing towards a high performance HSA part. We’ve precious few details on exactly how Coherent Fabric operates, but I do wonder if it’s a little closer to how, say, ARM handles things in terms of cache and general coherency.
There’s not a ton of information on the actual GPU, such as the number of shaders (ALUs) that it supports, and it’s probable that we’ll not be seeing the same number that’s featured in a high-end FirePro (or the desktop products)… but that’s a guess. The GPU will support double precision compute, but at 1/2 rate. It’ll also be based on the next generation Greenland architecture.
As we mentioned above, the part supposedly fully supports DDR4, and the quad-channel memory controller allows it to handle 256 GB of memory per channel, meaning a total of 1024 GB of DDR4 RAM running at a speed of 3200MHz. It goes without saying that the average Photoshop user won’t need anywhere near that much RAM, but high-end servers and scientific computing tasks gobble up memory like crazy.
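For a rough sense of scale, the sketch below works through the capacity figure and estimates peak DDR4 bandwidth next to the 512GB/s quoted for the HBM. Two assumptions are mine rather than the slide's: that "3200MHz" means DDR4-3200 (3200 mega-transfers per second), and that each channel is a standard 64-bit (8 byte) DDR4 channel. The function names are purely illustrative.

```python
# Figures quoted from the rumored slide.
CHANNELS = 4
GB_PER_CHANNEL = 256
HBM_BANDWIDTH_GB_S = 512

# Assumptions (not stated on the slide): "3200MHz" is read as DDR4-3200,
# i.e. 3200 MT/s, over a standard 64-bit (8 byte) wide channel.
TRANSFER_RATE_MT_S = 3200
BUS_WIDTH_BYTES = 8


def total_capacity_gb(channels: int, gb_per_channel: int) -> int:
    """Maximum addressable DDR4 across all channels."""
    return channels * gb_per_channel


def peak_bandwidth_gb_s(channels: int, mt_per_s: int, bus_bytes: int) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 1000 MB here)."""
    return channels * mt_per_s * bus_bytes / 1000


capacity = total_capacity_gb(CHANNELS, GB_PER_CHANNEL)
ddr4_bw = peak_bandwidth_gb_s(CHANNELS, TRANSFER_RATE_MT_S, BUS_WIDTH_BYTES)

print(f"DDR4 capacity: {capacity} GB")
print(f"DDR4 peak bandwidth: {ddr4_bw:.1f} GB/s")
print(f"HBM is roughly {HBM_BANDWIDTH_GB_S / ddr4_bw:.1f}x faster")
```

Under those assumptions the DDR4 pool is vastly larger but several times slower than the HBM, which fits the idea of HBM feeding the GPU while DDR4 serves as bulk system memory.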
There is some skepticism about whether AMD could actually even produce this product, particularly given that their supposed time frame isn’t 3 years from now, but a year to a year and a half. Sure, we know that AMD are soon to be releasing the R9 300 series, and its top of the line model (the R9 390X) will feature HBM, but rumors suggest that’ll be between 4 and 8 GB, not 16… and that’s not including the hefty amount of space for the CPUs and other parts of the APU.
Nothing the slide shows is impossible – but it would also be a very impressive feat of engineering. Not only would there be a lot of parts which could potentially go ‘wrong’ (in other words, potential for a lot of dud silicon) but the TDP requirements also seem a little… lower than expected. Of course, the faulty bits of silicon aren’t really a problem – that’s where speed and part binning come in. As for the power, it would really depend on how power hungry both Zen and the new GCN architecture end up being.
For the sake of argument – let’s pretend that it’s all real – what does it mean for gamers? Well, it’s just a sign of the direction AMD want to go. They want to provide powerful APUs which are capable of playing games, but also with the correct feature set (a good number of PCIe lanes, for instance) to ensure that you can plug in a discrete GPU without any trouble.
DDR4 is already here with some Intel platforms (the 5xxx series). It’s not quite a standard yet, but that’ll change in the next few years. DDR3 simply doesn’t provide the necessary bandwidth to feed the CPU cores of tomorrow. It’ll be even more interesting when DX12 and Vulkan become a mainstay of PC gaming. Currently DX11 runs in a largely serial manner, meaning the work isn’t spread evenly across the CPU cores (you’ll see a few cores at high usage while the rest sit mostly idle). Current testing shows that in many games, memory speed doesn’t make a large difference – but with DX12 optimized games, this could change considerably.
Thanks Fud