If you’re a technology lover, it’s hard not to be interested in Zen, AMD’s upcoming next-generation processor architecture, which marks a radical departure from the company’s older processor designs found in the current Excavator series. As someone who has been following the PC industry since the mid-90s (yeah, I was pretty young!), I think Zen is one of AMD’s most promising products, and if it delivers on the company’s goals, it could well be the most important CPU for the company since the introduction of the Athlon 64.
Mark Papermaster, AMD’s Chief Technology Officer (CTO), recently gave an interview elaborating on the design goals of Zen, how it differs from the company’s current processor lineup, and what we can expect from the processor.
“We just designed a brand new CPU core, Zen, from the ground up,” said Papermaster. “We actually started this effort in late 2012, so we’ve been working on it for four years. It takes four years to get a brand new x86, high-performance CPU done. We are right on track. It’s a very modern core and very efficient in terms of driving that performance per watt of energy, and it’s very scalable. We also designed it to work very well with accelerators, like our GPUs. You can add more CPUs if you need to get more work done, and you can connect to GPUs, FPGAs, or other accelerators.”
We’ve known for some time that scalability is one of AMD’s founding ideas for Zen, meaning we can expect a large number of cores for certain usage scenarios (such as servers), or fewer cores operating at a lower clock speed for power-sensitive tasks (such as ultra-thin laptops and similar devices). We’ve discussed Zen’s basic processor architecture at length in a two-part analysis (part 1 | part 2), including how AMD have created Zen as a modular design.
At a basic level, four Zen processor cores (each capable of handling two threads thanks to Simultaneous Multithreading, or SMT) are ‘strapped’ together into a single CPU Complex (CCX), along with a bunch of level 3 cache. Mr. Papermaster is therefore highlighting how easily AMD can add additional CCXs to the design, and combine them with a bunch of GCN (Graphics Core Next) cores to create powerful APU solutions.
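To make the modular idea concrete, here is a minimal Python sketch of how core and thread counts scale as CCXs are added. The four-cores-per-CCX and two-threads-per-core figures come from the description above; the `ZenConfig` class and the CCX counts in the loop are purely illustrative assumptions, not announced products.

```python
from dataclasses import dataclass

CORES_PER_CCX = 4     # from the article: four Zen cores per CPU Complex
THREADS_PER_CORE = 2  # from the article: two threads per core via SMT


@dataclass(frozen=True)
class ZenConfig:
    """Hypothetical chip description: only the CCX count varies."""
    ccx_count: int

    @property
    def cores(self) -> int:
        return self.ccx_count * CORES_PER_CCX

    @property
    def threads(self) -> int:
        return self.cores * THREADS_PER_CORE


# Illustrative CCX counts only; not a list of real SKUs.
for n in (1, 2, 4):
    cfg = ZenConfig(ccx_count=n)
    print(f"{n} CCX -> {cfg.cores} cores / {cfg.threads} threads")
```

The point of the sketch is simply that the CCX is the unit of scaling: doubling the CCX count doubles both cores and threads without touching the core design itself.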
“Design is microarchitecture, attacking every element of the execution units, of the cache subsystem, of the scheduling, every aspect to ensure you are removing bottlenecks,” he continued in his interview with SemiEngineering. “We’ve leveraged the new 14nm finFET technology. The scalability you have with finFETs is really quite a large range because it has very little leakage. When you turn off your clocks, when you are not doing active work, you can get very close to nil energy, and leakage is lower than previous technologies. Yet as you turn on your clocks and accelerate your workloads, you get very fast performance per watt.”
The above graphic highlights how Mark’s team has managed some of this (for more on power consumption, check out the first part of our Zen analysis). For example, clock gating allows the processor to quite literally turn off sections of itself when they’re not doing work, thus reducing energy consumption and heat output. Because processor loads generally go up and down all the time (it’s rare you’ll find a CPU at 100 percent usage across all cores), this can be quite a big saving on its own. The micro-op cache stores decoded processor instructions, so the CPU doesn’t need to fetch and decode the same instruction over and over again (assuming the decoded form is still in the cache).
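The micro-op cache idea can be sketched in a few lines of Python. This is a toy model only: the `MicroOpCache` class, its capacity, the least-recently-used eviction policy and the stand-in `decode` function are all assumptions made for illustration, and say nothing about how Zen’s real hardware is organised.

```python
from collections import OrderedDict


class MicroOpCache:
    """Toy cache mapping an instruction address to its decoded micro-ops.

    Eviction is least-recently-used, chosen here only for simplicity.
    """

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def fetch(self, address, decode):
        if address in self.entries:
            # Hit: reuse the decoded form and skip the decoder entirely.
            self.hits += 1
            self.entries.move_to_end(address)
            return self.entries[address]
        # Miss: pay for a decode, then remember the result.
        self.misses += 1
        uops = decode(address)
        self.entries[address] = uops
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the oldest entry
        return uops


def decode(address):
    # Stand-in for the expensive fetch-and-decode step (hypothetical).
    return (f"uop_{address}_a", f"uop_{address}_b")


cache = MicroOpCache(capacity=4)
loop_body = [0x10, 0x14, 0x18]  # a small loop that fits in the cache
for _ in range(100):
    for addr in loop_body:
        cache.fetch(addr, decode)

print(f"hits={cache.hits} misses={cache.misses}")
```

In this run only the first pass through the loop needs the decoder; every later iteration is served from the cache, which is exactly the kind of repeated work the paragraph above describes avoiding.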
Mark then touched on the processor’s bandwidth and caching systems: “You have to look at the demand internally of all of your execution units. You have to look at the amount of bandwidth that you need and how you optimize bandwidth and latency. How big is your pipe feeding those engines? How fast can you move data in and out of those engines? That was the core principle behind the Zen CPU design… You need enough bandwidth and pipes to optimize your latency to ensure you don’t create bottlenecks.”
“We looked at what we could do to speed up both, ensuring no bottlenecks in terms of the execution flow. We’ve improved the micro-op cache, the efficiency of getting those instructions into the pipe. We’ve also made a number of efficiencies in terms of reducing the number of cycles executing through our execution units. In terms of memory and feeding it, we’ve optimized our cache subsystem.”
For Zen, AMD have opted for 512KB of private level 2 cache per core, plus a level 3 cache which is shared between the cores. Essentially, Core 1 could peek into the level 3 data of Core 2 to save long trips around the memory system. Coupled with a generous number of entries on both the integer and floating point sides, the processor should (in theory) be able to keep a steady flow of data from the main system RAM (which, as you probably know, is DDR4) to the various caches and, of course, the processor’s execution cores. Zen also features double the bandwidth (compared to Excavator) on the level 1 and level 2 caches, and up to 5x the bandwidth for level 3.
AMD once again touted the 40 percent IPC gains with Zen, with Mark Papermaster quoted as saying: “When Zen comes out in early 2017, it is going to have a 40% improvement. The only way you can get that is to use a combination of every aspect of the design, of feeding the engine, of optimizing the engine itself and improving the throughput to the engine. Those are the three key elements in terms of how you get improvements. Anyone who has been around microprocessors design for a while will say it is not rocket science. They’re right, but those are the levers. It’s about breaking it down into dozens and dozens of specific changes you drive into a design.”
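For a rough sense of what a 40 percent IPC gain means in practice, per-core throughput can be approximated as IPC multiplied by clock speed. The short Python sketch below works through that arithmetic. The `relative_performance` helper and the clock ratios are hypothetical, since final clock speeds haven’t been announced, and the result is relative to whatever baseline the 40 percent claim is measured against.

```python
def relative_performance(ipc_gain: float, clock_ratio: float = 1.0) -> float:
    """Approximate per-core speedup as (1 + IPC gain) * (new clock / old clock).

    A simplification: it ignores memory, core counts and workload effects.
    """
    return (1.0 + ipc_gain) * clock_ratio


# The 0.40 figure is AMD's claim; the clock ratios are illustrative guesses.
for clock_ratio in (0.9, 1.0, 1.1):
    speedup = relative_performance(0.40, clock_ratio)
    print(f"clock x{clock_ratio:.1f} -> about {speedup:.2f}x per-core throughput")
```

The takeaway is that the IPC figure alone doesn’t settle real-world performance: the same 40 percent gain looks quite different depending on where the clocks end up.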
While there’ll be Zen processors for servers and other devices, for the ‘average’ user the most exciting CPUs will be the Summit Ridge parts on the AM4 platform. AM4 desktop motherboards have been designed with up-to-date features such as PCIe 3.0 x16 (up to two slots if you’re using the higher-end X370 platform), USB 3.1 and NVMe.
Naturally, much about Zen’s performance is still unknown, such as clock speeds, and of course pricing is unconfirmed too. But assuming AMD can nail good clocks and yields with the 14nm FinFET process, and do so at a good end price for customers, it’ll be an excellent solution for gamers who want a break from Intel.
Summit Ridge will be released early next year. While AM4 motherboards are currently ‘in the wild’, they’re officially only available to OEMs. Naturally, by the time Zen is out we’ll see a good selection of motherboards catering to overclockers and power users, or folks who want to build SFF (Small Form Factor) systems.
As usual, stick with RedGamingTech for the latest information and insights.
For the rest of the interview, check out semiengineering.com/