It’s finally happened: AMD has launched the second generation Epyc processors, known as Rome. A month after their Ryzen desktop counterparts, AMD has unleashed the Zen 2 equipped processors on the data center.
The chips are known as the 7xx2 series. They are the world’s first 7nm server CPUs, and they demolish the first generation of Epyc processors (known as Naples) in both value and performance.
According to AMD’s Scott Aylor, “…when we take and bend the curve on performance, generation on generation, in terms of doubling performance, and having that level of lead over and above our competitor, you can imagine why customers and partners are super excited.”
The processors are built using nine chiplets, with 8 of those being the CPU chiplets. These chiplets are referred to as CCDs and each contains 8 cores, split into two CCXs (each CCX naturally houses 4 cores and 16MB of L3 cache).
The last chiplet is the IO die, which is manufactured on a 14nm process. This die is more feature-packed than the 12nm die found on desktop Ryzen processors.
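The chiplet layout described above can be tallied in a few lines. This is only a rough sketch of the arithmetic, using the per-CCX figures quoted in the text; the constant and function names are made up for illustration, and the two threads per core is taken from the 64 core / 128 thread figure of the top-end part.

```python
# Per-socket totals for a fully populated Rome package, using the
# figures quoted in the article. All names here are illustrative.
CCDS_PER_SOCKET = 8    # CPU chiplets (the ninth chiplet is the IO die)
CCXS_PER_CCD = 2       # each CCD is split into two CCXs
CORES_PER_CCX = 4
L3_MB_PER_CCX = 16
THREADS_PER_CORE = 2   # assumption: SMT enabled, as on the 64C/128T part


def socket_totals():
    """Return the CCX, core, thread and L3 totals for one socket."""
    ccxs = CCDS_PER_SOCKET * CCXS_PER_CCD
    cores = ccxs * CORES_PER_CCX
    return {
        "ccxs": ccxs,
        "cores": cores,
        "threads": cores * THREADS_PER_CORE,
        "l3_mb": ccxs * L3_MB_PER_CCX,
    }


if __name__ == "__main__":
    print(socket_totals())
```

Lower core count models in the range would presumably change one or more of these constants, so treat the numbers as describing the top configuration only.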
According to AMD, Rome doubles the performance per socket (compared to Naples) thanks to doubling the core count, and delivers 4x the peak FLOPS (twice the cores combined with the doubled floating point performance of Zen 2).
In the above diagram comparing the first generation Epyc processors against Rome, you can easily see the difference in design. With the first generation, each of the dies connects to one another using Infinity Fabric (IF).
With Rome, this has totally changed: each of the CCDs instead uses Infinity Fabric to ‘speak’ via the IO die, and there are also IF links for inter-chip communication.
Naturally, Zen 2 has a large performance uptick compared to the older Zen architectures (with Naples being based on the original Zen architecture). We’re looking at about a 15 percent IPC gain for Zen 2, and 15 percent… well, it’s a lot.
As you probably guessed, we also see the first PCIe Gen 4 server implementation, with up to 128 lanes (though naturally it’s also PCIe Gen 3 compatible). You can connect up to 32 SATA or NVMe devices to the system. Technically there are 129 lanes, as the 129th is for the BMC (the server control chip).
There’s also 8-channel memory support (compared to the 6-channel of current Intel chips), with DIMMs running at up to DDR4-3200 and 2 DIMMs per channel, for a capacity of up to 4TB per socket. Speaking of memory, if you did the math from earlier on how much L3 cache is in each CCD (32MB) and multiplied it by 8, you’d realize that we’re looking at 256MB L3 cache… crazy.
To put the above numbers into some context, Rome has a peak PCIe Gen 4 bandwidth of 512 GB/s. As for memory, we are looking at up to 64GB per core (4TB spread across 64 cores), while the 8 memory channels (assuming DDR4-3200) provide roughly 204 GB/s of DRAM bandwidth.
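For anyone who wants to check those headline figures, here is a back-of-the-envelope sketch. It rests on a few assumptions that are not spelled out in the article: the 512 GB/s figure is taken to count both directions at a nominal 2 GB/s per lane per direction (before encoding overhead), each DDR4 channel is 64 bits (8 bytes) wide, and 4TB is treated as 4096GB. The function names are invented for this example.

```python
# Rough peak-bandwidth arithmetic for one Rome socket.
# The defaults are nominal figures and assumptions, not measurements.

def pcie_bandwidth_gbs(lanes=128, gbs_per_lane_per_direction=2.0, directions=2):
    """Nominal PCIe Gen 4 bandwidth in GB/s (assumes ~2 GB/s per lane, each way)."""
    return lanes * gbs_per_lane_per_direction * directions


def dram_bandwidth_gbs(channels=8, mega_transfers_per_s=3200, bytes_per_transfer=8):
    """Peak DRAM bandwidth in GB/s for DDR4 with 64-bit (8-byte) channels."""
    return channels * mega_transfers_per_s * bytes_per_transfer / 1000


def memory_per_core_gb(capacity_tb=4, cores=64):
    """Maximum memory per core in GB, treating 1 TB as 1024 GB."""
    return capacity_tb * 1024 / cores


if __name__ == "__main__":
    print(f"PCIe Gen 4: {pcie_bandwidth_gbs():.0f} GB/s")
    print(f"DRAM:       {dram_bandwidth_gbs():.1f} GB/s")
    print(f"Per core:   {memory_per_core_gb():.0f} GB")
```

These are theoretical peaks; real-world throughput will land lower once protocol and encoding overheads are accounted for.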
There’s also a plethora of security measures in the hardware, including Secure Memory Encryption, Secure Encrypted Virtualization and even hardware-optimized security features for Spectre V2 and Meltdown.
This isn’t just a case of more CPU cores equals good, but instead more of everything equals great. Better IPC, much more bandwidth from a huge pool of DRAM, doubling the cores of the previous generation, ridiculous IO bandwidth and all at a reasonable price, too.
So, how’s performance then? Well, I think you can guess that it’s not exactly a slouch. HPE announced that it destroyed several world records thanks to AMD’s Rome. On the database virtualization side, HPE’s DL325 and DL385 crushed the old record by 321%. In power efficiency (I know, I know… it doesn’t sound as exciting, but power efficiency is super important in data centers), the Rome-equipped system beat the previous record by a staggering 28%.
In the above benchmark conducted by AnandTech, you can clearly see the lead Rome has over both its predecessor and Intel’s dual Xeon 8280 in this Java benchmark for huge pages.
ServeTheHome benchmarked the number of Linux kernel compiles per hour and once again, I feel the results speak for themselves.
For the princely sum of $6,950 USD, you can pick up the Epyc 7742 processor, with 64 cores (128 threads), a base clock of 2.25GHz and a boost of 3.4GHz, 256MB L3 cache and finally, a TDP of 225W.
Over the next year or two, Intel will be under a lot of pressure in the data center, and AMD are likely to gobble up a lot of customers. Google, Dell and HPE are just three of the many companies who’ve already said that they want to grab processors from the company.
These are definitely interesting times.