The Xbox One’s APU has been seen up close and personal, and as expected it’s not only a beast in terms of size, but much of the die is taken up by the ESRAM. In this article, we’ll take a closer look at the Xbox One’s die and specs (including the fabled 47MB of on-board RAM) and compare it to the Playstation 4.
Breakdown of the Xbox One’s 47MB of APU on-die memory
Microsoft have touted that the Xbox One’s main APU features 47MB of on-die memory. The question on many people’s lips is how they actually came up with that figure; the numbers below should help to illustrate how it is counted.
32MB eSRAM (that’s what takes up so much of the die)
4MB Level 2 cache for the CPU. That’s 2MB for each of the 2 modules, each module containing 4 AMD Jaguar cores.
512KB Level 1 cache total for the CPU
1MB SHAPE (Shape is the Xbox One’s Audio processor)
512KB Level 2 cache for the GPU – note that it doesn’t have the PS4’s volatile bit.
768KB LDS (GPU) – LDS stands for Local Data Share
96KB Scalar Data cache (GPU)
3120KB (GPU – 260KB per CU)
All of this together brings you up to a total of about 42MB. Add the buffers, the 2 geometry engine caches and redundancy, and you’ll get to about 47MB of RAM total. It’s nothing to get super excited about.
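To check the arithmetic, here is a minimal sketch that simply adds up the pools from the list above. The dictionary keys are labels made up for this example, and the last entry is named generically because the list doesn’t say what that per-CU storage actually is. It assumes 1MB = 1024KB.

```python
# On-die memory pools from the list above, in KB (assuming 1 MB = 1024 KB).
# The key names are illustrative labels, not official Microsoft or AMD terms.
POOLS_KB = {
    "eSRAM": 32 * 1024,
    "CPU L2 cache": 4 * 1024,
    "CPU L1 cache": 512,
    "SHAPE audio": 1 * 1024,
    "GPU L2 cache": 512,
    "GPU LDS": 768,
    "GPU scalar data cache": 96,
    "GPU per-CU storage (260KB x 12 CUs)": 260 * 12,
}


def total_mb(pools_kb):
    """Sum all pools and convert from KB to MB."""
    return sum(pools_kb.values()) / 1024


for name, kb in POOLS_KB.items():
    print(f"{name:38s}{kb / 1024:8.2f} MB")
print(f"{'Total':38s}{total_mb(POOLS_KB):8.2f} MB")
```

The listed pools come to just under 42MB, which leaves roughly 5MB to be covered by the buffers, geometry engine caches and redundancy to reach the quoted 47MB.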
Xbox One APU vs PS4 APU Specs
You’ll instantly see that the Playstation 4’s APU (which uses many of the same basic elements) spends the space saved by not having that pesky ESRAM on additional GCN Radeon cores (Compute Units). Both the Xbox One and the Playstation 4 have 2 more CUs on chip than are actually active: the Playstation 4 has 20 on chip with 18 active, and the Xbox One has 14 on chip with 12 active. The spares are there in case of defects during the manufacturing of the APU (in other words, to improve yields).
The CPUs of both machines are pretty much identical. Both use 8 AMD Jaguar x86-64 cores, capable of one hardware thread each. Both the X1 and PS4 CPUs feature the same 4MB total of Level 2 cache and the same Level 1 data cache. The PS4’s CPU clock speed is presently unknown – most believe it’s slightly slower than that of the Xbox One, at 1.6GHz vs the X1’s 1.75GHz. As you would expect, both CPUs feature the same out-of-order execution and are programmed in much the same way.
The major difference lies in the GPU. So much of the Xbox One’s die is taken up by the ESRAM that Microsoft had little choice but to reduce the number of GCN cores on the final package. This decision has of course drastically impacted the raw TFLOP performance of the Xbox One: its GPU manages 1.31 TFLOPS, vs the Playstation 4’s 1.84 TFLOPS of computing power. This is despite the Xbox One enjoying a 53MHz higher GPU clock speed (853MHz vs 800MHz). That speed increase is simply unable to make up the gap left by the ‘missing’ 6 GCN Compute Units.
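The TFLOP figures fall out of simple arithmetic, sketched below. It assumes the usual GCN layout of 64 shader lanes per Compute Unit, each able to do one fused multiply-add (counted as 2 floating point operations) per clock, and GPU clocks of 853MHz for the Xbox One and 800MHz for the PS4. The function names are made up for this example.

```python
def peak_tflops(compute_units, clock_mhz, lanes_per_cu=64, flops_per_lane=2):
    """Theoretical peak: CUs x lanes x FLOPs per lane per clock x clock rate.

    lanes_per_cu=64 and flops_per_lane=2 (one fused multiply-add) are the
    assumed GCN figures; they are not taken from the die shots.
    """
    flops_per_second = compute_units * lanes_per_cu * flops_per_lane * clock_mhz * 1e6
    return flops_per_second / 1e12


def clock_needed_mhz(target_tflops, compute_units, lanes_per_cu=64, flops_per_lane=2):
    """Clock (MHz) a GPU with this many CUs would need to hit target_tflops."""
    return target_tflops * 1e12 / (compute_units * lanes_per_cu * flops_per_lane) / 1e6


xbox_one = peak_tflops(compute_units=12, clock_mhz=853)
ps4 = peak_tflops(compute_units=18, clock_mhz=800)

print(f"Xbox One: {xbox_one:.2f} TFLOPS")
print(f"PS4:      {ps4:.2f} TFLOPS")
print(f"Clock for 12 CUs to match the PS4: {clock_needed_mhz(ps4, 12):.0f} MHz")
```

Under those assumptions the Xbox One lands at about 1.31 TFLOPS and the PS4 at about 1.84 TFLOPS. With only 12 CUs, the Xbox One’s GPU would need to run at around 1200MHz to draw level, which shows why a 53MHz bump barely dents the gap.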
The Playstation 4’s APU has had several other changes – in particular, the GPU has enjoyed attention. Not only does the Playstation 4’s Level 2 cache for the GPU sport a volatile bit (which allows you to selectively invalidate or modify a single cache line without needing to affect those around it), but it also has much improved ACEs (Asynchronous Compute Engines). The Playstation 4’s GPU features 8 ACEs with a total of 64 queues, while the Xbox One manages 2 ACEs with only 16 queues. Meanwhile, the additional GPU power of the PS4 includes double the ROPs (32 vs 16) and 72 Texture Units vs the Xbox One’s 48.
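To put those GPU numbers side by side, here is a small sketch that works out the per-unit ratios and the PS4’s advantage in each area. It uses only the figures quoted in this article; the dictionary layout and function names are made up for the example.

```python
# GPU figures as quoted in this article (active Compute Units only).
GPUS = {
    "PS4": {"active_cus": 18, "texture_units": 72, "rops": 32, "aces": 8, "queues": 64},
    "Xbox One": {"active_cus": 12, "texture_units": 48, "rops": 16, "aces": 2, "queues": 16},
}


def ratios(gpu):
    """Per-unit ratios implied by the quoted totals."""
    return {
        "texture_units_per_cu": gpu["texture_units"] / gpu["active_cus"],
        "queues_per_ace": gpu["queues"] / gpu["aces"],
    }


def advantage(metric):
    """How many times larger the PS4's figure is than the Xbox One's."""
    return GPUS["PS4"][metric] / GPUS["Xbox One"][metric]


for name, gpu in GPUS.items():
    print(name, ratios(gpu))
for metric in ("active_cus", "texture_units", "rops", "queues"):
    print(f"PS4 advantage in {metric}: {advantage(metric):.1f}x")
```

Both GPUs work out to 4 Texture Units per CU and 8 queues per ACE, so the texture gap simply tracks the CU count (1.5x), while the ROP gap (2x) and compute queue gap (4x) are wider than the CU count alone would suggest.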
It won’t be easy for the Xbox One to make up the gap, given its reliance on ESRAM and its lack of GPU grunt. Sony took a gamble that they’d have sufficient GDDR5 RAM in their system; they could easily have ended up with a 4GB GDDR5 system, and that was almost the case until very near launch. Microsoft could have gone with a daughter die option, very similar to the Xbox 360. This would have cost more money to produce, but would have allowed Microsoft to feature more Compute Units (say 14 or 16) and a larger amount of ESRAM (say 64 or 128MB), which would no doubt have helped the machine.
Thanks to Chipworks for the images.