We all know that in just a few short months Sony will be releasing its upgraded PlayStation 4, known to the world as the Pro, but as the release date gets closer, we’re starting to understand that there’s more to the console than just the TFLOPS and gigabytes per second.
Regular viewers might recall that I (and several other technology websites) picked up on some very key phrasing during the PS4 Pro meeting last month from lead architect Mark Cerny. “Our goal with the PS4 Pro is to deliver high-fidelity graphical experiences. With that in mind, we’ve more than doubled the power of the GPU and adopted many new features from the AMD Polaris architecture as well as several even beyond it,” Mr Cerny proclaimed during his time at the PlayStation meeting.
Cerny has also promised to talk about the numerous ‘tweaks’ to the PS4 Pro’s GPU (the PS meeting primarily discussed raw TFLOPS). But since then, Eurogamer has had a chance to try out Mantis Burn Racing, which is rendered at 4K and 60FPS on the PlayStation 4 Pro (no checkerboard rendering shenanigans), and reported the following:
“Of course, we already knew that the Pro graphics core implements a range of new instructions – it was part of the initial leak – but we didn’t really know exactly what they could actually do. As we understand it, with the new enhancements, it’s possible to complete two 16-bit floating point operations in the time taken to complete one on the base PS4 hardware. The end result from the new Radeon technology is the additional throughput required to making Mantis Burn Racing hit its 4K performance target, though significant shader optimisation was required on the part of the developer.”
So, what does this all mean? Well, essentially, the PS4 Pro can get through twice as much work when it’s performing 16-bit operations. Now, without making this article too technical (we’ll be putting out a follow-up PS4 Pro analysis soon), floating point numbers are used to represent fractional values, and the more bits you give them, the more precision you get. Essentially, 16-bit floating point operations are less ‘precise’ than their 32-bit counterparts, but each value takes up half the space, so the hardware can work on two of them at once. In this case we’ve got half-precision floating point, and the GPU can essentially run two of those operations simultaneously.
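To make the precision trade-off a little more concrete, here’s a minimal sketch in Python using only the standard library’s struct module. It isn’t PS4 Pro code; the helper names to_half and to_single are made up for illustration. It simply rounds a number to 16-bit and 32-bit floating point to compare the error, and shows that two 16-bit floats fit in the same four bytes as one 32-bit float.

```python
import struct

def to_half(x: float) -> float:
    """Round x to the nearest 16-bit (half-precision) float."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def to_single(x: float) -> float:
    """Round x to the nearest 32-bit (single-precision) float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

value = 0.1  # arbitrary example value that neither format can store exactly

half_error = abs(to_half(value) - value)
single_error = abs(to_single(value) - value)

# Two half-precision values occupy the same space as one single-precision value.
packed_pair = struct.pack("<2e", 1.5, -2.25)
packed_single = struct.pack("<f", 1.5)

print(f"fp16 rounding error: {half_error:.3e}")
print(f"fp32 rounding error: {single_error:.3e}")
print(f"bytes for two fp16 values: {len(packed_pair)}, bytes for one fp32 value: {len(packed_single)}")
```

The 16-bit error is several orders of magnitude larger than the 32-bit one, which is the ‘less precise’ part; the matching byte counts are why hardware with the right instructions can handle a pair of 16-bit values in one go.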
So, in applications and games, not every task requires 32-bit floating point levels of accuracy. Quite simply, if an operation doesn’t need to be so ‘accurate’, the developer can execute it as a 16-bit float instead. If you need a visual example, the below image represents two 128-bit instructions running simultaneously through Zen’s AVX unit, or a single 256-bit instruction. If you need more info on Zen check our full CPU analysis here.
*EDIT* I’ve had a few people message me on Twitter asking for more information on this. It’s a complex subject, but essentially it lets the hardware complete twice the work ONLY in tasks which use 16-bit floats. Which tasks those are depends on the game / application, but it’ll likely be useful in compute where less accuracy is required. A simple example would be compute in lighting, where a developer feels that 16-bit floats on a specific effect are ‘accurate enough’.
Essentially, the developers will be given a choice between ‘accuracy’ and speed.
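As a rough illustration of what ‘accurate enough’ could look like for lighting, the sketch below computes a basic diffuse (Lambert) term twice: once at full precision and once with every intermediate result rounded to a 16-bit float. The vectors, the albedo value and the function names are all invented for the example, and it runs on the CPU in Python rather than in a real shader, so treat it as a toy model only.

```python
import struct

def half(x: float) -> float:
    """Round x to the nearest 16-bit (half-precision) float."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def diffuse(normal, light, albedo, rnd=lambda v: v):
    """Lambert diffuse term; rnd is applied after every arithmetic step."""
    dot = 0.0
    for n, l in zip(normal, light):
        dot = rnd(dot + rnd(rnd(n) * rnd(l)))
    return rnd(max(dot, 0.0) * rnd(albedo))

def to_8bit(x: float) -> int:
    """Quantise a 0..1 intensity to an 8-bit colour channel."""
    return round(min(max(x, 0.0), 1.0) * 255)

# Hypothetical inputs: a unit surface normal, a unit light direction, a surface albedo.
normal = (0.0, 0.6, 0.8)
light = (0.36, 0.48, 0.8)
albedo = 0.75

full = diffuse(normal, light, albedo)            # no extra rounding
low = diffuse(normal, light, albedo, rnd=half)   # 16-bit rounding at every step

print(f"full precision: {full:.6f} -> 8-bit value {to_8bit(full)}")
print(f"half precision: {low:.6f} -> 8-bit value {to_8bit(low)}")
print(f"absolute difference: {abs(full - low):.6f}")
```

For a simple term like this the difference is tiny compared with one step of an 8-bit colour channel, which is the sort of case where a developer might decide the speed is worth more than the accuracy. Longer chains of maths, or values with a very wide range, can behave much worse at 16 bits.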
So, briefly examining the rest of the hardware inside the PlayStation 4 Pro, we’re left with the same 8-core AMD Jaguar CPU, now running at 2.1GHz (up from the 1.6GHz of the original PS4, so roughly a thirty percent improvement). The amount of RAM is also the same, with 8GB present inside the Pro, but bandwidth rises to 218GB/s from the 176GB/s of the original, and an additional 512MB of RAM is allocated to games from the system’s OS.
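For anyone who wants to check the percentages, the arithmetic is simple enough to sketch. The clock speeds and bandwidth figures are the ones quoted above; percent_gain is just a helper name made up for this example.

```python
def percent_gain(old: float, new: float) -> float:
    """Percentage increase going from old to new."""
    return (new - old) / old * 100.0

cpu_gain = percent_gain(1.6, 2.1)            # CPU clock in GHz
bandwidth_gain = percent_gain(176.0, 218.0)  # memory bandwidth in GB/s

print(f"CPU clock uplift: {cpu_gain:.1f}%")
print(f"Memory bandwidth uplift: {bandwidth_gain:.1f}%")
```

So the CPU clock is up by a little over thirty percent, while memory bandwidth grows by a bit under a quarter.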
Finally, there’s the GPU, which has 4.2TFLOPS of performance, up from the 1.84TFLOPS of the original. But remember, that’s also with an updated architecture, so not only does the PlayStation 4 Pro handle two 16-bit operations simultaneously, but Polaris (onwards) also has better color compression, a primitive discard accelerator (which nukes objects early in the rendering cycle if they’re not visible, to save performance), better geometry handling and a whole bunch more besides.
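If you’re wondering where TFLOPS figures like these come from, here’s a back-of-the-envelope sketch. Be aware that the compute unit counts and GPU clocks below are the widely reported figures for the two consoles rather than anything stated in this article, so treat them as assumptions, along with the usual GCN convention of 64 lanes per compute unit and 2 floating point operations per lane per clock (a fused multiply-add). The fp16_rate parameter is a made-up knob to model the ‘two 16-bit operations at once’ behaviour as a best case.

```python
def peak_tflops(compute_units, clock_mhz, lanes_per_cu=64, flops_per_lane=2, fp16_rate=1):
    """Theoretical peak throughput in TFLOPS (best case, not a real-world figure)."""
    flops_per_second = compute_units * lanes_per_cu * flops_per_lane * clock_mhz * 1e6
    return flops_per_second * fp16_rate / 1e12

# Assumed, widely reported configurations (not taken from this article).
ps4 = peak_tflops(compute_units=18, clock_mhz=800)
pro = peak_tflops(compute_units=36, clock_mhz=911)

# Best case if every single operation could run as a packed pair of 16-bit floats.
pro_fp16 = peak_tflops(compute_units=36, clock_mhz=911, fp16_rate=2)

print(f"PS4 peak FP32:     {ps4:.2f} TFLOPS")
print(f"PS4 Pro peak FP32: {pro:.2f} TFLOPS")
print(f"PS4 Pro peak FP16: {pro_fp16:.2f} TFLOPS (theoretical upper bound)")
print(f"Pro vs PS4 (FP32): {pro / ps4:.2f}x")
```

Under those assumptions the numbers line up with the 1.84 and 4.2TFLOPS quoted above, and with Cerny’s ‘more than doubled’ comment. The 16-bit figure is strictly an upper bound, since no real game runs entirely on 16-bit floats.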
It’s worth noting that AMD themselves say that the Polaris architecture is also radically improved at compute, which should mean that more ‘work’ can be offloaded from the console’s CPU (the Jaguar) and placed on the shoulders of the PS4 Pro’s GPU. There’s also instruction pre-fetching, a better L2 cache and native FP16 and Integer16 support. It’s likely the latter two were tweaked, and are behind the PS4 Pro’s ability to handle two 16-bit floats simultaneously.
In other words, a TFLOP isn’t always a TFLOP. The reality is that while the PS4 Pro and the PS4 share a GCN architecture as a base, the GCN inside the PS4 Pro is radically more advanced (including the checkerboard rendering).
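One way to picture why a TFLOP isn’t always a TFLOP: the double-rate trick only helps the portion of a shader’s maths that can actually run as 16-bit floats. The sketch below is a simple Amdahl’s-law style estimate, and it assumes the workload is limited purely by arithmetic throughput (ignoring memory bandwidth, texture fetches and everything else). The function name and the example fractions are invented for illustration.

```python
def shader_speedup(fp16_fraction: float, fp16_rate: float = 2.0) -> float:
    """Estimated speedup when fp16_fraction of the maths runs at fp16_rate times the speed."""
    if not 0.0 <= fp16_fraction <= 1.0:
        raise ValueError("fp16_fraction must be between 0 and 1")
    # Time for the untouched 32-bit work plus time for the accelerated 16-bit work.
    relative_time = (1.0 - fp16_fraction) + fp16_fraction / fp16_rate
    return 1.0 / relative_time

for fraction in (0.0, 0.25, 0.5, 1.0):
    print(f"{fraction:.0%} of maths in fp16 -> {shader_speedup(fraction):.2f}x")
```

Even with half of the arithmetic converted to 16-bit, this simple model only gives about a third more throughput, which fits with the point that significant shader optimisation was needed to hit the 4K target in Mantis Burn Racing.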
We’ll be putting more information out over the coming weeks for the PlayStation 4 Pro and its 16-bit simultaneous instructions, along with other analysis. There’s another question: can the Scorpio do the same thing? Well, we know considerably less about Microsoft’s Scorpio, but personally, I’d be shocked if it couldn’t do much of what the Pro can.
The biggest limitation of the PS4 Pro is likely to be the RAM, as its RAM reserves might cause issues with certain titles at 4K, but ultimately, we can only wait and see how games on the PS4 Pro will stack up versus the Xbox Scorpio. So please be sure to stick with us here at RedGamingTech for more info, analysis and insight!