It’s no accident we see both Sony and Microsoft ending up with x86 processors for the next generation of consoles. For what reasons have Sony abandoned the Cell architecture – the architecture they championed so much around the PS3’s release?
The previous consoles, the Xbox 360 and PlayStation 3, both used versions of PowerPC (sometimes referred to by the technically inclined as PPC) CPUs. Sony had even made much of its own take on the RISC design, the Cell.
Before we start talking about the various types of processors, let’s spend a moment to familiarize ourselves with the basics of CPU instructions.
CISC (Complex Instruction Set Computing) and RISC (Reduced Instruction Set Computing) both have their positives and negatives. An instruction set is the group of instructions built into the chip itself, and its job is to direct and guide the computer through manipulating the data it must process. The AMD Jaguar, like other x86 CPUs (such as those in your PC), uses a CISC instruction set.
Because each RISC instruction does less, a RISC program can need more instructions for the same job, and therefore more space in memory to store those “assembly level instructions”. On the chip itself, however, a RISC design spends less die space on decoding instructions and can use what it saves for, say, more general purpose registers. CISC places the emphasis on the hardware, while RISC puts the focus on software (the compiler in particular). PowerPC and ARM are two examples of RISC processor families.
The differences between these two kinds of instruction set mean a rather different approach to programming applications (and of course, games) for the device in question. RISC based processors often offer better theoretical performance than CISC, but at the cost of ease of development.
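To make the “more instructions for the same job” point concrete, here is a minimal sketch of a toy machine. The instruction names (`LOAD`, `ADD`, `STORE`, `ADD_MEM`) are invented for illustration and are not real x86 or PowerPC mnemonics; real instruction sets differ in the details.

```python
# Toy model only: these "instructions" are hypothetical, not real mnemonics.

def run(program, memory):
    """Execute a tiny program; return (memory, number of instructions executed)."""
    regs = {}
    count = 0
    for op, *args in program:
        count += 1
        if op == "LOAD":            # register <- memory
            reg, addr = args
            regs[reg] = memory[addr]
        elif op == "ADD":           # register <- register + register
            dst, a, b = args
            regs[dst] = regs[a] + regs[b]
        elif op == "STORE":         # memory <- register
            reg, addr = args
            memory[addr] = regs[reg]
        elif op == "ADD_MEM":       # CISC-style: works on memory directly
            dst, src = args
            memory[dst] = memory[dst] + memory[src]
        else:
            raise ValueError(f"unknown op {op}")
    return memory, count

# The same job (memory[0] += memory[1]) written both ways.
cisc_program = [("ADD_MEM", 0, 1)]
risc_program = [
    ("LOAD", "r1", 0),
    ("LOAD", "r2", 1),
    ("ADD", "r1", "r1", "r2"),
    ("STORE", "r1", 0),
]

cisc_memory, cisc_count = run(cisc_program, [5, 7])
risc_memory, risc_count = run(risc_program, [5, 7])
print(cisc_memory, cisc_count)  # [12, 7] 1
print(risc_memory, risc_count)  # [12, 7] 4
```

Both programs leave the same result in memory; the load/store version simply spells out each step as its own instruction.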
The Cell Processor:
The PlayStation 3 used the Cell processor, which was the combined work of Sony, Toshiba and IBM. Much of the processor was designed by hand (rather than with automated design tools) and it is pretty unique in terms of design. For its time, it was closer to the vector design of a chip used in a supercomputer. It was a Power architecture chip (meaning it was based on a RISC design).
The Cell’s PPE (Power Processor Element) is the main processor. The operating system and most of a game’s tasks run on it. It comes equipped with 512KB of L2 cache, and is capable of running POWER or PowerPC binaries. The PS3’s Cell runs at 3.2GHz, a clearly higher clock than AMD’s Jaguar, which even in the Xbox One runs at only 1.75GHz, roughly half the clock speed (though clock speed alone doesn’t tell you how much work gets done per cycle).
The PPE is dual threaded (meaning the single core can run 2 hardware threads) and is an “in-order” processor. In-order execution requires less scheduling logic on chip, and so less die space. There is a trade off however: performance can be somewhat erratic, because the processor stalls whenever the next instruction has to wait for its data. This means the compiler needs to do a lot of careful scheduling work, so good coding and toolsets are essential to make the most of the CPU. The PPE also farms off work to the Cell’s SPEs. You could in many ways consider it to be the manager, or orchestrator, of the Cell.
The SPEs (Synergistic Processor Elements) of the Cell are basically mini vector processors. They follow the same in-order approach as the PPE. To get the most out of the PlayStation 3, developers such as Naughty Dog needed to tap into the power of not only the PPE, but also the SPEs. They are also SIMD (Single Instruction, Multiple Data) units; if that sounds familiar, that’s because it’s very similar to how modern GPUs work. SIMD designs lend themselves well to graphical tasks, physics and the like. These types of task are exactly what the PlayStation 3’s SPEs were used for, along with others such as audio and, of course, graphical effects and anti-aliasing.
Each SPE has its own 256KB of local memory (the “local store”), which the programmer manages directly rather than relying on an automatic cache. Careful management of this memory is essential for pulling great performance out of the SPEs.
No doubt many thought that Sony would go with another version of the Cell processor, but the Cell had numerous problems. Its lack of out-of-order execution (it is in-order only), high development costs, difficulty in programming and the initial cost of the chip were all major issues. Indeed, one of the reasons the PlayStation 3 was so costly at launch was the high cost of the Cell processor.
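A rough way to see why in-order execution leans so heavily on instruction scheduling is to count cycles in a toy model. The sketch below assumes a single-issue machine, made-up latencies (4 cycles for a load, 1 for an add) and registers that are each written only once; none of these figures come from the real PPE.

```python
from collections import namedtuple

# name: label, dest: register written, srcs: registers read, latency: cycles
Instr = namedtuple("Instr", "name dest srcs latency")

def in_order_cycles(program):
    """Issue strictly in program order; stall until the sources are ready."""
    ready_at = {}   # register -> cycle at which its value is available
    cycle = 0       # earliest cycle the next instruction may issue
    finish = 0
    for ins in program:
        start = max([cycle] + [ready_at.get(s, 0) for s in ins.srcs])
        done = start + ins.latency
        ready_at[ins.dest] = done
        finish = max(finish, done)
        cycle = start + 1   # at most one issue per cycle
    return finish

def out_of_order_cycles(program):
    """Each cycle, issue the oldest instruction whose sources are ready."""
    produced = {ins.dest for ins in program}
    ready_at = {}
    pending = list(program)
    cycle = 0
    finish = 0
    while pending:
        for ins in pending:
            if all(s not in produced or ready_at.get(s, float("inf")) <= cycle
                   for s in ins.srcs):
                done = cycle + ins.latency
                ready_at[ins.dest] = done
                finish = max(finish, done)
                pending.remove(ins)
                break       # one issue per cycle
        cycle += 1
    return finish

program = [
    Instr("load a", "a", (), 4),
    Instr("b = a + 1", "b", ("a",), 1),
    Instr("load c", "c", (), 4),
    Instr("d = c + 1", "d", ("c",), 1),
]

print(in_order_cycles(program))      # 10
print(out_of_order_cycles(program))  # 6
```

In this toy model the in-order machine sits idle behind the first load, while the out-of-order one starts the second load in the meantime. A compiler (or programmer) can get the same benefit on in-order hardware by moving `load c` up next to `load a` in the program itself.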
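As a loose illustration of the SIMD idea, the sketch below counts how many “instructions” it takes to add two lists when each instruction handles one value versus a group of values. The 4-wide lane count is an assumption of the sketch, and plain Python does not really execute the lanes in parallel; only the counting is the point.

```python
LANES = 4  # assumed vector width for this sketch

def scalar_add(a, b):
    """One 'instruction' per pair of values."""
    out, ops = [], 0
    for x, y in zip(a, b):
        out.append(x + y)
        ops += 1
    return out, ops

def simd_add(a, b, lanes=LANES):
    """One 'instruction' per group of `lanes` values."""
    out, ops = [], 0
    for i in range(0, len(a), lanes):
        # a single vector instruction would handle this whole slice at once
        out.extend(x + y for x, y in zip(a[i:i + lanes], b[i:i + lanes]))
        ops += 1
    return out, ops

a = list(range(8))
b = [10] * 8
scalar_result, scalar_ops = scalar_add(a, b)
simd_result, simd_ops = simd_add(a, b)
print(scalar_ops, simd_ops)  # 8 2
```

The results are identical; the SIMD version just gets there in a quarter of the steps, which is why this style suits work where the same operation is applied across lots of data.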
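To give a feel for what managing 256KB by hand involves, here is a minimal sketch that streams a large data set through fixed-size blocks. The 64KB reserved for code and stack, the use of two buffers and the 4-byte item size are all assumptions made for illustration, and the list slice merely stands in for a transfer into local memory.

```python
LOCAL_STORE_BYTES = 256 * 1024   # per-SPE figure from the text
RESERVED_BYTES = 64 * 1024       # assumption: kept aside for code and stack
BUFFERS = 2                      # assumption: double buffering
BYTES_PER_ITEM = 4               # assumption: 32-bit values

def items_per_buffer():
    """How many items fit in one buffer under the assumptions above."""
    usable = LOCAL_STORE_BYTES - RESERVED_BYTES
    return usable // BUFFERS // BYTES_PER_ITEM

def process_in_chunks(data, kernel):
    """Feed `data` to `kernel` one buffer-sized block at a time."""
    n = items_per_buffer()
    results, chunks = [], 0
    for start in range(0, len(data), n):
        local = data[start:start + n]   # stands in for a copy into local memory
        results.extend(kernel(local))
        chunks += 1
    return results, chunks

def double_all(block):
    return [2 * x for x in block]

results, chunks = process_in_chunks(list(range(100_000)), double_all)
print(items_per_buffer(), chunks)  # 24576 5
```

With two buffers, one block can in principle be worked on while the next is being fetched, which is the kind of juggling that made SPE programming rewarding but fiddly.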
Back at QuakeCon, John Carmack mentioned the possibility of an ARM based generation:
“We can imagine something that maybe had 16 ARM cores with a whole bunch of PowerVR graphics cores, and you could have made a pretty good console with that,” he suggested. “I suspect it was the 64-bit not being quite cooked enough on there that really made the difference, and the fact that huge amounts of memory are obviously a big deal for the console platforms now.”
8GB of RAM was pretty much what all developers really wanted for this generation; they had felt limited for long enough by the 512MB of RAM in the previous generation of systems. They’d quickly fill up 4GB of RAM, and the amount actually left for games would be far less once you consider the overheads of the larger, more multi-task oriented operating systems of the next gen.
I also suspect that a many-core ARM design would have gone against the design plans of both companies. They wanted a simple device, not one with lots of different cores and moving parts. Developers had also told them that they didn’t want tons of hardware threads; 6 to 8 would be ideal. Multi-threaded programming is still difficult to do, and more threads would no doubt have made the lives of programmers and developers that much trickier.
PowerPC was very popular last generation, huh? Not only did the PS3 use the Cell (which is based on PowerPC instructions), the Xbox 360 of course used PowerPC too. PowerPC was also very popular in Apple’s Macs, but Apple has since abandoned it and gone the Intel x86 route.
Cost has been said to be one of Apple’s primary motivators, but I suspect that being much more compatible with Windows software (with many Macs ending up with Windows partitions) didn’t hurt either. The Cell processor is in some ways very similar to the PowerPC, and indeed the Cell’s PPE and a PowerPC core are very much two sides of the same coin.
The Xbox 360 had its own processor last generation, a tri-core design (each core with 2 threads, giving 6 in total). It was fairly successful, and far easier to program for than the Cell. For games developers, the Xbox 360’s processor ran at 3.2GHz and offered plenty of power. Sometimes, however, a whole thread on the CPU would be gobbled up purely for something such as processing audio, which was a processor intensive job on the Xbox 360. Unlike the Cell, the PowerPC CPU doesn’t have its little buddies (the SPEs) to hand work off to.
This came with the benefit of being easier to program for, but still wasn’t ideal. One can guess at numerous reasons MS decided to abandon it.
I’d imagine wanting to run a version of Windows easily would be one (of course, Windows could have been compiled for the PowerPC architecture, but that would no doubt have added a bit of complexity). I suspect they also wanted to reduce production costs, and go with a solution that was already available.
PowerPC is a RISC CPU, and as we discussed earlier, that means some of the more complex instructions are simply missing. A programmer can get around that with more code, but for games development and graphics libraries, it is probably not the way they wanted to go.
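For a flavour of what spreading work over a handful of hardware threads looks like, here is a minimal sketch that splits a job into 8 slices and hands them to a thread pool. The worker count and the sum-of-squares job are purely illustrative, and in standard CPython threads like these typically will not speed up arithmetic; the sketch only shows the structure of splitting and recombining.

```python
from concurrent.futures import ThreadPoolExecutor

WORKERS = 8  # assumption: one worker per hardware thread

def split(items, parts):
    """Split items into `parts` nearly equal contiguous slices."""
    size, extra = divmod(len(items), parts)
    slices, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        slices.append(items[start:end])
        start = end
    return slices

def sum_of_squares(chunk):
    return sum(x * x for x in chunk)

def parallel_sum_of_squares(items, workers=WORKERS):
    """Give each worker one slice, then combine the partial results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(sum_of_squares, split(items, workers)))

total = parallel_sum_of_squares(list(range(1000)))
print(total)  # 332833500
```

This case is easy because the slices never touch each other's data. The difficulty developers complain about starts when threads need to share and update the same state.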
So – x86-64 then? – Enter AMD Jaguar
The x86-64 architecture (the 64 part meaning 64-bit, which lets it address more than 4GB of RAM) is firmly established by now, and to many of us requires little introduction. Mark Cerny (lead architect of the PS4) has recounted his struggle convincing Sony that x86 was the route they needed to go.
“Because of its very long history, the x86 is rather complex. You can run code today that was written for the original chip in 1978. They just kept adding things for [over 30] years. If you read an x86 manual, it takes pages to explain all the different ways that you can move data from one register to another, based on all the additions to the architecture over the years. It’s a bit overwhelming from that perspective. Also, it gets very technical. It’s a CISC architecture rather than RISC architecture. There was definitely a first-party voice that said [the x86] probably couldn’t be used in a games.”
To put that into perspective, it means that if you’re a time traveller from the early 80’s, you’d be able to come to the present with a program written for the x86 chips of your day and still get results. x86-64 also has another key benefit: a lot of different hardware choices, which helps to reduce the cost of the console.
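As for the “64” in x86-64 and the 4GB figure that keeps coming up, the arithmetic is short enough to check directly. The sketch below assumes one byte per address, and treats the 64-bit figure as a theoretical ceiling, since real chips wire up fewer address bits than that.

```python
GIB = 2 ** 30  # bytes in one gibibyte

def addressable_bytes(address_bits):
    """Bytes reachable with the given number of address bits, one byte per address."""
    return 2 ** address_bits

limit_32 = addressable_bytes(32) // GIB
limit_64 = addressable_bytes(64) // GIB
print(limit_32)  # 4
print(limit_64)  # 17179869184
```

So a 32-bit address can reach 4GB and no further, which is why 8GB of RAM in a console more or less demands a 64-bit processor.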
The Xbox One and PlayStation 4 are not the first kids on the x86 block however. In the early 2000’s Microsoft released the original Xbox, and it too was x86. It used an Intel Celeron/Pentium 3 hybrid (running at 733MHz), and combined that with 64MB of RAM and an Nvidia GeForce-based graphics chip.
One of the problems with x86 processors until recently is that they’ve been fairly power hungry and haven’t offered the best performance-per-watt ratio available. But with processors such as Intel’s Atom and, of course, AMD’s Jaguar series (which is an update to the Bobcat core), things are starting to change.
The Jaguar CPU inside both machines is made up of two modules, each holding four cores (a total of eight), with each core handling one hardware thread. The design is an APU, meaning that on one package (an SoC, or System on Chip) you’ve got the CPU, the GPU (based on the Radeon GCN architecture) and most of the northbridge. With the CPU and GPU this close together, GPGPU and asynchronous compute become much easier to achieve.
I suspect the primary reason behind most of this is to encourage developers to port their games over in the quickest way possible. Sony have reached out to indie developers in a big way, and Microsoft aren’t that far behind any longer either. It just makes sense to give developers a lot of memory and an easy to use CPU and GPU, and let them do what they want to do.
Mark Cerny also spoke about how Sony were much better able to go and speak to different hardware vendors, and find one which could match their needs. Sony and Microsoft had different visions for their consoles, but their basic needs were fairly similar: cheap to mass produce, high yields, powerful, low power draw and low heat output. This means you’re able to design a console which is easy to cool and won’t overheat, without having to stick a big fan on it.
The APU design of both systems means you’ll be able to use the GPU to help the CPU with a lot of the complex work which is better suited to parallel computing. This work can be farmed off to the GPU by the CPU, where it can be executed in a SIMD (Single Instruction, Multiple Data) fashion.
The PS4 in particular uses hUMA (Heterogeneous Uniform Memory Access) to make this easier for developers. The Xbox One does have a ‘version’ of sorts, known as Coherent Memory.
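Putting the thread counts and clock speeds quoted in this article side by side takes only a few lines. The helper name below is hypothetical and the numbers are just the ones given in the text.

```python
def hardware_threads(modules, cores_per_module, threads_per_core):
    """Total hardware threads for a simple modules x cores x threads layout."""
    return modules * cores_per_module * threads_per_core

jaguar_threads = hardware_threads(2, 4, 1)    # two modules of four cores
xbox360_threads = hardware_threads(1, 3, 2)   # tri-core, 2 threads per core
ppe_threads = hardware_threads(1, 1, 2)       # the Cell's PPE on its own

clock_ratio = 1.75 / 3.2  # Xbox One Jaguar clock vs the 3.2GHz last-gen parts

print(jaguar_threads, xbox360_threads, ppe_threads)  # 8 6 2
print(round(clock_ratio, 2))                         # 0.55
```

So the new machines offer more hardware threads at roughly half the clock speed, which fits the 6 to 8 threads developers were asking for.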
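As a loose analogy for what shared memory saves, the sketch below compares a job that copies data to a separate buffer and back with one that works on the same memory in place. It is ordinary Python using `bytearray` and `memoryview`, not GPU code, and the function names are made up for illustration.

```python
def process_with_copies(cpu_buffer):
    """Copy to a separate 'GPU' buffer, work on it, then copy the results back."""
    gpu_buffer = bytearray(cpu_buffer)          # copy: CPU -> "GPU" memory
    for i in range(len(gpu_buffer)):
        gpu_buffer[i] = (gpu_buffer[i] * 2) % 256
    cpu_buffer[:] = gpu_buffer                  # copy: "GPU" -> CPU memory
    return 2 * len(cpu_buffer)                  # bytes moved in total

def process_shared(cpu_buffer):
    """Work directly on the same memory through a view; nothing is copied."""
    shared = memoryview(cpu_buffer)
    for i in range(len(shared)):
        shared[i] = (shared[i] * 2) % 256
    return 0                                    # bytes moved

first = bytearray([1, 2, 3, 4])
second = bytearray([1, 2, 3, 4])
copied = process_with_copies(first)
moved = process_shared(second)
print(list(first), copied)   # [2, 4, 6, 8] 8
print(list(second), moved)   # [2, 4, 6, 8] 0
```

Both routes give the same answer, but the second never shuffles the data around, and that shuffling is the overhead a unified memory design aims to remove.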