The next-generation wars are heating up. Microsoft’s done the announcement-of-an-announcement thing, letting us know that they’ll be ready to show the world the next Xbox on May 21. We’ve heard a fair deal about Sony’s PlayStation 4 as well, including details of its architecture. Speaking to Gamasutra, the PlayStation 4’s lead architect, Mark Cerny, has gone into a little more depth about the choices made in designing the system’s innards, which they’re calling “Supercharged PC architecture”.
What the heck do they mean by supercharged? Well, it’s all a little technical and complicated – but Cerny and Sony’s other hardware engineers have spent long hours trying to come up with a system that isn’t held back by the sort of bottlenecks you get on PCs.
"A typical PC GPU has two buses," said Cerny. "There’s a bus the GPU uses to access VRAM, and there is a second bus that goes over the PCI Express that the GPU uses to access system memory. But whichever bus is used, the internal caches of the GPU become a significant barrier to CPU/GPU communication – any time the GPU wants to read information the CPU wrote, or the GPU wants to write information so that the CPU can see it, time-consuming flushes of the GPU internal caches are required."
So how does one get around that sort of thing? According to Cerny, they’ve made three fundamental changes to the architecture, to make it more efficient.
- "First, we added another bus to the GPU that allows it to read directly from system memory or write directly to system memory, bypassing its own L1 and L2 caches. As a result, if the data that’s being passed back and forth between CPU and GPU is small, you don’t have issues with synchronization between them anymore. And by small, I just mean small in next-gen terms. We can pass almost 20 gigabytes a second down that bus. That’s not very small in today’s terms – it’s larger than the PCIe on most PCs!"
- "Next, to support the case where you want to use the GPU L2 cache simultaneously for both graphics processing and asynchronous compute, we have added a bit in the tags of the cache lines, we call it the ‘volatile’ bit. You can then selectively mark all accesses by compute as ‘volatile,’ and when it’s time for compute to read from system memory, it can invalidate, selectively, the lines it uses in the L2. When it comes time to write back the results, it can write back selectively the lines that it uses. This innovation allows compute to use the GPU L2 cache and perform the required operations without significantly impacting the graphics operations going on at the same time — in other words, it radically reduces the overhead of running compute and graphics together on the GPU."
- Thirdly, said Cerny, "The original AMD GCN architecture allowed for one source of graphics commands, and two sources of compute commands. For PS4, we’ve worked with AMD to increase the limit to 64 sources of compute commands — the idea is if you have some asynchronous compute you want to perform, you put commands in one of these 64 queues, and then there are multiple levels of arbitration in the hardware to determine what runs, how it runs, and when it runs, alongside the graphics that’s in the system."
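If the ‘volatile’ bit in the second point is hard to picture, here’s a rough toy model of the idea in Python. To be clear, this is only a sketch of the concept as Cerny describes it, not how Sony’s or AMD’s hardware actually works: the `ToyL2Cache` class, its method names and the dictionary standing in for system memory are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class CacheLine:
    data: int
    dirty: bool = False      # modified in cache, not yet written to memory
    volatile: bool = False   # hypothetical stand-in for the 'volatile' tag bit


class ToyL2Cache:
    """Toy write-back cache shared by graphics and compute (illustrative only)."""

    def __init__(self, memory):
        self.memory = memory  # dict of address -> value, stands in for system memory
        self.lines = {}       # address -> CacheLine

    def read(self, addr, volatile=False):
        line = self.lines.get(addr)
        if line is None:  # miss: fetch from memory and tag the new line
            line = CacheLine(self.memory[addr], volatile=volatile)
            self.lines[addr] = line
        return line.data

    def write(self, addr, value, volatile=False):
        self.lines[addr] = CacheLine(value, dirty=True, volatile=volatile)

    def invalidate_volatile(self):
        """Drop only clean compute-tagged lines; graphics lines stay cached."""
        stale = [a for a, l in self.lines.items() if l.volatile and not l.dirty]
        for addr in stale:
            del self.lines[addr]
        return len(stale)

    def write_back_volatile(self):
        """Flush only dirty compute-tagged lines to memory."""
        flushed = 0
        for addr, line in self.lines.items():
            if line.volatile and line.dirty:
                self.memory[addr] = line.data
                line.dirty = False
                flushed += 1
        return flushed


memory = {0x10: 1, 0x20: 2}
l2 = ToyL2Cache(memory)
l2.read(0x10)                          # graphics access, untagged
l2.read(0x20, volatile=True)           # compute access, tagged volatile
memory[0x20] = 99                      # the CPU changes memory behind the cache
dropped = l2.invalidate_volatile()     # only the compute line is thrown out
fresh = l2.read(0x20, volatile=True)   # refetched, so compute sees the new value
l2.write(0x20, fresh + 1, volatile=True)
flushed = l2.write_back_volatile()     # only the compute line is written back
```

The point of the toy is that the graphics line at `0x10` never leaves the cache: compute gets to see fresh data and publish its results without forcing a full flush on everything else.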
If you’re not particularly technically minded (like Darryn), your eyes have quite likely glazed over by now. If you are, though, you can quite plainly see that Sony’s hardware architects have put a heck of a lot of thought (of the forward-thinking variety) into the system, and have learned from their past mistakes. The PS4 is going to be a beast, utilising a custom-designed, more efficient take on existing PC hardware that should be an absolute dream for game developers.
There’s a lot more to it in Gamasutra’s 3-page in-depth look at what the PS4 offers, and it’s well worth a look if you’re even remotely interested in the PS4’s hardware.
Last Updated: April 25, 2013