Nvidia believes its 2016 GPUs will be 10 times as powerful as Maxwell

Right now, Nvidia’s Titan X is the undisputed king of GPUs – and for $1000, I’d bloody well hope so. It could face stiff competition this year from AMD’s R9 390X, which speculation, rumour and hearsay suggest will offer better performance at a fraction of the price. Already, though, Nvidia’s punting its next generation of GPUs, its Pascal-based cards, which it says will offer 10x the performance of its already impressive Maxwell chips.


The increase in performance, says Nvidia, comes from its stacked 3D memory and the introduction of NVLink, a new communications channel that will enable a much greater flow of data between the CPU and GPU. NVLink will also allow more GPUs to be duct-taped together – with up to 8 GPUs able to be linked for both professional and gaming purposes.

Here’s what Nvidia says will make 2016’s Pascal-based GPUs so enticing:

  • 3D Memory: Stacks DRAM chips into dense modules with wide interfaces, and brings them inside the same package as the GPU. This lets GPUs get data from memory more quickly – boosting throughput and efficiency – allowing us to build more compact GPUs that put more power into smaller devices. The result: several times greater bandwidth, more than twice the memory capacity and quadrupled energy efficiency.
  • Unified Memory: This will make building applications that take advantage of what both GPUs and CPUs can do quicker and easier by allowing the CPU to access the GPU’s memory, and the GPU to access the CPU’s memory, so developers don’t have to allocate resources between the two.
  • NVLink: Today’s computers are constrained by the speed at which data can move between the CPU and GPU. NVLink puts a fatter pipe between the CPU and GPU, allowing data to flow at more than 80GB per second, compared to the 16GB per second available now.
  • Pascal Module: NVIDIA has designed a module to house Pascal GPUs with NVLink. At one-third the size of the standard boards used today, they’ll put the power of GPUs into more compact form factors than ever before.
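
To put the NVLink figures in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the two rates quoted above (16GB per second today, 80GB per second with NVLink); the 12GB payload and the function names are made up for illustration, and the sums ignore latency and protocol overhead, so treat the results as best-case numbers rather than benchmarks.

```python
# Rates quoted in the article, in gigabytes per second.
PCIE_GB_PER_S = 16.0
NVLINK_GB_PER_S = 80.0


def transfer_seconds(payload_gb: float, rate_gb_per_s: float) -> float:
    """Idealised time to move payload_gb at a sustained rate.

    Ignores latency and protocol overhead, so real transfers take longer.
    """
    if rate_gb_per_s <= 0:
        raise ValueError("rate must be positive")
    return payload_gb / rate_gb_per_s


def speedup(new_rate: float, old_rate: float) -> float:
    """How many times faster the new link is than the old one."""
    return new_rate / old_rate


if __name__ == "__main__":
    payload_gb = 12.0  # hypothetical payload size, not a figure from Nvidia
    old = transfer_seconds(payload_gb, PCIE_GB_PER_S)
    new = transfer_seconds(payload_gb, NVLINK_GB_PER_S)
    print(f"Today:  {old:.2f} s")
    print(f"NVLink: {new:.2f} s")
    print(f"Speedup: {speedup(NVLINK_GB_PER_S, PCIE_GB_PER_S):.1f}x")
```

On those quoted rates alone, the link works out to five times faster than what is available now.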

Mixed-Precision Computing for Greater Accuracy

Mixed-precision computing enables Pascal architecture-based GPUs to compute at 16-bit floating point accuracy at twice the rate of 32-bit floating point accuracy.

Increased floating point performance particularly benefits classification and convolution – two key activities in deep learning – while achieving needed accuracy.
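
For a feel of what the 16-bit versus 32-bit trade-off means, the sketch below uses Python's standard struct module to round-trip a number through each format. It only shows the storage size and rounding error of the two formats on a CPU; it says nothing about how Pascal actually schedules the maths, and the helper names are invented for this example.

```python
import struct

# Bytes needed to store one value in each format.
HALF_BYTES = struct.calcsize("<e")    # IEEE 754 half precision (16-bit)
SINGLE_BYTES = struct.calcsize("<f")  # IEEE 754 single precision (32-bit)


def round_trip(value: float, fmt: str) -> float:
    """Pack value into the given format and unpack it again.

    The result is the value as that format would actually store it.
    """
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


def to_half(value: float) -> float:
    """Value as stored in 16-bit floating point."""
    return round_trip(value, "<e")


def to_single(value: float) -> float:
    """Value as stored in 32-bit floating point."""
    return round_trip(value, "<f")


if __name__ == "__main__":
    x = 0.1  # arbitrary sample value
    print(f"16-bit: {HALF_BYTES} bytes, stored as {to_half(x)!r}")
    print(f"32-bit: {SINGLE_BYTES} bytes, stored as {to_single(x)!r}")
    print(f"16-bit error: {abs(to_half(x) - x):.3e}")
    print(f"32-bit error: {abs(to_single(x) - x):.3e}")
```

Each 16-bit value takes half the space of a 32-bit one, at the cost of a larger rounding error, which is why it suits workloads that can tolerate reduced precision.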

3D Memory for Faster Communication Speed and Power Efficiency

Memory bandwidth constraints limit the speed at which data can be delivered to the GPU. The introduction of 3D memory will provide 3X the bandwidth and nearly 3X the frame buffer capacity of Maxwell. This will let developers build even larger neural networks and accelerate the bandwidth-intensive portions of deep learning training.

Pascal will have its memory chips stacked on top of each other, and placed adjacent to the GPU, rather than further down the processor boards. This reduces from inches to millimeters the distance that bits need to travel as they traverse from memory to GPU and back. The result is dramatically accelerated communication and improved power efficiency.

NVLink – for Faster Data Movement

The addition of NVLink to Pascal will let data move between GPUs and CPUs five to 12 times faster than it can with today’s standard, PCI-Express. This greatly benefits applications, such as deep learning, that have high inter-GPU communication needs.

NVLink allows for double the number of GPUs in a system to work together in deep learning computations. In addition, CPUs and GPUs can connect in new ways to enable more flexibility and energy efficiency in server design compared to PCI-E.

It does mean you’d probably have to buy a new board to make use of a new card, but if it genuinely produces 10x the output of a Maxwell card, it’d be worth it.

Last Updated: March 18, 2015

Geoffrey Tim


