Nvidia’s 2016 roadmap shows huge performance gains from upcoming Pascal architecture
03-18-2015 at 03:23 PM
At Nvidia’s keynote today to kick off GTC, CEO Jen-Hsun Huang spent most of his time discussing Nvidia’s various deep learning initiatives and pushing the idea of Tegra as integral to the self-driving car. He did, however, take time to introduce a new Titan X GPU — and to discuss the future of Nvidia’s roadmap.
When Nvidia’s next-generation GPU architecture, codenamed Pascal, arrives next year, it’s going to pack a variety of performance improvements for scientific computing, though their impact on the gaming world is less clear.
Let’s start at the beginning:
Pascal is Nvidia’s follow-up to Maxwell, and the first desktop chip to use TSMC’s 16nmFF+ (FinFET+) process. 16nmFF+ is the second-generation follow-up to TSMC’s first FinFET technology: the first generation is expected to be available this year, while FF+ won’t ship until sometime next year. This confirms that Nvidia chose to skip 20nm, something we predicted nearly three years ago.
Jen-Hsun claims that Pascal will achieve over 2x the performance per watt of Maxwell in single-precision general matrix multiplication (SGEMM). But there are two caveats to this claim, as far as gamers are concerned. First, recall that improvements to performance per watt, while vital, are not the same thing as improvements to top-line performance. Second, boosting the card’s SGEMM performance doesn’t necessarily tell us much about gaming.
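To make the first caveat concrete, here is a minimal sketch of how an efficiency gain translates into top-line performance. The function name and the wattages are hypothetical, chosen only for illustration; Nvidia has not announced power targets for Pascal.

```python
def relative_speedup(efficiency_gain: float, old_power_w: float, new_power_w: float) -> float:
    """Top-line speedup implied by a performance-per-watt gain.

    Performance = (performance per watt) x (power drawn), so the speedup
    is the efficiency gain scaled by the ratio of the two power budgets.
    """
    return efficiency_gain * new_power_w / old_power_w


# Hypothetical power budgets, not figures from Nvidia.
same_budget = relative_speedup(2.0, old_power_w=250, new_power_w=250)
lower_budget = relative_speedup(2.0, old_power_w=250, new_power_w=150)

print(f"2x perf/W at the same power budget: {same_budget:.2f}x")
print(f"2x perf/W at a smaller power budget: {lower_budget:.2f}x")
```

A 2x efficiency gain only becomes a 2x performance gain if the new card draws as much power as the old one. If the power budget shrinks, part of the gain goes to lower consumption rather than higher frame rates.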
The graph above, drawn from Nvidia’s own files on Fermi-based Tesla cards compared with K20 (GK110), makes the point. K20X was much, much faster than Fermi, but despite being 3.2x faster in SGEMM calculations, it was rarely 3x faster in actual gaming tests, as this comparison from Anandtech makes clear.
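One way to see why a 3.2x SGEMM gain need not show up in games is an Amdahl’s-law style estimate. This is our own illustration, not something Nvidia or Anandtech presented, and the fractions below are hypothetical: nobody has published how much of a game’s frame time behaves like dense matrix math.

```python
def overall_speedup(accelerated_fraction: float, kernel_speedup: float) -> float:
    """Amdahl's law: only part of the workload enjoys the kernel speedup."""
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    untouched = 1.0 - accelerated_fraction
    return 1.0 / (untouched + accelerated_fraction / kernel_speedup)


# 3.2x is the SGEMM figure quoted above; the fractions are made up.
for fraction in (1.0, 0.75, 0.5):
    gain = overall_speedup(fraction, 3.2)
    print(f"{fraction:.0%} of the work sped up 3.2x -> {gain:.2f}x overall")
```

Only a workload that is entirely SGEMM-like gets the full 3.2x. Anything bound by other parts of the chip pulls the overall figure down.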
Pascal’s next improvement will be its use of HBM, or High Bandwidth Memory. Nvidia is claiming it will offer up to 32GB of RAM per GPU at 3x the memory bandwidth. That would put Pascal at close to 1TB/s of theoretical bandwidth, depending on RAM clock, a huge leap forward for all GPUs.
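The arithmetic behind that estimate can be sketched as follows. The baseline is an assumption on our part: Nvidia did not say which Maxwell card the 3x figure is measured against, so we use a 384-bit GDDR5 bus at 7Gbps per pin, the configuration of the Titan X.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Theoretical peak memory bandwidth in GB/s.

    Each pin moves data_rate_gbps gigabits per second; dividing the
    total by 8 converts bits to bytes.
    """
    return bus_width_bits * data_rate_gbps / 8


# Assumed baseline: 384-bit bus, 7Gbps effective GDDR5 (Titan X class).
maxwell_gb_s = peak_bandwidth_gb_s(bus_width_bits=384, data_rate_gbps=7.0)
pascal_estimate_gb_s = 3 * maxwell_gb_s  # the "3x the memory bandwidth" claim

print(f"Assumed Maxwell baseline: {maxwell_gb_s:.0f} GB/s")
print(f"3x that baseline: {pascal_estimate_gb_s:.0f} GB/s")
```

Under that assumption, 3x works out to roughly 1,000GB/s. The actual figure will depend on the HBM clock Nvidia ships.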
Jen-Hsun’s napkin math claims Pascal will offer up to 10x Maxwell performance “in extremely rough estimates.”
Note: Nvidia might roll out that much memory bandwidth to its consumer products, but 32GB frame buffers are unlikely to jump to the mainstream next generation. Even the most optimistic developers would be hard-pressed to use that much RAM when the majority of the market is still using GPUs with 2GB or less.
Pascal will be the first Nvidia product to debut with variable precision capability. If this sounds familiar, it’s because AMD appears to have debuted a similar capability last year.
It’s not clear yet how Nvidia’s lower-precision capabilities dovetail with AMD’s, but Jen-Hsun referred to 4x the FP16 performance in mixed mode compared with standard precision (he might have meant single or double precision).
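For readers unfamiliar with FP16, the short sketch below shows what the format gives up relative to FP32. It uses Python’s standard struct module to round values through each format and says nothing about how Pascal’s hardware works. The helper names are ours.

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round x to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# FP16 takes half the storage of FP32 ...
print("bytes per value:", struct.calcsize("<e"), "vs", struct.calcsize("<f"))

# ... at the cost of precision: it keeps only about three decimal digits.
for value in (0.1, 1000.001, 2049.0):
    print(f"{value!r}: fp32 -> {to_fp32(value)!r}, fp16 -> {to_fp16(value)!r}")
```

Half-width values mean twice as many numbers fit in the same registers and memory transfers. That makes lower precision attractive for workloads that can tolerate the rounding error, such as the deep learning applications that dominated the keynote.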
Finally, Pascal will be the first Nvidia GPU to use NVLink, a custom high-bandwidth interconnect for Nvidia GPUs. Again, for now, NVLink is aimed at enterprise customers. Last year, Jen-Hsun noted that the implementations for ARM and IBM CPUs had been finished, but that x86 chips faced non-technical issues (likely licensing problems). Nvidia could still use NVLink in a consumer dual-GPU card, however.
Pascal seems likely to deliver a huge uptick in Nvidia’s performance and efficiency. And given that the company managed to eke the equivalent of a generation’s worth of performance out of Maxwell while sticking with 28nm, there’s no reason to think it won’t pull it off. In the scientific market, at least, Nvidia is gunning for Xeon Phi — AMD has very little presence in this space, and that seems unlikely to change. If Sunnyvale does launch a new architecture in the next few months, we could actually see some of these features debuting first on Team Red, but the fabled Fiji’s capabilities remain more rumor than fact.