Nvidia has unveiled its next-generation Turing architecture at SIGGRAPH, a computer graphics conference. Alongside it, the company announced Quadro RTX graphics cards for workstation users that are built on the new technology. This is a small surprise, as most expected Turing to appear first in the GTX 1180 and other GeForce GPUs in the near future. Those GeForce GPUs will probably still reach the market first, but we now have some clear ideas of what to expect.
First, Turing is the name of the architecture, and it is the next evolution of Nvidia's Volta GPUs. Nvidia refers to Turing as its "eighth-generation GPU architecture," but I'm not sure where that number comes from. Working backwards, there are Volta, Pascal, Maxwell, Kepler, Fermi and Tesla (six generations), which would make Turing the seventh. However, before it began naming architectures after famous scientists, Nvidia had six more generations of GeForce architectures. Perhaps everything before the introduction of CUDA cores in 2006 counts as a single "architecture"? It's no worse than Intel's CPU generations, so let's move on.
The big surprise with Turing is that it accelerates ray-tracing with Tensor cores, which were first available in the Volta GV1
Tensor cores are more of a known quantity. They provide a dense set of computational units that can accelerate machine learning. The Volta GV100 contains 640 Tensor cores with a peak compute rate of up to 110 TFLOPS (trillion floating-point operations per second) for FP16 (16-bit floating-point) workloads. With Turing, Nvidia says it can "perform up to 500 trillion tensor operations per second," though these are INT4 (4-bit integer) operations rather than FP16 operations. Nvidia says Turing processors will have up to 576 Tensor cores, a step down from Volta, but Turing processors should still prove extremely capable at deep learning training and inference.
In addition to these new features, Turing will also include traditional graphics support via Nvidia's CUDA cores. Many have speculated, incorrectly, about the number of CUDA cores we would see in Turing, but Nvidia has now provided at least two figures. The new Turing GPUs will initially top out at 4,608 CUDA cores, a small step down from the maximum of 5,120 in the Volta GV100. A lower-tier product has 3,072 CUDA cores, which would be a big upgrade for mid-range GPUs. The Turing SM (Streaming Multiprocessor) has also been redesigned, with a new ability to execute floating-point and integer operations in parallel. This gives Turing a peak rate of 16 TFLOPS for floating-point operations (presumably FP32) and 16 TOPS for integer operations.
The 16 TFLOPS figure also gives us a realistic target for boost clocks on the Turing GPUs. With 4,608 cores each performing one FP32 FMA (two FLOPS) per clock, hitting 16 TFLOPS would require a clock speed of 1736 MHz. Forget the rumors of 5,120 cores at 1.5 GHz or 3,840 cores at 2.5 GHz. Everything indicates that Turing will have clock speeds similar to those of Pascal, with only 20 percent more CUDA cores. And we can expect Nvidia to push the RT cores and RTX branding hard.
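The clock-speed arithmetic above can be checked with a short sketch. The function names here are hypothetical, and the model assumes peak throughput is simply cores times two FLOPS (one FMA) per clock, which is how the 1736 MHz figure is derived in the text.

```python
def required_clock_mhz(tflops: float, cuda_cores: int, flops_per_clock: int = 2) -> float:
    """Clock (MHz) needed to reach a peak TFLOPS figure.

    Assumes every core retires one FMA (counted as two FLOPS) per clock.
    """
    flops = tflops * 1e12
    return flops / (cuda_cores * flops_per_clock) / 1e6


def peak_tflops(cuda_cores: int, clock_ghz: float, flops_per_clock: int = 2) -> float:
    """Peak TFLOPS for a given core count and clock, under the same assumption."""
    return cuda_cores * flops_per_clock * clock_ghz * 1e9 / 1e12


if __name__ == "__main__":
    # Top Turing configuration from the article: 4,608 cores at 16 TFLOPS.
    print(f"Required clock: {required_clock_mhz(16, 4608):.0f} MHz")
    # The two rumored configurations, for comparison with the 16 TFLOPS figure.
    print(f"5,120 cores @ 1.5 GHz: {peak_tflops(5120, 1.5):.2f} TFLOPS")
    print(f"3,840 cores @ 2.5 GHz: {peak_tflops(3840, 2.5):.2f} TFLOPS")
```

Under this simple model, neither rumored configuration lands on 16 TFLOPS: the first falls a little short and the second overshoots by a wide margin.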
Along with the discussion of the Turing architecture, Nvidia also talked about the upcoming Quadro RTX cards, scheduled for the fourth quarter of this year. These are professional GPUs designed for workstations used for film and video production, CAD/CAM, and scientific workloads. The new naming scheme underscores the importance that Nvidia attaches to the RT cores, and it is likely that future GeForce cards will carry the same RTX branding. As discussed on Reddit last week, Nvidia has registered a number of new trademarks, including Quadro RTX and GeForce RTX. So the upcoming 1180 cards? They'll probably be called GeForce RTX 1180, and so on.
Let's finish with some hard specifications for the Quadro parts. Nvidia has provided details on three models: the Quadro RTX 8000, 6000 and 5000.
The above specifications look impressive, and it has been reported elsewhere that the new cards use GDDR6 memory running at 14 GT/s. We do not know the memory bus width, but the 24/48 GB cards almost certainly have a 384-bit bus, while the 16 GB card uses a 256-bit bus. This corresponds to a memory bandwidth of 672 GB/s for the high-end models and 448 GB/s for the lower-spec card. The Quadro cards also support 100 GB/s NVLink connectors, which could make their way to high-end GeForce cards for 2-way SLI.
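The bandwidth figures follow directly from the data rate and bus width, as this minimal sketch shows. The function name is hypothetical, and the 384-bit and 256-bit widths are the article's guesses rather than confirmed specifications.

```python
def memory_bandwidth_gb_s(data_rate_gt_s: float, bus_width_bits: int) -> float:
    """Peak memory bandwidth in GB/s.

    Each transfer moves bus_width_bits bits, so divide by 8 to get bytes.
    """
    return data_rate_gt_s * bus_width_bits / 8


if __name__ == "__main__":
    # Assumed bus widths (not confirmed by Nvidia), with 14 GT/s GDDR6.
    for label, width in (("24/48 GB cards", 384), ("16 GB card", 256)):
        print(f"{label}: {memory_bandwidth_gb_s(14, width):.0f} GB/s")
```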
Nvidia has never developed a Quadro-specific GPU in the past. Instead, Quadro cards use the same GPUs as GeForce cards, but with modified drivers for professional workloads, along with other minor differences. I do not expect Turing to change this pattern, which means that the GPUs mentioned above will most likely appear in GeForce cards in the near future. Nvidia might disable the Tensor cores on GeForce (to make the Quadro cards more desirable), but the RT cores are almost certain to stay. The question is: which GPUs will go into which cards?
At the high end, we might get a GeForce RTX 1180 (or 2080, or some other number) that looks very similar to the Quadro RTX 6000, with half the VRAM and maybe a few SMs disabled. Another option is that the top model will be a new GeForce RTX 1180 Ti, with cut-down versions for the 1180 and 1170. I think the latter is more likely, for several reasons.
There is a big gap in core counts between the Quadro RTX 5000 and 6000, which implies two different base GPUs (e.g. GT100 and GT104). While it is possible that the smaller GPU goes into a GeForce RTX 1160, making it a major upgrade over the current GTX 1060, this seems unlikely at best. More likely, the Quadro RTX 5000 hardware corresponds to the GeForce RTX 1170, with a lower core count model for the GeForce RTX 1160.
The other reason I think the GPU in the Quadro RTX 6000/8000 will end up in an 1180 Ti card (or something similar in the product stack, priced at around $1,000 to $1,500) is that the die size of the top Turing GPU is really big. Nvidia lists the transistor count at 18.6 billion with a 754mm² die size. This is one of the largest GPUs Nvidia has ever produced, so there's barely room for a larger version of Turing. For reference, the Volta GV100 has 21 billion transistors and measures 815mm², while the Pascal GP102 in the GTX 1080 Ti has 11.8 billion transistors and a comparatively small 471mm².
There are other options, but whatever the names and prices, we'll probably hear more at Nvidia's GeForce gaming celebration next week at Gamescom.