October 7, 2022
Nvidia GPU hardware
This began when I wanted to answer the question, "what is a CUDA core?".
My next question was, "what kind of Nvidia card do I have, and how many
cores does it have?"
GeForce GT1030
My card is an MSI Graphics Card NVIDIA GEFORCE GT 1030 2GHD4 LP OC.
I bought it in 2021 with three requirements.
- It had to support 4K on my 3840x2160 monitor
- It could not have a fan
- The price had to be around $100
This card now sells for $125. General advice was to get a 750 card for the
same money, but I rejected that choice because 750 cards have fans.
If I had bought an EVGA 750 card, there is a good chance the fans would never
have turned on, since I use the card in 2D mode all the time.
What is CUDA?
CUDA stands for "Compute Unified Device Architecture", which is pretty vague.
You need to read the above article to get the big idea.
CUDA programming
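To make this concrete, here is a minimal sketch of a CUDA program: a kernel that adds two vectors, with one GPU thread per element. This assumes the CUDA toolkit (nvcc) and an Nvidia GPU; the name addVectors and the launch sizes are my own choices, not from any particular tutorial.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Kernel: each GPU thread adds one pair of elements.
__global__ void addVectors(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1024;
    size_t bytes = n * sizeof(float);
    float *a, *b, *c;
    // Unified memory: visible to both CPU and GPU.
    cudaMallocManaged(&a, bytes);
    cudaMallocManaged(&b, bytes);
    cudaMallocManaged(&c, bytes);
    for (int i = 0; i < n; i++) { a[i] = i; b[i] = 2 * i; }

    // 4 blocks of 256 threads = 1024 threads, one per element.
    addVectors<<<4, 256>>>(a, b, c, n);
    cudaDeviceSynchronize();

    printf("c[10] = %f\n", c[10]);  // 10 + 20 = 30
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}
```

Compile with "nvcc add.cu -o add". The CUDA cores described below are what actually execute those 1024 threads, a warp of 32 at a time.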
GPU Assembly
The bottom line here is to forget it. Vendors generally do not even publish the instruction set,
or in any event they don't expect you to work at that level.
The Hardware
The silicon is the Nvidia GP108 chip. This was made by Samsung for Nvidia.
- released May, 2017
- Pascal architecture
- 14 nm process
- 1,800 million transistors
- 74 mm^2 die size
- 3 SM units (streaming multiprocessors)
- 384 shading units (CUDA cores)
- 24 TMU units (texture mapping units)
- 16 ROPs (raster output pipelines, aka raster output units)
- 96 SFU units (special function units)
- 3 TPC units (texture processor cluster)
- 1 GPC (graphics processing cluster)
- each SM has a 48K L1 Cache
- 512K shared cache
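Many of the numbers above can be read back at runtime with the CUDA runtime API. A sketch, again assuming the CUDA toolkit and querying device 0; on the GT 1030 this should report 3 multiprocessors and compute capability 6.1 (Pascal):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);  // query device 0
    printf("name: %s\n", prop.name);
    printf("SM count: %d\n", prop.multiProcessorCount);           // 3 on GP108
    printf("compute capability: %d.%d\n", prop.major, prop.minor); // 6.1 on GP108
    printf("L2 cache: %zu KB\n", (size_t)prop.l2CacheSize / 1024);
    printf("shared mem per SM: %zu KB\n",
           prop.sharedMemPerMultiprocessor / 1024);
    return 0;
}
```

Note that the runtime does not report CUDA cores directly; you multiply the SM count by the cores-per-SM of the architecture (3 x 128 = 384 here).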
The following white paper talks about the Pascal architecture in the "Tesla P100",
which has 56 SM units, but may still be informative for understanding Pascal.
Maxwell came first, then Pascal, and the world moved on to "Turing".
I hear about Fermi and Kepler, and even Ampere; but I am not trying to keep up with all of this.
In March of 2022, Nvidia introduced "Hopper", which displaces "Ampere".
The Hopper H100 has 80,000 million transistors and is geared to speeding up various AI calculations.
My GP108 chip has what is designated as a 384:24:16 core configuration (CUDA cores : TMUs : ROPs).
Each SM has 128 CUDA cores, so 3 SMs give the 384 total. Each CUDA core contains an ALU.
Each SM also has 4 double precision FP ALUs, and a half precision FP ALU which handles a vector of two
half precision floating point operations.
Have any comments? Questions?
Drop me a line!
Adventures in Computing / [email protected]