
why are gpus so powerful?

hat

Enthusiast
Joined
Nov 20, 2006
Messages
21,732 (3.41/day)
Location
Ohio
System Name Starlifter :: Dragonfly
Processor i7 2600k 4.4GHz :: i5 10400
Motherboard ASUS P8P67 Pro :: ASUS Prime H570-Plus
Cooling Cryorig M9 :: Stock
Memory 4x4GB DDR3 2133 :: 2x8GB DDR4 2400
Video Card(s) PNY GTX1070 :: Integrated UHD 630
Storage Crucial MX500 1TB, 2x1TB Seagate RAID 0 :: Mushkin Enhanced 60GB SSD, 3x4TB Seagate HDD RAID5
Display(s) Onn 165hz 1080p :: Acer 1080p
Case Antec SOHO 1030B :: Old White Full Tower
Audio Device(s) Creative X-Fi Titanium Fatal1ty Pro - Bose Companion 2 Series III :: None
Power Supply FSP Hydro GE 550w :: EVGA Supernova 550
Software Windows 10 Pro - Plex Server on Dragonfly
Benchmark Scores >9000
What makes a GPU so godly powerful compared to a CPU?
 

W1zzard

Administrator
Staff member
Joined
May 14, 2004
Messages
27,055 (3.71/day)
Processor Ryzen 7 5700X
Memory 48 GB
Video Card(s) RTX 4080
Storage 2x HDD RAID 1, 3x M.2 NVMe
Display(s) 30" 2560x1600 + 19" 1280x1024
Software Windows 10 64-bit
Because a GPU has a lot of arithmetic units that are all EXTREMELY simple compared to the ones in a CPU.

GPUs are good at doing a large number of identical simple operations (multiply + add) on a lot of elements.

GPUs suck horribly at branching (anything that uses an if).
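
To make the two points above concrete, here is a toy model in plain Python of lanes that share one instruction stream. It is only a sketch under assumptions: `WARP_WIDTH`, `multiply_add`, and `branch_in_lockstep` are made-up names, and no real GPU is being described. The multiply + add runs once for all lanes, while a branch that splits the lanes forces the group to walk through both sides in turn.

```python
# Toy model of lockstep (SIMD-style) execution.
# WARP_WIDTH and both function names are hypothetical illustrations.
WARP_WIDTH = 8


def multiply_add(a, x, b):
    """Apply the same a*x + b operation to every lane."""
    return [ai * xi + bi for ai, xi, bi in zip(a, x, b)]


def branch_in_lockstep(values):
    """Compute `v * 2 if v >= 0 else -v` for lanes sharing one instruction stream.

    Returns (results, passes), where passes counts how many branch bodies
    the whole group had to step through.
    """
    mask = [v >= 0 for v in values]
    results = [None] * len(values)
    passes = 0

    if any(mask):  # 'then' side: lanes whose mask is False sit idle
        passes += 1
        for i, v in enumerate(values):
            if mask[i]:
                results[i] = v * 2

    if not all(mask):  # 'else' side: lanes whose mask is True sit idle
        passes += 1
        for i, v in enumerate(values):
            if not mask[i]:
                results[i] = -v

    return results, passes


if __name__ == "__main__":
    a = [2.0] * WARP_WIDTH
    x = [float(i) for i in range(WARP_WIDTH)]
    b = [1.0] * WARP_WIDTH
    print(multiply_add(a, x, b))

    uniform = [1, 2, 3, 4, 5, 6, 7, 8]
    divergent = [1, -2, 3, -4, 5, -6, 7, -8]
    print(branch_in_lockstep(uniform)[1])
    print(branch_in_lockstep(divergent)[1])
```

When every lane agrees on the condition, only one side of the branch is walked; when the lanes disagree, both sides are walked and part of the hardware idles on each pass.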
 

hat

So what about Intel's Larrabee thing? It's pretty much a mini supercomputer (it has lots of CPUs on it, acting as a GPU).

I assume this would be good at both branching and basic arithmetic?
 

W1zzard

Nobody knows the details, but I find it hard to believe that Intel will implement the full x86 instruction set in a performant way.

wikipedia:

"Larrabee's x86 cores will be much simpler than those on a Core 2 processor, not using out-of-order execution. This will allow them to be much smaller, so more can fit on a single chip. Other differences include the addition of a new set of extended SIMD instructions similar to SSE but more focused on graphics applications, and 4-way simultaneous multithreading for each core."
 
Joined
Aug 30, 2006
Messages
7,198 (1.12/day)
System Name ICE-QUAD // ICE-CRUNCH
Processor Q6600 // 2x Xeon 5472
Memory 2GB DDR // 8GB FB-DIMM
Video Card(s) HD3850-AGP // FireGL 3400
Display(s) 2 x Samsung 204Ts = 3200x1200
Audio Device(s) Audigy 2
Software Windows Server 2003 R2 as a Workstation now migrated to W10 with regrets.
My understanding is that it IS a full x86 instruction set, i.e. these x86 instructions:
http://home.comcast.net/~fbui/intel.html#arch
http://www.cs.cmu.edu/~410/doc/intel-isr.pdf
http://webster.cs.ucr.edu/AoA/DOS/pdf/ch06.pdf

The open question is how many of the x87 instructions (FP math) survived. Probably not many, although according to the P54C datasheet, all x87 instructions are included.

However, it is WITHOUT MMX, MMX+, 3DNOW, SSE, SSE2, SSE3, SSE4.

ref: P54C was prior to P55C, which introduced MMX. P54C was basically the first Pentiums: ftp://download.intel.com/design/pentium/datashts/24199710.pdf

In place of MMX etc. is a new set of SIMD instructions which, as w1zz alluded to, are mostly MUL + ADD (and other instructions) that operate on "vector data", i.e. not 32 bits, but 128 bits or more at the same time. The chosen instructions are designed to fit the purpose of what a GPU does. My guess is that Larrabee will implement some MASSIVE vectors, e.g. 512 bits or 1024 bits. Or it will operate on column matrices (multiple vectors simultaneously), to process a whole bunch of "coloured" pixels simultaneously.
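
As a rough illustration of what those vector widths would mean, the sketch below counts how many 32-bit values fit in one vector register and how many vector instructions it takes to cover a batch of elements. The 32-bit element size and the function names `lanes` and `instructions_needed` are assumptions for the example; the 128, 512, and 1024 bit widths are the ones guessed at in the post, not confirmed Larrabee specs.

```python
def lanes(vector_bits, element_bits=32):
    """Number of elements that fit in one vector register."""
    if vector_bits % element_bits != 0:
        raise ValueError("vector width must be a multiple of the element width")
    return vector_bits // element_bits


def instructions_needed(n_elements, vector_bits, element_bits=32):
    """Vector instructions needed to touch every element once (rounded up)."""
    per_instruction = lanes(vector_bits, element_bits)
    return -(-n_elements // per_instruction)  # ceiling division


if __name__ == "__main__":
    pixels = 1024  # hypothetical batch of 32-bit values
    for width in (32, 128, 512, 1024):
        print(width, lanes(width), instructions_needed(pixels, width))
```

Going from 32 bits to 512 bits per instruction cuts the instruction count for the same batch by a factor of 16, which is the whole appeal of wide vectors for pixel-style workloads.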

HOWEVER, I do think there will be some additional stuff in there: Intel is NOT just building a GPU here... they are building a general purpose "transputer" that is designed to render CUDA etc. irrelevant.
 

wolf

Performance Enthusiast
Joined
May 7, 2007
Messages
7,756 (1.25/day)
System Name MightyX
Processor Ryzen 5800X3D
Motherboard Gigabyte X570 I Aorus Pro WiFi
Cooling Scythe Fuma 2
Memory 32GB DDR4 3600 CL16
Video Card(s) Asus TUF RTX3080 Deshrouded
Storage WD Black SN850X 2TB
Display(s) LG 42C2 4K OLED
Case Coolermaster NR200P
Audio Device(s) LG SN5Y / Focal Clear
Power Supply Corsair SF750 Platinum
Mouse Corsair Dark Core RBG Pro SE
Keyboard Glorious GMMK Compact w/pudding
VR HMD Meta Quest 3
Software case populated with Artic P12's
Benchmark Scores 4k120 OLED Gsync bliss
I would say it's basically because they are massively parallel in nature, without getting in depth whatsoever...
 