why are gpus so powerful?
What makes a GPU so godly powerful in comparison to cpus?
W1zzard
07-06-2008, 10:21 PM
because there are a lot of arithmetic units that are all EXTREMELY simple compared to cpus.
gpus are good at doing a large number of identical simple operations (multiply + add) on a lot of elements
gpus suck horribly at branching (everything that uses if)
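To make the two points above a bit more concrete, here is a rough Python sketch. It is a toy model, not how any particular GPU works: it assumes a group of "lanes" that must all run the same instruction in lockstep, so when lanes disagree on an `if`, the group pays for both sides. The names `multiply_add` and `lockstep_steps` and the cost numbers are made up for illustration.

```python
# Toy model of lockstep (SIMD-style) execution.
# Assumption: all lanes in a group execute the same instruction together.

def multiply_add(a, b, c):
    """The same simple operation (a*b + c) applied to every element."""
    return [x * y + z for x, y, z in zip(a, b, c)]

def lockstep_steps(lane_takes_if, if_cost, else_cost):
    """Steps spent by one group of lanes on an if/else.

    If any lane takes the 'if' side, the whole group walks through it;
    if any lane takes the 'else' side, the whole group walks through that
    too. Lanes that did not take a side just sit idle while it runs.
    """
    steps = 0
    if any(lane_takes_if):
        steps += if_cost
    if not all(lane_takes_if):
        steps += else_cost
    return steps

a = [1.0, 2.0, 3.0, 4.0]
b = [2.0, 2.0, 2.0, 2.0]
c = [0.5, 0.5, 0.5, 0.5]
result = multiply_add(a, b, c)

# Every lane agrees on the branch: only one side is executed.
uniform = lockstep_steps([True, True, True, True], if_cost=10, else_cost=10)
# Lanes disagree: both sides are executed one after the other.
divergent = lockstep_steps([True, False, True, False], if_cost=10, else_cost=10)

print(result, uniform, divergent)
```

In this model the divergent group takes twice as long as the uniform one for the same amount of useful work, which is one way to read "suck horribly at branching".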
So what about Intel's Larrabee thing, it's pretty much a mini supercomputer (it has lots of cpus on it, acting as a gpu).
I assume this would be good at both branching and basic arithmetic?
W1zzard
07-06-2008, 10:25 PM
nobody knows the details, but i find it hard to believe that intel will implement the full x86 instruction set in a performant way
wikipedia:
"Larrabee's x86 cores will be much simpler than those on a Core 2 processor, not using out-of-order execution. This will allow them to be much smaller, so more can fit on a single chip. Other differences include the addition of a new set of extended SIMD instructions similar to SSE but more focused on graphics applications, and 4-way simultaneous multithreading for each core."
lemonadesoda
07-06-2008, 10:34 PM
My understanding is it IS a full x86 instruction set, i.e. these x86 instructions:
http://home.comcast.net/~fbui/intel.html#arch
http://www.cs.cmu.edu/~410/doc/intel-isr.pdf
http://webster.cs.ucr.edu/AoA/DOS/pdf/ch06.pdf
It is an open question how many of the x87 instructions (FP math) survived. Probably not many, even though according to the P54C datasheet the P54C itself includes all x87 instructions.
However, it is WITHOUT MMX, MMX+, 3DNOW, SSE, SSE2, SSE3, SSE4
ref: P54C was prior to P55C, which introduced MMX. P54C was basically the early Pentiums: ftp://download.intel.com/design/pentium/datashts/24199710.pdf
In place of MMX, etc., is a new set of SIMD instructions which, as w1zz alluded to, are mostly MUL + ADD (and other instructions) that operate on "vector data", i.e. not 32 bits, but 128 bits or more at the same time. The chosen instructions are designed to fit the purpose of what a GPU does. My guess is that Larrabee will implement some MASSIVE vectors, e.g. 512 bits or 1024 bits. Or it will operate on column matrices (multiple vectors simultaneously), to process a whole bunch of "coloured" pixels simultaneously.
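For a feel of what the vector width buys you, here is a small back-of-the-envelope sketch in Python. It assumes 32-bit elements and one vector instruction per full vector. The 512-bit and 1024-bit widths are only the guesses from the post above, not confirmed Larrabee specs, and the function names are made up for illustration.

```python
def lanes(vector_bits, element_bits=32):
    """How many elements of element_bits fit into one vector register."""
    return vector_bits // element_bits

def instructions_needed(n_elements, vector_bits, element_bits=32):
    """Vector instructions needed to touch n_elements once (rounded up)."""
    per_instruction = lanes(vector_bits, element_bits)
    return -(-n_elements // per_instruction)  # ceiling division

# 32 = plain scalar, 128 = SSE-sized, 512 and 1024 = guessed widths
for bits in (32, 128, 512, 1024):
    print(bits, lanes(bits), instructions_needed(1024, bits))
```

So under these assumptions a 512-bit vector would process 16 values per instruction, against 4 for a 128-bit one and 1 for plain 32-bit scalar code.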
HOWEVER, I do think there will be some additional stuff in there: Intel is NOT just building a GPU here... they are building a general purpose "transputer" that is designed to render CUDA etc. irrelevant.
i would say it's basically because they are massively parallel in nature, without getting in depth whatsoever...