AMD FireStream 9250 Breaks the 1 Teraflop Barrier
malware
06-16-2008, 07:08 AM
At the International Supercomputing Conference, AMD today introduced its next-generation stream processor, the AMD FireStream 9250, specifically designed to accelerate critical algorithms in high-performance computing (HPC), mainstream and consumer applications. Leveraging the GPU design expertise of AMD’s Graphics Product Group, AMD FireStream 9250 breaks the one teraflop barrier for single precision performance. It occupies a single PCI slot for unmatched density and, with power consumption of less than 150 watts, delivers an unprecedented rate of performance per watt, with up to eight gigaflops per watt.
Customers can leverage AMD’s latest FireStream offering to run critical workloads such as financial analysis or seismic processing dramatically faster than with CPU alone, helping them to address more complex problems and achieve faster results. For example, developers are reporting up to a 55x performance increase on financial analysis codes as compared to processing on the CPU alone, which supports their efforts to make better and faster decisions. Additionally, the use of flexible GPU technology rather than custom accelerators assists those creating application-specific systems to enhance and maintain their solutions easily.
The AMD FireStream 9250 stream processor includes a second-generation double-precision floating point hardware implementation delivering more than 200 gigaflops, building on the capabilities of the earlier AMD FireStream 9170, the industry’s first GP-GPU with double-precision floating point support. The AMD FireStream 9250’s compact size makes it ideal for small 1U servers as well as most desktop systems, workstations, and larger servers and it features 1GB of GDDR3 memory, enabling developers to handle large, complex problems.
Driving broad consumer adoption with open systems
AMD enables development of the FireStream family of processors with its AMD Stream SDK, designed to help developers create accelerated applications for AMD FireStream, ATI FireGL and ATI Radeon GPUs. AMD takes an open-systems approach to its stream computing development environment to ensure that developers can access and build on the tools at any level. AMD offers published interfaces for its high-level language API, intermediate language, and instruction set architecture; and the AMD Stream SDK’s Brook+ front-end is available as open source code.
In keeping with its open systems philosophy, AMD has also joined the Khronos Compute Working Group. This working group’s goals include developing industry standards for data parallel programming and working with proposed specifications like OpenCL. The OpenCL specification can help provide developers with an easy path to development across multiple platforms.
“An open industry standard programming specification will help drive broad-based support for stream computing technology in mainstream applications,” said Rick Bergman, senior vice president and general manager, Graphics Product Group, AMD. “We believe that OpenCL is a step in the right direction and we fully support this effort. AMD intends to ensure that the AMD Stream SDK rapidly evolves to comply with open industry standards as they emerge.”
Accelerating industry adoption
The growth of the stream computing market has accelerated over the past few years with Fortune 1000 companies, leading software developers and academic institutions utilizing stream technology to achieve tremendous performance gains across a variety of applications.
“Stream computing is increasingly important for mainstream and consumer applications and is no longer limited to just the academic or engineering industries. Today we are truly seeing a fundamental shift in emerging system architectures,” said Jon Peddie, president, Jon Peddie Research. “As the industry’s only provider of both high-performance discrete GPUs and x86-compatible CPUs, AMD is uniquely well-suited to developing these architectures.”
AMD customers, including ACCIT, Centre de Physique de Particules de Marseille, Neurala and Telanetix are using the AMD Stream SDK and current AMD FireStream, ATI FireGL or ATI Radeon boards to achieve dramatic performance gains on critical algorithms in HPC, workstation and consumer applications. Currently, Neurala reports that it is achieving 10-200x speedups over the CPU alone on biologically inspired neural models, applicable to finance, image processing and other applications.
AMD is also working closely with world class application and solution providers to ensure customers can achieve optimum performance results. Stream computing application and solution providers include CAPS entreprise, Mercury Computer Systems, RapidMind, RogueWave and VizExperts. Mercury Computer Systems provides high-performance computing systems and software designed for complex image, sensor, and signal processing applications. Its algorithm team reports that it has achieved 174 GFLOPS performance for large 1D complex single-precision floating point FFTs on the AMD FireStream 9250.
Pricing and availability
AMD plans to deliver the FireStream 9250 and the supporting SDK in Q3 2008 at an MSRP of $999 USD. AMD FireStream 9170, the industry’s first double-precision floating point stream processor, is currently available for purchase and is competitively priced at $1,999 USD. For more information about AMD FireStream 9250 or AMD FireStream 9170 or AMD’s complete line of stream computing solutions, please visit http://www.amd.com/stream.
Source: AMD (http://www.amd.com/us-en/Corporate/VirtualPressRoom/0,,51_104_543~126593,00.html)
btarunr
06-16-2008, 08:00 AM
Here's the edge that AMD has over NVidia: an x86 license. If only the potential of stream processors were tapped in AMD Fusion, we'd have an incredibly powerful CPU. All they need is a translation layer between x86 and FireStream; x86 commands would then be handled by stream processors. Phenom X320 anyone?
From_Nowhere
06-16-2008, 08:28 AM
1 Teraflop...Neat... I wonder if that makes it "Vista Capable." (Just kidding)
That's also a low price compared to its predecessor
Oh and:
A Phenom X320 you say? What about a Phenom X320 Black Edition?
Exceededgoku
06-16-2008, 09:05 AM
Oh and:
A Phenom X320 you say? What about a Phenom X320 Black Edition?
What!? They have a special version for black people now?! Why can't I buy a non black edition???
Haha :laugh::laugh::laugh:
These are good times for AMD, I think we are about to see a massive revival stunt :).
pentastar111
06-16-2008, 11:35 AM
What!? They have a special version for black people now?! Why can't I buy a non black edition???
Haha :laugh::laugh::laugh:
These are good times for AMD, I think we are about to see a massive revival stunt :). WHAT??:eek:
lemonadesoda
06-16-2008, 11:51 AM
Exciting news... but I'm not impressed. (Now I'm disappointed). The PS3 runs at 220 gigaflops for $400. AMD want $999 for the same power, but without any other "PS3" features.
A regular Q6600 Quad manages 30 Gflops: http://img.tomshardware.com/us/2007/07/16/cpu_charts_2007/c_sandra_cpu_mflops.png (7.5 Gflops per core).
That means the Firestream is A LOT FASTER, about 5x faster. NOT 55x. Perhaps it's "55x faster" than a shiddy AMD single core?
Ripper3
06-16-2008, 12:57 PM
It wastes less power than the PS3, at 150W, can be used with pretty much any PC with a PCI slot, and probably runs cooler, and might be more stable than the PS3 (don't forget you pay for a lot when it comes to these sorts of things).
Oh, and not forgetting that, even if it is a quick setup, installing a GP-GPU into a PCI slot and installing software for it is probably quicker than setting up a PS3 for these sorts of operations, if you already have the rest of the computer set up (which most companies might have, as they might just buy this and slot it in, replacing the 9170).
That 55x increase in performance over a CPU isn't specific, no, but don't forget that translating GFLOPS into real-world performance isn't a direct comparison. For one thing, that CPU could have been loaded with other processes, the application could be optimised for GPUs, etc.
Beertintedgoggles
06-16-2008, 12:57 PM
Exciting news... but I'm not impressed. (Now I'm disappointed). The PS3 runs at 220 gigaflops for $400. AMD want $999 for the same power, but without any other "PS3" features.
A regular Q6600 Quad manages 30 Gflops: http://img.tomshardware.com/us/2007/07/16/cpu_charts_2007/c_sandra_cpu_mflops.png (7.5 Gflops per core).
That means the Firestream is A LOT FASTER, about 5x faster. NOT 55x. Perhaps it's "55x faster" than a shiddy AMD single core?
Lets see, the article states 1 Teraflop per second (that's 1024 Gflops). So in your logic that really makes it worth ~ $1900 (4.65 x $400 for the 220 gigaflop/s PS3). In theoretical number crunching those numbers also put it at 1024/30 = 34 times more powerful than the quad core Intel. The 55x number they stated was only on financial analysis codes although you're right that they don't state which CPU they are using to compare numbers. I just hope AMD finds a way to integrate this into mainstream computing that will see some real world results for us 'average' users.
Exceededgoku
06-16-2008, 01:02 PM
WHAT??:eek:
Lol I think you have to watch Uk tv to understand it, http://www.youtube.com/watch?v=6jxjuoG9uYQ
Also as far as I know the Cell can only perform simple computations quickly, is that true? I really don't know anything about this sort of tech. I mean are they able to do the same complexity of data?
turtile
06-16-2008, 01:05 PM
Exciting news... but I'm not impressed. (Now I'm disappointed). The PS3 runs at 220 gigaflops for $400. AMD want $999 for the same power, but without any other "PS3" features.
A regular Q6600 Quad manages 30 Gflops: http://img.tomshardware.com/us/2007/07/16/cpu_charts_2007/c_sandra_cpu_mflops.png (7.5 Gflops per core).
That means the Firestream is A LOT FASTER, about 5x faster. NOT 55x. Perhaps it's "55x faster" than a shiddy AMD single core?
The PS3 uses a GPU. According to your article, it is over 133x faster than a single Q6600 core and over 33x faster than all four.
You can't compare it to the PS3. You're not only buying the hardware (which is optimized and much more reliable) but software. The PS3 is sold by the millions while this card will sell much less. If the PS3 sold the same amount, it would cost a lot more.
Ripper3
06-16-2008, 01:14 PM
Oh wow, I haven't seen balls of steel in so freaking long! That is a classic one.
Beertintedgoggles, that is very true, didn't even notice that, heheh. Good catch there, and it really does make the GP-GPU much more valuable. Huge increase in performance-per-watt and performance-per-$ to the PS3.
panchoman
06-16-2008, 02:02 PM
Can't wait to see this combined with the FireStream GPUs, which are also supposed to be hitting over 1 teraflop; it would be a really powerful system!
KieranD
06-16-2008, 02:51 PM
None of this matters; I will only get excited when I see some real-world performance. On paper the Phenoms should have wasted the Intel competition, but it didn't really work out that way, did it?
Remember the HD 2000 series: it was great on paper, way better than some of the Nvidia stuff, yet it failed in real-world terms.
This could be good, especially since it can run in any board with a PCI slot.
PS3? Oh please, that is a console and this is a piece of PC hardware; go compare it to a Wii or an Xbox 360. People say the CELL processor is powerful, that it has x amount of this and x amount of that, but in reality it is stuck with the GPU and other hardware because you can't simply upgrade a PS3. On its own the CELL is useless because it isn't programmed for any software other than PS3 games and some Linux OS.
batmang
06-16-2008, 03:20 PM
Go Amd Go.
PVTCaboose1337
06-16-2008, 03:28 PM
Congrats on the best workstation gfx card AMD!
btarunr
06-16-2008, 03:48 PM
Exciting news... but I'm not impressed. (Now I'm disappointed). The PS3 runs at 220 gigaflops for $400. AMD want $999 for the same power, but without any other "PS3" features.
How many GFLOPs does Intel give you for ~$1200 that you pay for a QX9770?
FireStream is more of something aimed to compete with NVidia Tesla which have the same 'inviting' prices.
Congrats on the best workstation gfx card AMD!
AMD FireStream is not a graphics card, for the know.
Solaris17
06-16-2008, 06:44 PM
so tell me exactly what this is?
like in relation?
i know its not a grafx card but is it like a math co-processor?
or is it like a physx card?
or is it a physx card that acts as a co-processor?
lemonadesoda
06-16-2008, 07:07 PM
Lets see, the article states 1 Teraflop per second (that's 1024 Gflops). So in your logic that really makes it worth ~ $1900 (4.65 x $400 for the 220 gigaflop/s PS3). In theoretical number crunching those numbers also put it at 1024/30 = 34 times more powerful than the quad core Intel. The 55x number they stated was only on financial analysis codes although you're right that they don't state which CPU they are using to compare numbers. I just hope AMD finds a way to integrate this into mainstream computing that will see some real world results for us 'average' users.
Let's see that again. The article states 200 gigaflops for DP. All other benchmarks I gave were DP. You cannot do "financial math" with SP. Or rather... you can but you shouldn't. And certainly don't mix SP with DP benchmarks. That's worse than apples and oranges. It's like comparing the number of bananas with the number of bunches of bananas.
And 1 teraflop = 1000 gigaflops. WTF did 1024 come from? A math bug? Working in SP? :roll: :pimp:
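For anyone who wants to redo the sums in this argument, here is a minimal sketch in Python. It uses only the figures quoted in this thread (none are verified here), keeps SP and DP apart, and applies the SI convention that 1 TFLOPS = 1000 GFLOPS. The function names are made up for illustration.

```python
# Figures are the ones quoted in this thread; they are not independently verified.
GFLOPS_PER_TFLOPS = 1000  # SI prefixes: tera = 10**12, giga = 10**9


def tflops_to_gflops(tflops):
    """Convert TFLOPS to GFLOPS using decimal (SI) prefixes."""
    return tflops * GFLOPS_PER_TFLOPS


def dollars_per_gflops(price_usd, gflops):
    """Price paid for each GFLOPS of quoted throughput."""
    return price_usd / gflops


def speed_ratio(gflops_a, gflops_b):
    """How many times faster device A is than device B on paper."""
    return gflops_a / gflops_b


FIRESTREAM_PRICE = 999
FIRESTREAM_SP = tflops_to_gflops(1)  # single precision, from the press release
FIRESTREAM_DP = 200                  # double precision, from the press release
PS3_PRICE, PS3_GFLOPS = 400, 220     # as quoted by lemonadesoda
Q6600_GFLOPS = 30                    # as quoted by lemonadesoda

print("FireStream DP $/GFLOPS:", dollars_per_gflops(FIRESTREAM_PRICE, FIRESTREAM_DP))
print("PS3 $/GFLOPS:", round(dollars_per_gflops(PS3_PRICE, PS3_GFLOPS), 2))
print("FireStream DP vs Q6600:", round(speed_ratio(FIRESTREAM_DP, Q6600_GFLOPS), 1))
print("FireStream SP vs Q6600:", round(speed_ratio(FIRESTREAM_SP, Q6600_GFLOPS), 1))
```

The gap between the last two lines is the whole disagreement: the answer depends on whether the SP or the DP figure is used.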
How many GFLOPs does Intel give you for ~$1200 that you pay for a QX9770?
NOBODY creates number crunchers with overpriced QX9770. You use a DP/MP xeon system. You can quite easily stick 2 quad xeons in a cheap workstation mb and have a very flexible system.
http://www.cs.berkeley.edu/%7Esamw/research/papers/ipdps08.pdf
Viscarious
06-16-2008, 07:15 PM
And 1 teraflop = 1000 gigaflops. WTF did 1024 come from? A math bug? Working in SP? :roll: :pimp:
You cant be serious. Where do you get your information?
btarunr
06-16-2008, 07:35 PM
You cant be serious. Where do you get your information?
FLOPS (floating-point operations per second) is not a data-size figure, so it doesn't scale in powers of two (8, 16, 32, 64, 128.....1024....8192). 1000 GFLOPS = 1 TFLOPS, just as 1000 gigawatts make a terawatt.
lemonadesoda
06-16-2008, 07:42 PM
You cant be serious. Where do you get your information?
Please dont take this personally, but ROFL.
http://en.wikipedia.org/wiki/SI_prefix
btarunr
06-16-2008, 07:50 PM
so tell me exactly what this is?
like in relation?
i know its not a grafx card but is it like a math co-processor?
or is it like a physx card?
or is it a physx card that acts as a co-processor?
Your machine relies entirely on CPU for running apps, with the visual part taken care of by the GPU(s). A HPC device such as NVidia Tesla or ATI FireStream can be used as massive boosts to raw computational power (like you're doing a big research on a small budget), you don't need to hire a supercomputing firm that eats into your research budget big time. All you need is to buy either a...
1. Nifty little HPC Card (The size of a 8800 GTX) that you can install right into your workstation.
2. A HPC system that has several cards running in tandem that you can connect to your lab's network.
3. A whole stack of 1U or 2U sized rack-mount systems with several GPU's each so one good rack packs the power of a >100-PC cluster.
...based on your requirement.
Then, you obtain SDKs from either ATI or NVIDIA. NVIDIA gives you the CUDA libraries, which you can use with your current IDEs to develop apps that run on the 'things' listed above. I'm sure AMD is working on one too.
Coding is not rocket-science. You code your apps (or buy/obtain licence of apps) that exploit these HPC setups.
Voila! You don't waste tens of thousands of dollars of research budget on hiring supercomputers. You just buy these things once and your lab keeps them forever. Remember, hiring supercomputers is extremely expensive; one session alone costs thousands of dollars. Unless you really need thousands of TFLOPS, you should stay away from those things.
Therefore, HPC setups such as FireStream or Tesla can be extremely useful in Universities that don't own a supercomp.
Solaris17
06-16-2008, 07:56 PM
cool thanks :)
lemonadesoda
06-16-2008, 08:05 PM
^^ good answer.
The Ageia Physx is a similar card. It is basically *just* a math coprocessor too. Thing was... to try to make it a mainstream product, Ageia developed their SDK for games physics. I looked into the Physx for doing JUST math, for financial math in fact. Unfortunately, the SDK and math libraries were designed for a different purpose, and it wasnt that accessible for other applications / financial math. Possible in theory, but would have required writing one's own SDK components.
Other products in this area are:
Clearspeed http://www.clearspeed.com/
Spursengine SE1000 (for video, not general math) http://en.wikipedia.org/wiki/SpursEngine
Cell Broadband Engine http://www.mc.com/microsites/cell/
btarunr
06-16-2008, 08:08 PM
I sooo wanted the CELL to come to the desktop. If only Windows supported PPC, it would have been possible. Afterall, the CELL is based on the PowerPC machine architecture.
eidairaman1
06-16-2008, 08:12 PM
Firestream seems to be a Highly Programmable Processor- what im saying is it can be programmed for just about anything, Since the CPU isnt programmed, it just sits there.
Solaris17
06-16-2008, 08:14 PM
I sooo wanted the CELL to come to the desktop. If only Windows supported PPC, it would have been possible. Afterall, the CELL is based on the PowerPC machine architecture.
So did I. I wish they actually sold it to people along with the motherboards... because SUSE, Fedora and Ubuntu come with Cell development tools for programs etc. I might just pick up a PS3, mod it to have a huge HDD and use it as a desktop running Linux.
btarunr
06-16-2008, 08:24 PM
Yes. After all, distros such as YDL are based on PPC-supportive kernels. Close to every kind of OS supports (or did support in the past) PPC. IIRC, early versions of Windows NT did support PPC; that support was dropped before Windows 2000.
Exceededgoku
06-16-2008, 09:01 PM
Cell isn't X86 (is it???) and isn't an in order CPU so performance would suck in day to day tasks like games and windows :S......
btarunr
06-16-2008, 09:05 PM
Cell isn't X86 (is it???) and isn't an in order CPU so performance would suck in day to day tasks like games and windows :S......
No, CELL is not an x86 CPU. It's an in-order CPU based on the PowerPC architecture. Linux supports PPC. (A distro supporting PPC should have the PPC-supportive version of the kernel).
yogurt_21
06-16-2008, 09:15 PM
RapidMind has reported a 55x speedup over CPU alone on binomial options pricing calculators. The comparison is versus Quantlib running on a single core of a Dual-Core AMD Opteron™ 2352 processor on Tyan S2915 w/ Win XP 32 (Palomar Workstation from Colfax)
Neurala comparison is against dual AMD Opteron 248 processor (using only a single processor for comparison) w/ 2GB SDRAM DDR 400 ECC dual channel and SUSE Linux 10 (custom kernel)
Mercury benchmark system details: Intel Core2 6820 @ 2.13 GHz w/ 3GB of RAM, FireStream 9250 stream processor
You know, I wish the manufacturers would stop number stacking and provide actual results. lol It's nice to see an evolution towards the future at half the price of the previous, but choosing peak numbers to showcase it isn't going to impress the people who run the research projects. They're not your average joe consumer.
btarunr
06-16-2008, 09:19 PM
you know I wish the manufacturers would stop number stacking and provide actual resuslts. lol it's nice to see an evolution towards the future at half the price of the previous, but choosing peak numbers to showcase it isn't going to impress the people who run the research projects. they're not your average joe consumer.
The biggest evaluation of all this is the Folding@Home project. Since ages, F@H supported ATI's GPU's for GPU computing, and the numbers did translate to results.
WarEagleAU
06-16-2008, 09:31 PM
so with the 9170 costing more still even after this announcement, does that mean this new 9250 isnt as powerful? Congrats AMD on the breakthrough, but that is a tad steep price for a co processor. I can see Pharmaceutical and DNA/RNA Synthesis companies using these.
lemonadesoda
06-16-2008, 10:20 PM
For anyone following this thread, read http://www.rapidmind.net/pdfs/FinancialDataSheet.pdf
Basically, the 55x speedup quoted by AMD is:
1>> A single core Opteron running an opensource math library, COMPARED TO
2>> The FireStream running optimized math library SPECIFICALLY designed for financial math by RapidMind.
http://img.techpowerup.org/080616/Capture063.png
REAL COMPARISON
1./ Single core CPU, running inefficient C++ math library
2./ Replace math library with RapidMind, = 2x speedup
3./ Replace "single core" Opteron with "single core" Intel Core 2, = 2x speedup
4./ Replace single core with quad core = 4x speedup
http://img.techpowerup.org/080616/Capture062.png
So, actually, the REAL COMPARISON should be 55/16 = 3.4x speedup. At a price of $999.
OK, SO LET'S USE A DUAL XEON SYSTEM ALTERNATIVE
5./ Upgrade to dual socket mainboard, one extra xeon, total $500, = 2 x speedup
That would give a net speedup of 1.7x to the FireStream but at a higher cost ($499 more), plus development time associated with using the SDK for FireStream and then having code that could only run on the FireStream. (THERE ARE GOOD SECURITY REASONS TO DO THIS... ESPECIALLY FOR PROPRIETARY FINANCE SOFTWARE).
IMO, 1.7x the speed of a dual xeon workstation is not all that impressive.
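The chain of factors above can be multiplied out in a few lines of Python. This is only a sketch of the arithmetic in this post: the 2x/2x/4x/2x factors are the poster's own estimates, and `remaining_speedup` is a name made up for illustration.

```python
from math import prod


def remaining_speedup(headline_speedup, cpu_side_factors):
    """Divide the headline speedup by every gain the CPU side could also claim."""
    return headline_speedup / prod(cpu_side_factors)


# Estimated CPU-side gains from steps 2 to 4 (the poster's figures, not measurements)
cpu_factors = {
    "RapidMind library instead of the open-source one": 2,
    "Core 2 instead of Opteron": 2,
    "quad core instead of single core": 4,
}

vs_quad_core = remaining_speedup(55, cpu_factors.values())
vs_dual_xeon = remaining_speedup(55, [*cpu_factors.values(), 2])  # step 5: second socket

print(f"FireStream vs tuned quad core: {vs_quad_core:.2f}x")  # 55 / 16
print(f"FireStream vs dual xeon:       {vs_dual_xeon:.2f}x")  # 55 / 32
```

Multiplying the factors assumes each gain is independent of the others and scales perfectly, which is generous to the CPU side.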
******
From looking closer at the hardware of FireStream, it seems to be essentially a GPU card with the "Video" bits removed. You could probably get a regular gaming card to do exactly the same. But I'm sure AMD will "lock" features within the BIOS, just like they do with the FireGL GPUs.
Congrats AMD on the breakthrough, but that is a tad steep price for a co processor. I can see Pharmaceutical and DNA/RNA Synthesis companies using these.
I agree, too expensive
But it's not much of a breakthrough. It's a GPU in wolf's clothing, with an SDK not dissimilar in concept to CUDA.
Smoke and mirrors by AMD.
imperialreign
06-16-2008, 11:16 PM
Sure, it looks a little blown out of proportion as it is; but look at the target audience for this capability as well - they've been listening to the blown out of proportion claims of Intel and nVidia for how long now? AMD is coming along with something that does work better, they're just exaggerating it a bit - still, for its market, it's highly competitive, and I think it's great to see AMD being able to bring the goods in at least one field right now.
I'm curious, though, has anyone else noticed that AMD seems to have drastically changed their marketing strategies over the last 3-5 months? It seems to me that they've become a lot more aggressive in their marketing and claims, compared to how they used to be.
They're finally adopting the ruthless attitude of all the other financially successful and stable companies.
lemonadesoda
06-17-2008, 01:09 AM
a lot more aggressive in their marketing and claims
Can you translate that to English please? Choose one of the following:
1./ Bullshit
2./ Lies
3./ Misrepresentation
They're finally adopting the ruthless attitude of all the other financially successful and stable companies
And that one too, please:
A./ No integrity
B./ No ethics
C./ Short term profit before brand reputation and customer loyalty, ala, fool the customer with 1, 2, 3
imperialreign
06-17-2008, 03:59 AM
Can you translate that to English please? Choose one of the following:
1./ Bullshit
2./ Lies
3./ Misrepresentation
And that one too, please:
A./ No integrity
B./ No ethics
C./ Short term profit before brand reputation and customer loyalty, ala, fool the customer with 1, 2, 3
IDK about all that - ATI still at least provide some kind of basis for their claims, some representation of what they've tested to help support their propaganda - more than I can say for Intel, nVidia, MS, Creative, or any other market leader in the industry.
Sure, recently they might be 'twisting' the truth and stretching it as far as they can, but we're still given some kind of base to look at as well; unlike other companies who spit out propaganda that looks like they waved their voodoo stick over a spreadsheet while swinging chickens.
eidairaman1
06-17-2008, 05:26 AM
Don't forget Intel and Nvidia were doing that shit for years until the other companies started to step on their feet.
Can you translate that to English please? Choose one of the following:
1./ Bullshit
2./ Lies
3./ Misrepresentation
And that one too, please:
A./ No integrity
B./ No ethics
C./ Short term profit before brand reputation and customer loyalty, ala, fool the customer with 1, 2, 3
tkpenalty
06-17-2008, 07:07 AM
AMD might as well retool their GPUs for CPU usage - GPUs have massive FP calc speeds and they have an x86 license anyway.......
If AMD used their GPUs for CPUs... Intel would be screwed.
From_Nowhere
06-17-2008, 07:46 AM
^ That would be interesting... my question is, "Can it be done?"
btarunr
06-17-2008, 08:40 AM
^ That would be interesting... my question is, "Can it be done?"
Yes. AMD Fusion is a CPU with a GPU embedded. GPU means stream processors.
Even if a GPU of the class of an HD2600 XT (120 SPs) was embedded, theoretically it means an added 50 GFLOPs at least.
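A rough way to sanity-check a figure like that is the usual peak-throughput estimate: stream processors x clock x floating-point operations per cycle. The sketch below assumes each SP retires one multiply-add (2 FLOPs) per cycle, and the 800 MHz clock is a purely illustrative value; neither number comes from this thread.

```python
def peak_gflops(stream_processors, clock_mhz, flops_per_sp_per_cycle=2):
    """Theoretical peak, assuming every SP is busy on every cycle.

    flops_per_sp_per_cycle=2 assumes one multiply-add per SP per cycle
    (an assumption for this sketch, not a figure from the thread).
    """
    return stream_processors * clock_mhz * flops_per_sp_per_cycle / 1000


def min_clock_mhz(target_gflops, stream_processors, flops_per_sp_per_cycle=2):
    """Clock needed to reach a target peak under the same assumption."""
    return target_gflops * 1000 / (stream_processors * flops_per_sp_per_cycle)


# 120 SPs as mentioned above; 800 MHz is a hypothetical clock for illustration
print(peak_gflops(120, 800))
print(round(min_clock_mhz(50, 120), 1))  # clock at which 120 SPs reach 50 GFLOPS
```

This is a paper number only; it says nothing about how much of that peak a real application would see.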
lemonadesoda
06-17-2008, 08:59 AM
Interesting discussion http://www.simbiosys.ca/blog/2008/05/03/the-fast-and-the-furious-compare-cellbe-gpu-and-fpga/
They (quietly) point out that GPGPUs are fantastic for massively parallel calculations. But for general purpose mixed math they are awful. Why? Because the processing power and benchmarks we keep reading about are based on calculations that are scalable via parallelization, so that, e.g. ALL 320 stream processors are put to good use.
If you were using the GPGPU to "re-calculate an EXCEL table", then divide performance by 320, since you won't get parallelization there. In such situations a CPU's FPU will PWN the GPGPU.
The GPGPU comes into its own ONLY when using the math library and SDK designed for it... AND when doing things like vector or matrix math, of SIMPLE additions, subtractions and multiplications.
An FPU will PWN a GPGPU at trig math, for example.
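The "divide performance by 320" point can be put into numbers with Amdahl's law, which is a standard textbook model and not something taken from the linked article. The sketch below assumes 320 identical stream processors and a workload where some fraction can be spread across them while the rest runs on a single unit.

```python
def amdahl_speedup(parallel_fraction, n_units):
    """Amdahl's law: speedup over one unit when only part of the work parallelizes."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_units)


def fraction_of_peak(parallel_fraction, n_units):
    """Share of the advertised peak that is actually used."""
    return amdahl_speedup(parallel_fraction, n_units) / n_units


N_UNITS = 320  # stream processor count assumed in the post above

for p in (0.0, 0.5, 0.9, 0.99, 1.0):
    print(f"parallel fraction {p:4.2f}: "
          f"speedup {amdahl_speedup(p, N_UNITS):7.2f}x, "
          f"{fraction_of_peak(p, N_UNITS):6.1%} of peak")
```

With nothing to parallelize the card runs at 1/320 of its peak, and even a job that is 99% parallel stays well short of the headline figure.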
tkpenalty
06-17-2008, 09:25 AM
Interesting discussion http://www.simbiosys.ca/blog/2008/05/03/the-fast-and-the-furious-compare-cellbe-gpu-and-fpga/
They (quietly) point out that the GPGPU are fantastic for massively parallel calculations. But for general purpose mixed math they are aweful. Why? Because the processing power and benchmarks we keep reading about are based on calculations that are scalable via parallelization, so that, e.g. ALL 320 stream processors are put to good use.
If you were using the GPGPU to "re-calculate an EXCEL table", then divide performance by 320, since you wont get parallelization there. In such situations a CPU's FPU will PWN the GPGPU.
The GPGPU comes into its own ONLY when using the math library and SDK designed for it... AND when doing things like vector or matrix math, of SIMPLE additions, subtractions and multiplications.
An FPU will PWN a GPGU at trig math, for example.
FPU is designed for maths anyway...
Lets hope Fusion will give phenom the well needed performance boost.
lemonadesoda
06-17-2008, 09:36 AM
OK, new news. http://www.tgdaily.com/content/view/37970/135/
Clearspeed's new math co-processor delivers 100 gigaflops of DP math (compared to FireStream's 200 gigaflops DP) but at only 12W (compared to FireStream's 150W).
Clearspeed CSX700 is the winner. It also has a better math library (faster) due to the CSX700 being a much more capable FPU than a GPGPU (which is limited to simpler natives of plus, minus, multiply etc.)
Downside, $3000
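Here is a small Python sketch of that comparison, using only the numbers quoted in this post and in the press release. The FireStream's 150 W is the stated upper bound, so its real per-watt figure may be somewhat better; the `Accelerator` class is just a container made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Accelerator:
    name: str
    dp_gflops: float  # quoted double-precision throughput
    watts: float      # quoted power draw
    price_usd: float  # quoted price

    def gflops_per_watt(self):
        return self.dp_gflops / self.watts

    def dollars_per_gflops(self):
        return self.price_usd / self.dp_gflops


# Figures as quoted in the thread, not independently verified
csx700 = Accelerator("Clearspeed CSX700", dp_gflops=100, watts=12, price_usd=3000)
firestream = Accelerator("AMD FireStream 9250", dp_gflops=200, watts=150, price_usd=999)

for card in (csx700, firestream):
    print(f"{card.name}: {card.gflops_per_watt():.2f} GFLOPS/W, "
          f"${card.dollars_per_gflops():.2f} per GFLOPS")
```

On these numbers the Clearspeed part is far ahead per watt and the FireStream is far ahead per dollar, so which one "wins" depends on whether power or purchase price is the constraint.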
btarunr
06-17-2008, 09:48 AM
Interesting discussion http://www.simbiosys.ca/blog/2008/05/03/the-fast-and-the-furious-compare-cellbe-gpu-and-fpga/
They (quietly) point out that the GPGPU are fantastic for massively parallel calculations. But for general purpose mixed math they are aweful. Why? Because the processing power and benchmarks we keep reading about are based on calculations that are scalable via parallelization, so that, e.g. ALL 320 stream processors are put to good use.
If you were using the GPGPU to "re-calculate an EXCEL table", then divide performance by 320, since you wont get parallelization there. In such situations a CPU's FPU will PWN the GPGPU.
The GPGPU comes into its own ONLY when using the math library and SDK designed for it... AND when doing things like vector or matrix math, of SIMPLE additions, subtractions and multiplications.
An FPU will PWN a GPGU at trig math, for example.
That's where the specialised SPs that handle both MADD/MUL come into play. 1 in every 5 SPs in the ATI Stream architecture is such. Of course, a GPU will never be able to perform out-of-order execution the way an x86 CPU does. A GPU requires you to send it instructions and data far more rapidly than you'd send a CPU (where the main memory and CPU staged caches pool data). We can put it this way: just as you have SIMD instruction sets (SSE and its successors), they might come up with an instruction set that lets apps exploit stream processors on a Fusion. Of course, other apps will have to rely on the CPU's FPU.
lemonadesoda
06-17-2008, 12:41 PM
Sure, recently they might be 'twisting' the truth and stretching it as far as they can, but we're still given some kind of base to look at as well; unlike other companies who spit out propaganda that looks like they waved their voodoo stick over a spread sheeting while swinging chickens.
ROFLCOPTERS
http://www.thinkgeek.com/images/products/zoom/roflcopter.jpg
KieranD
06-17-2008, 01:05 PM
Cell would be useless because you can't run Windows or Mac on it, and then you have no compatible motherboard with PCI-E slots for expansion, and even then there are things like memory controllers etc.
I think the Cell would be useless because you'd only be able to run Linux, and what's the point in having a powerful CPU for Linux if all you can run is Doom 3 and Quake 4?
KieranD
06-17-2008, 01:11 PM
Using GPUs as CPUs is stupid; I don't know if one could compute everything quite like a CPU.
Either way, GPUs are a different architecture from CPUs; you'd have to totally redesign the GPU to include cache and memory controllers etc.
I'm not sure why you'd want a math co-processor.
Co-processors are useless if you have multi-threading on a CPU and the software is programmed to use it fully.
I'd like to see physics done on a core of a CPU, or have a full single graphics card for physics but be able to add in a cheaper graphics card to take advantage.
spud107
06-17-2008, 04:30 PM
this is interesting, from amd,
http://ati.amd.com/technology/streamcomputing/faq.html#5
Will the AMD FireStream SDK work on previous generation hardware?
To run the CAL/Brook+ SDK, you need a platform based on the AMD R600 GPU or later. R600 and newer GPUs are found with ATI Radeon™ HD2400, HD2600, HD2900 and HD3800 graphics board.
Which applications are best suited to Stream Computing?
Applications best suited to stream computing possess two fundamental characteristics:
A high degree of arithmetic computation per system memory fetch
Computational independence — arithmetic occurs on each processing unit without needing to be checked or verified by or with arithmetic occurring on any other processing unit.
Examples include:
Engineering — fluid dynamics
Mathematics — linear equations, matrix calculations
Simulations — Monte Carlo, molecular modeling, etc.
Financial — options pricing
Biological — protein structure calculations
Imaging — medical image processing
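To illustrate the two characteristics from the FAQ, here is a minimal Monte Carlo option-pricing sketch in plain Python. It is not AMD's SDK or Brook+ code; the pricing model (a European call under geometric Brownian motion) and all parameter values are assumptions chosen for illustration. Each path takes one small input (a seed), does a fair amount of arithmetic on it, and never looks at any other path, so the work can be split into chunks and run in any order.

```python
import math
import random


def path_payoff(seed, spot=100.0, strike=100.0, rate=0.05, vol=0.2, years=1.0):
    """Payoff of one simulated path; depends only on its own seed (hypothetical parameters)."""
    rng = random.Random(seed)
    z = rng.gauss(0.0, 1.0)
    drift = (rate - 0.5 * vol ** 2) * years
    terminal = spot * math.exp(drift + vol * math.sqrt(years) * z)
    return max(terminal - strike, 0.0)


def simulate(seeds):
    """Evaluate a batch of paths; stands in for one processing unit's share of the work."""
    return [path_payoff(seed) for seed in seeds]


def discounted_mean(payoffs, rate=0.05, years=1.0):
    """Monte Carlo price estimate: discounted average payoff."""
    return math.exp(-rate * years) * sum(payoffs) / len(payoffs)


# One unit doing everything...
all_at_once = simulate(range(1000))

# ...versus four chunks evaluated separately, deliberately in reverse order.
chunks = [simulate(range(start, start + 250)) for start in (750, 500, 250, 0)]
reassembled = [p for chunk in reversed(chunks) for p in chunk]

print(reassembled == all_at_once)  # chunking and ordering do not change any payoff
print(round(discounted_mean(all_at_once), 2))
```

Because no path needs a result from another, the same loop could in principle be handed to hundreds of processing units at once, which is the property the FAQ calls computational independence.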
btarunr
06-17-2008, 04:33 PM
Using GPUs as CPUs is stupid. I don't know if a GPU could compute everything quite like a CPU.
Either way, GPUs are a different architecture from CPUs; you'd have to totally redesign the GPU to include cache, memory controllers, etc.
I'm not sure why you'd want a math coprocessor.
Coprocessors are useless if you have multithreading on a CPU and the software is programmed to use it fully.
I'd like to see physics done on one core of a CPU, or to have a full single graphics card for physics but be able to add in a cheaper graphics card to take advantage of it.
Try to read the complete thread and learn something about it all. As for the CELL BE part: stop regarding the CELL as "that which drives the PS3". CELL was/is touted to have general-purpose applications. Driving a console is just a part of it. What do you think drives the Sony Bravia? CELL finds applications in several other devices such as display panels, etc. It's a PowerPC-based processor. Had Apple not ditched PPC for x86, you'd probably have the PowerMac (now Mac Pro) running a CELL BE.
spud107
06-17-2008, 08:24 PM
wonder if someone will fill something like this up? :D
http://img.techpowerup.org/080617/supercluster.jpg
http://www.picocomputing.com/images/SC3.jpg
From here: http://www.picocomputing.com/ and this can be used with laptops!!
eidairaman1
06-18-2008, 01:09 AM
More cases need to have 10-12 PCI slot brackets and right-angle motherboard connectors.
My next case will probably be capable of dual PSUs.
Shingoshi
10-14-2008, 10:07 PM
For anyone following this thread, read http://www.rapidmind.net/pdfs/FinancialDataSheet.pdf
Basically, the 55x speedup quoted by AMD is:
1>> A single-core Opteron running an open-source math library, COMPARED TO
2>> The FireStream running an optimized math library SPECIFICALLY designed for financial math by RapidMind.
http://img.techpowerup.org/080616/Capture063.png
REAL COMPARISON
1./ Single core CPU, running inefficient C++ math library
2./ Replace math library with RapidMind, = 2x speedup
3./ Replace "single core" Opteron with "single core" Intel Core 2, = 2x speedup
4./ Replace single core with quad core = 4x speedup
http://img.techpowerup.org/080616/Capture062.png
So, actually, the CPU-side improvements multiply to 2 x 2 x 4 = 16x, and the REAL COMPARISON should be 55/16 = about 3.4x speedup. At a price of $999.
OK, SO LET'S USE A DUAL XEON SYSTEM ALTERNATIVE
5./ Upgrade to a dual-socket mainboard plus one extra Xeon, total $500, = 2x speedup
That would leave the FireStream with a net speedup of about 1.7x (55/32), but at a higher cost ($499 more), plus the development time associated with using the SDK for FireStream and then having code that could only run on the FireStream. (THERE ARE GOOD SECURITY REASONS TO DO THIS... ESPECIALLY FOR PROPRIETARY FINANCE SOFTWARE).
IMO, about 1.7x the speed of a dual Xeon workstation is not all that impressive.
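For anyone who wants to check the chain of factors, here is a quick sketch in Python. It assumes the individual speedups simply multiply, which is an idealization (real workloads rarely scale that cleanly), and every figure is just the one quoted in this post, not a measurement. The names are made up for the example.

```python
# Figures as quoted in the post; the multiplicative model is an assumption.
QUOTED_SPEEDUP = 55.0
CPU_SIDE_FACTORS = {
    "optimized math library": 2.0,
    "Core 2 instead of Opteron": 2.0,
    "quad core instead of single core": 4.0,
}
DUAL_SOCKET_FACTOR = 2.0

def net_speedup(quoted, factors):
    """Divide the quoted speedup by the product of the baseline improvements."""
    baseline_gain = 1.0
    for factor in factors:
        baseline_gain *= factor
    return quoted / baseline_gain

cpu_factors = list(CPU_SIDE_FACTORS.values())
vs_quad_core = net_speedup(QUOTED_SPEEDUP, cpu_factors)
vs_dual_xeon = net_speedup(QUOTED_SPEEDUP, cpu_factors + [DUAL_SOCKET_FACTOR])

print(f"vs tuned quad core: {vs_quad_core:.2f}x")
print(f"vs dual Xeon:       {vs_dual_xeon:.2f}x")
```

Under those assumptions the exact ratios are 55/16 and 55/32, so roughly 3.4x and 1.7x.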
******
From looking closer at the hardware of FireStream, it seems to be essentially a GPU card with the "Video" bits removed. You could probably get a regular gaming card to do exactly the same. But I'm sure AMD will "lock" features within the BIOS, just like they do with the FireGL GPUs.
I agree, too expensive
But it's not much of a breakthrough. It's a GPU in wolf's clothing, with an SDK not dissimilar in concept to CUDA.
Smoke and mirrors by AMD.
This is something that never makes any sense to me: that a company would spend more money (by locking out certain existing features) to make less money. Because that's exactly what the net result is. The fewer options you provide to your customers, the fewer customers you'll have buying your product. That's just sheer stupidity, all for the sake of selfishly thinking you're going to get more from less.
Shingoshi
Shingoshi
10-14-2008, 10:23 PM
Cell would be useless because you can't run Windows or Mac OS on it, and you'd have no compatible motherboard with PCIe slots for expansion, and even then there are things like memory controllers, etc.
I think the Cell would be useless because you'd only be able to run Linux, and what's the point of having a powerful CPU for Linux if all you can run is Doom 3 and Quake 4?
The people who use Linux for applications like these have little if any concern for games. You're asking a question like, "Why would I want a Ferrari if I can't take it offroad?" It's just the wrong question, with the wrong assumptions behind it.
Shingoshi
Shingoshi
10-14-2008, 11:44 PM
I currently have a four-socket Opteron server, based on the Tyan S4980. For now I'm using this (different) computer for my personal work. It's an old Mattel Barbie that's been completely rebuilt with new components. I run Linux exclusively, although I also use something called Wine (http://winehq.org) for Windows applications. I'm in the market to build a personal cluster, and these Pico products seem viable. I just wish that they would put something like these in an SSD format as well. The beautiful thing about SSDs is that they can be installed in any existing 3.5" hotswap drive bay. And since there are many options available to put four SSDs in a single 5.25"/cdrom bay, you could easily build a cluster with these.
I currently have one of these: http://www.shopaddonics.com/itemdesc.asp?ic=ADPEXC
But something like this would be even better for what I'm talking about here.
http://www.shopaddonics.com/itemdesc.asp?ic=AE4RCS25NSA&eq=&Tp=
Using a system like this would allow you to have four of the Pico units in each cdrom bay.
http://www.shopaddonics.com/mmSHOPADDONICS/Images/ae4rcs25nsa.jpg
Or this would work too.
http://www.picocomputing.com/images/EC7BP%20Full.jpg
And since the SSD/2.5" drive is much larger than an ExpressCard, even more power could be packed into each one. Building a cluster would take only minutes with tools like this.
http://www.supermicro.com/products/chassis/2U/213/SC213A-R900U.cfm
I thought I'd post this here, just in case anyone else like me finds this site as I did looking for similar information.
Shingoshi