View Full Version : New Benchmarking Application


Palit_Guy
01-31-2008, 02:48 PM
I've had some pretty good luck getting help from these forums already, http://forums.techpowerup.com/showthread.php?t=50997 so I thought I would take a chance and see if anyone can help me with something new.

I've been complaining for years about benchmarking software, just like everyone else. Everyone knows the arguments about 3DMarkXX, SiSoft etc. so I won't rehash them. But there is a problem with using Fraps or any in-game FPS monitor as well. The intentions are good but they just fall short of actually being useful.

I think that showing the max/min/avg framerates doesn't really tell you anything terribly useful about the gaming performance of a video card. Yes, you can get some kind of idea about whether or not it will run a game and a tiny glimpse at how well it MAY run the game.

But for a person interested in comparing the total gaming experience a video card, CPU or whatever will provide, max/min/avg is useless. So let me explain why I think that.

The max frame rate you get in a game is simply the one frame that was rendered the fastest. That is pretty easy to skew by looking down at the ground or at a wall close up. Personally, I don't care how fast a card can render a wall. Even if you don't do that, it still just tells you the fastest frame. That is hardly representative of how the whole game will play.

My problem with min framerate is exactly the same. The most complicated scene is going to render the worst. But if there aren't that many complicated scenes, the min rate loses its importance. The entire game isn't complicated so this only tells you about one particular part, again, not the whole game experience.

I've told this story many times and this is always the part where people say, ya, that's why we look at avg framerates. Well, this doesn't really tell you anything either. To make this simple let's use a smaller number of frames. If you have 10 frames you can consider the render time for each one, average them and see what I'm getting at.

If two frames render at 110 FPS, two frames render at 80 FPS, two frames render at 50 FPS, two frames render at 28 FPS and two frames render at 24 FPS the avg framerate would be 58.4 FPS. Your max and min rates would be 110 and 24 respectively. None of these numbers could be considered bad.

But if you look at the number of frames that were rendered below 30 FPS thereby looking like crap while playing you can see something interesting. 40% of the ten frames are junk.
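To make the arithmetic in that ten-frame example easy to check, here is a minimal sketch. The FPS values are the ones from the example above; the variable names and the 30 FPS cutoff constant are just illustrative choices.

```python
# FPS values for the ten frames in the example above.
frame_fps = [110, 110, 80, 80, 50, 50, 28, 28, 24, 24]

THRESHOLD_FPS = 30  # the "looks like crap" cutoff used in the example

mean_fps = sum(frame_fps) / len(frame_fps)
slow_frames = [fps for fps in frame_fps if fps < THRESHOLD_FPS]
slow_share = len(slow_frames) / len(frame_fps)

print(f"max/min/avg: {max(frame_fps)}/{min(frame_fps)}/{mean_fps:.1f}")
print(f"frames below {THRESHOLD_FPS} FPS: "
      f"{len(slow_frames)} of {len(frame_fps)} ({slow_share:.0%})")
```

The first line reports 110/24/58.4 and the second reports 4 of 10 frames (40%), which is the number max/min/avg never shows you.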

Now obviously I constructed this example to demonstrate my point. The only time, in the real world, looking at max and min framerates makes sense is when both numbers are within the acceptable playing range of 25-50 FPS depending on who you ask.

But IMO that still doesn't give you the most accurate way to gauge the performance of a card or provide a very good way to compare two cards. If you really want to compare the gaming experience two cards provide you need to know something no benchmark or reviewer has ever told us: how many frames were rendered below a chosen threshold.

Think about this. It is quite possible for two cards to produce reasonable max/min/avg numbers yet one card produces more frames below 30 FPS than the other. This is a perceivable difference when playing the game. I would go so far as to say more people can tell when they are dropping frames than whether or not AA is set to 8x or, often, even on at all.

FRAPS records periodically what the framerate is. I remember years ago you could set the interval at which it looked but I don't see that you can do that any more. Some people have told me you can but they have yet to show me where.

Even so, can it check every frame? Or even every other frame? How many frames does it take to get to the center of a Tootsie Roll pop? How many, i.e. what percentage of the total number of frames rendered, do we need to consider before we can say with certainty how a card performs? How many frames is it ok to drop?

So, with that explanation, I'd like to get some feedback from people as to whether or not you get what I'm saying and whether or not you agree with me.

On top of that, and this is why I posted this here, I'd like to know if someone can create a tool or modify an existing one that can take these kinds of measurements. The more frames that get measured the better. I think the best way to say it would be that if you measure 80% of the frames you can be 80% sure of how the card performs.

So the three questions I'm asking are:

1. Does everyone understand what I'm getting at?
2. Does anyone disagree and why?
3. Can anyone create a tool that will measure performance as I've laid out?

philbrown23
01-31-2008, 04:01 PM
did you try the fur rendering bench from this site? it's pretty cool and at least it's not 3dmark lol but it is a small download and it is about 60 seconds of a benchmark. try it.

W1zzard
01-31-2008, 05:07 PM
fraps can log to a text file. from there just extract the data you want. the main problem with fraps is getting consistent results because every time you play there are slight differences. also the time (or your attendance) involved is several orders of magnitude higher than automated benchmarking

Hawk1
01-31-2008, 05:48 PM
While I agree with your arguments, it would definitely be time consuming, as W1zzard says. Not only creating/modifying the program, but running the benchmarks and then deciphering and presenting the data in a meaningful way for a proper review will be time consuming (let alone doing it for multiple card comparisons). It's a nice idea, and I'm no programmer by any stretch, but even if it were simple to program, would it gain acceptance in the community as a benchmark? I guess that's the risk you take. If we were talking about a completely different program that many would find useful and that wasn't there before (e.g. GPU-Z), yes, it would fly and be adopted rather quickly. But to change the way benchmarks are done, you would need major support from the community. Look at 3DMark: just about every site uses at least 3DM06, and usually 05 and prior, to run VGA benchmarks, even though the results are pretty meaningless (other than to gauge which card can attain a world record more easily).

It's a nice idea but I don't know if it would fly. Closest comparison is the way [H] does their graphics tests/comparisons vs. just about every other hardware site. Some will swear by their method, others say it's total BS, and I think, if something like what you're proposing were brought in, it would be the same type of deal where one or two sites would adopt it and arguments would go back and forth among the "fanboys" as to the results.

Anyway, I may be way off base here, but that's my .02

Palit_Guy
01-31-2008, 06:15 PM
The problem with fraps is that it doesn't test enough frames. It only grabs about one per second and even then not at a real regular interval. So if you're hitting 60 FPS and it grabs one every second, you're only looking at 1/60th of the data.


Hawk1, without worrying about the amount of data or the ability to write the code or whether or not the world will accept the notion, do you agree or disagree with what I'm saying about how to measure game performance?

W1zzard, I'd like to know your thoughts on this as well.

If I'm not being clear on it let me know and I can say it a different way.

The bottom line is the most important measurement is how many frames in total are rendered below the desired rate. If you think 30 FPS is fast enough, fine. If you want to compare the game experience that two cards provide you can simply compare the number of frames rendered below 30 FPS. Whichever card has the fewest frames rendered below 30 FPS is the winner.

Hawk1
01-31-2008, 06:56 PM
I agree with the method, and it would be better than what we have now, but I would need to see the implementation of it. Would it be a cut scene each time to compare cards? Because, like you said, actual game play will vary if you're looking at walls or in a particularly heavily involved scene. It could also be used to skew the results if the exact same scenes are not rendered by the different cards in the comparison.

Definitely like the idea of it though, and would hope someone does it and it takes off.

Palit_Guy
01-31-2008, 07:08 PM
That's my point. If it's a good idea I'm sure there is someone somewhere that could write the code to make it work.

If enough people will take a look at this thread and say whether or not they agree with this line of thinking I would bring up the idea of a contest with W1zzard.

DaMulta
01-31-2008, 07:19 PM
So what you want to look at is kind of like this.

A game runs for 60 secs automatically so it's the same thing every time. (also helps the reviewers). Then add up every frame that it counted during that time?

That might work, BUT that would take a lot of time to burn into the consumer's head IMO.

Palit_Guy
01-31-2008, 07:45 PM
I don't think it's reasonable to test a game for 60 seconds first of all. I'm thinking more like 60 minutes. My game time on my new QW:ET account is already over 55 hours and I can tell you that there is no 60 second period on any map that can represent the entire experience of the game.

How the test is done isn't what I'm trying to get at in this thread anyway. Whether or not everyone can wrap their brain around the idea is. I don't think you're getting it so let me try again.

No one is interested in how WELL a card plays a game. Not really. We say that but it really only applies to OCing. What people are really interested in is how BAD a card plays a game. Try to think about this from the gamer's point of view, not the enthusiast's.

Card A in Crysis has max/min/avg scores of 82/18/32. But of the 18,000 frames rendered during the test period, 1800 of them were below 30.

Card B in Crysis has max/min/avg scores of 75/15/31. But of the 18,000 frames rendered during the test period, 400 of them were below 30.

Which card would play Crysis better?

From a strictly OC standpoint, Card A would win because it achieved a higher max framerate. But during the time you played, Card A dropped frames 10% of the time while Card B dropped frames only 2% of the time.

So if you're interested in playing Crysis and having it be playable, Card B would be the best choice. However, if you were to read a review of those two cards, they would be using max/min/avg scores to gauge performance and would wind up recommending Card A.

So the decision you have to make is do you want to buy your card based on how well it plays a game(s) or on what max, min or avg framerate it gets no matter how crappy it plays the game.
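The percentages in the Card A / Card B example can be checked with a few lines. This is only a sketch using the made-up numbers from the example above (18,000 frames, 1,800 and 400 of them below 30 FPS); the dictionary layout is just a convenient choice.

```python
TOTAL_FRAMES = 18_000

# Frames below 30 FPS, taken from the Card A / Card B example above.
slow_frames = {"Card A": 1_800, "Card B": 400}

for card, slow in slow_frames.items():
    print(f"{card}: {slow / TOTAL_FRAMES:.1%} of frames below 30 FPS")

# The card with the fewest slow frames wins under this measure.
better = min(slow_frames, key=slow_frames.get)
print(f"Fewest slow frames: {better}")
```

Card A comes out at 10.0% and Card B at 2.2%, so Card B wins even though it loses on max, min and avg.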

Hawk1
01-31-2008, 08:03 PM
You're talking a lot of hours of testing ONE card in several different games, let alone if it is tried on XP vs. Vista for comparisons (and at different resolutions). Plus the effects of the max OC on that card, and then doing it all over with a few more cards for comparison. We could be talking a couple of weeks for a review to be complete. When new cards are released, I believe the hardware sites only have them for a few days/a week tops, prior to them hitting retail. People would want to know the card's performance the day of release, not a week or two after the fact.

So unless the card manufacturers send the cards in for review at least a couple of weeks prior to release, there would be no review for launch day. Sure, you could post the review a week or so after release, but the website that does this (assuming the person doing the review has that kind of time; they don't all do reviews as a full-time job) would lose out on valuable hits during the first day or two after the NDA is lifted.

It's been 4 days since the 3870 X2s were released, and I haven't looked at another review since day 2. So unless the website can do a reasonable comparison of cards/games, in a reasonable time (i.e. by launch day), this will not fly.

again, my .02

Palit_Guy
01-31-2008, 08:32 PM
I feel you and agree, this procedure would be more labor intensive than things already are. So maybe this test isn't suitable for use in a launch day review. Maybe this comes as a follow-on.

It isn't my job to tell a review site how to review a card. If they don't want to run this kind of test at all it's fine with me. I have yet to find a review site that produces more than a handful of results that I place any value in anyway. But that's neither here nor there. I think this is the most accurate way I've ever heard of (if it can be done) to determine how a card actually plays a game and what kind of experience a person will have using it.

I'm sorry if it takes a long time to run the test. It takes an equally long time to earn the $600 you spend on a super high-end card as well. It takes me exactly 60 hours to get 60 hours of game time in.

My interest in this is strictly for finding a way to ACCURATELY determine what experience a piece of hardware will provide in terms of framerate in an actual game. I don't care if no review sites use it.

Part of my responsibility as a marketing guy is to describe what our cards can do. I want to describe them accurately. If I put a note on my 8500GT page that it gets a max framerate of 54 in Crysis at 800x600 how many people do you think would buy that card thinking they will be able to play Crysis? The min framerate it produces under certain circumstances will be the same as some significantly better cards and the average frame rate can be manipulated easily depending on which part of the game you test.

In my book that makes these values worth crap.

On the other hand, if you use a larger sample of the game and pay attention to the number of frames the card DOESN'T render you have a much better picture of what to expect. I think it might even be possible to record a demo or something of what you were doing in the game so people can tell if you did a legitimate test or not.

If you have a recording of what was going on during the game, you have a pretty good chance of repeating the test and getting the same results. The longer the test the more likely the results are to be repeatable. I can't say as much for most of the benchmarks I see.

Hawk1
01-31-2008, 08:38 PM
I think the only way this would work, in any reasonable way, is if there was a database (say on TPU) where everyone would download the program, do their own tests in a game, and then upload results, and they could then be sorted by VGA type/resolution/game etc. Even this would get complicated as you would have to factor in CPU used and what clock, probably incorporate CPUz/GPUz or something for proof, and someone would have to do graphs/charts to show comparisons of the different cards.

And don't get me wrong, I think it's a great idea, and I would love some type of database to show this type of real performance, I just think it is/would get too complicated/time consuming for any one site to take on.

DaMulta
01-31-2008, 08:39 PM
Have you seen this http://www.techpowerup.com/reviews/HIS/HD_3870_X2/23.html

This is compiled from all of the cards tested, and gives you a good idea of where that card stands. I understand where you are coming from; it would be nice to see benchmarks that run the game for hours then show the results. I think it would take finding the right person, one who could sit there for that amount of time, for a follow-up review.

I think the only way this would work, in any reasonable way, is if there was a database (say on TPU) where everyone would download the program, do their own tests in a game, and then upload results, and they could then be sorted by VGA type/resolution/game etc. Even this would get complicated, and someone would have to do graphs/charts to show comparisons of the different cards.

And don't get me wrong, I think it's a great idea, and I would love some type of database to show this type of real performance, I just think it is/would get too complicated/time consuming for any one site to take on.

That would be cool if you could upload the data into one giant pool.

Palit_Guy
02-01-2008, 12:43 AM
Ok, I got this all worked out with W1zzard. I think it makes sense but he's still dubious. I'm going to run some numbers, post them here and see what everyone thinks.

As it turns out, fraps does do what I needed, I was just looking for the csv in the wrong place.

tigger
02-01-2008, 12:56 AM
How do you get fraps to do what you need then?

I agree with what you're saying, I'd take the card that has the lowest below-30 FPS count.

imperialreign
02-01-2008, 01:04 AM
I completely get it, and think many users will here, too. The only program I've ever run across with some form of "benchmark" that works similar to how you describe was part of F.E.A.R. Under the Options>Performance menus in this game, one would find all the configuration settings available and another option that allowed you to test your settings based on a pre-recorded cinematic from the game that covered just about all the effects you would run across. After the short test, it would report back your MAX FPS, MIN FPS, Avg FPS; and would also tell you what percentage of the test was below 20(?) FPS, 21-39 FPS and 40+ FPS.

But FEAR is old hat compared to new software.

I always thought this was kinda neat, as you could get a much better idea of how configuration changes would affect your overall gameplay experience, and also see those changes on screen during the test.




I'm all for some form of test that compiles information from many different users of said hardware, but, there would have to be some sort of way to sort out the effects of certain hardware on final results.

Say, with Crysis for example. If you have two near identical systems, and the only difference between the two is the CPU. Say system 1 uses a Pentium 4, while system 2 is running a Core 2 Duo Extreme . . . in any benchmark, the C2DE system will score much higher numbers, whereas the slower P4 would pull those scores down.

If possible, it would be best to be able to separate scoring out by similar hardware, instead of mixing Prescotts with Kentsfields, y'know?

JrRacinFan
02-01-2008, 01:05 AM
The problem with fraps is that it doesn't test enough frames. It only grabs about one per second and even then not at a real regular interval. So if you're hitting 60 FPS and it grabs one every second, you're only looking at 1/60th of the data.


Yes but then theoretically you would already then know the framerate. A lot of us here can grasp a feel of what our fps is like just by running the theoretical game.

imperialreign
02-01-2008, 01:38 AM
Yes but then theoretically you would already then know the framerate. A lot of us here can grasp a feel of what our fps is like just by running the theoretical game.

I think, though, that a lot of us here can get a good idea of how our system would run a game based on how everyone else runs it - we're so often comparing our equipment and test data and all :p


What I thought would be great use out of a database, would be to look at a video card and compare how it runs based on CPU groupings. Say, you currently have an 8600 GT and you're interested in an 8800 GTS, and your current CPU is an E6600 - you could look at the listings of 8800GTS' based on similar CPUs, and determine if your system would actually benefit more from the new GPU, or possibly deciding whether a new CPU would be better, instead, by seeing what types of scores people are logging with Exxxx CPUs compared to Qxxxx CPUs with that VGA adapter - would it be better to spend that $300 on a Q6600 CPU, or on a 8800 GTS, y'know?

JrRacinFan
02-01-2008, 01:43 AM
Oh of course Imperial, but I still don't understand. Another benchmark program? Why when futuremark basically does that anyways, you just have to read & research through the ORB.

Either way though, a database on TPU would be very nice, but W1zz is soooo busy as it is.
It's going to be hard to keep track unless he denotes a couple more mods just for regulating that database.

Mussels
02-01-2008, 02:09 AM
I was after this kind of benchie for ages when i was a reviewer - % under X is a great way to do it.

% Under 15 FPS, under 30 FPS and under 60FPS.

All we need is a program that shows that, as well as min/max/avg.

W1zz definitely has the skills to make a program like this, but i doubt he has the time.
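As a rough idea of what such a program would compute, here is a minimal sketch. It assumes you already have a list of per-frame render times in milliseconds (all greater than zero); the function name `fps_report` and the sample values are made up for illustration and do not come from any existing tool.

```python
def fps_report(frametimes_ms, thresholds=(15, 30, 60)):
    """Summarise per-frame render times (ms) as min/max/avg FPS plus
    the percentage of frames that fell under each FPS threshold."""
    if not frametimes_ms:
        raise ValueError("need at least one frame")
    # Instantaneous FPS of each frame; assumes every frametime is > 0.
    fps = [1000.0 / t for t in frametimes_ms]
    report = {
        "min": min(fps),
        "max": max(fps),
        # Average FPS = frames rendered / seconds elapsed.
        "avg": len(frametimes_ms) / (sum(frametimes_ms) / 1000.0),
    }
    for limit in thresholds:
        under = sum(1 for f in fps if f < limit)
        report[f"pct_under_{limit}"] = 100.0 * under / len(fps)
    return report

# Made-up frametimes in ms, for illustration only.
sample = [10.0, 20.0, 40.0, 50.0, 100.0]
print(fps_report(sample))
```

The thresholds default to the 15/30/60 split suggested above, but they are just a parameter, so anyone who thinks 25 or 40 FPS is the right cutoff can pass their own.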

Hawk1
02-01-2008, 02:17 AM
So basically, we want to make a program like 3DMark (for its database results/comparison), but be able to run our own games for a minimum amount of time/frames. Well, I guess it would have to be done from scratch, as I don't think 3DMark is open source. We'd also have to filter out the cheaters: someone with a P4 and a 6800 standing in front of a wall for an hour in Crysis will get 100 FPS. A run like that, without peaks/valleys, would be obvious, but there would be subtler ways to cheat, so safeguards should be there and/or someone reviewing and clearing questionable results.

Sounds excellent, if it can be implemented properly. I anxiously await your numbers/idea Palit.
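The "no peaks/valleys" check for the wall-staring case could be sketched roughly like this. It is only one possible approach: the function name `looks_flat`, the use of the coefficient of variation, and the 0.05 cutoff are all assumptions made for illustration, and it would not catch the subtler ways to cheat.

```python
from statistics import mean, pstdev

def looks_flat(frametimes_ms, min_cv=0.05):
    """Flag a run whose frametimes barely vary.

    min_cv is a hypothetical cutoff on the coefficient of variation
    (standard deviation / mean); a real tool would have to tune it.
    """
    avg = mean(frametimes_ms)
    return pstdev(frametimes_ms) / avg < min_cv

# Made-up runs for illustration only.
wall_stare = [10.0, 10.1, 9.9, 10.0, 10.0]
real_play = [10.0, 25.0, 14.0, 40.0, 12.0]
print(looks_flat(wall_stare), looks_flat(real_play))
```

A flagged run would still need a human to look at it, since a very light game on a very fast card could look flat for legitimate reasons.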

Mussels
02-01-2008, 02:20 AM
There are limitations: we would need to make our own timedemos to keep results fair and accurate - this is hard as not all games support timedemos.

What i think is best, is to use an opensource engine or get someone skilled to make one for us - then we have different tests with the option to download more.

This way it's the same as 3DMark with the results determined differently, BUT over time we can simply add more complex 'tests' as games get more demanding. We can't use REAL game engines, but that's hard anyway - if the game doesn't support it OR THE USER DOESN'T HAVE THE GAME, we lose a lot of benchmark capabilities.

Having the tests use their own engine really has its uses (imagine if 3dmark needed you to buy the latest 3D games to work, and you get the idea of how it would fail)

imperialreign
02-01-2008, 03:32 AM
that's where some included benchmarks from games come in - like Crysis, Doom 3 - pre built timedemos are the same, and fair for testing various systems.

But - even with that in place, you also have to worry about those that know the ins and outs of a game engine. It's very easy to run a timedemo in Doom3, and even easier to execute a ton of console commands that'll boost your performance 5-10% (at the cost of IQ and AQ), which renders the final test value inaccurate. Same goes for Crysis in these regards. There isn't much of a way around it, except to choose games that don't have such access to the engine proper.

one aspect, though, that I really stoutly believe needs to be addressed is a proper, modern OpenGL test. We see some now and then pop up, but we have yet to see a game oriented style OGL benchmark within the last few years. The closest we had was running timedemos from Doom3. Everyone seems to neglect OGL, although the newer OGL 3.0 stands to start giving DX10 a run for it.

JrRacinFan
02-01-2008, 03:41 AM
Ahhhh I get it now after Hawk1's and Mussels' explanations. I wonder if it could be done a little like TPUBench is done and create plug-ins.

Oh heck no! W1zz already did all the dirty work! We just gotta find a darn good coder for the plug-ins!

Graogrim
02-01-2008, 07:27 AM
The best way would be to have it intimately integrated into the game engine. Or, if it were sophisticated enough, the game's scripting system might do. Simply record performance details for every single frame rendered--at a minimum the time to completion but ideally the particulars of what the engine was doing as well, like vertex counts, number of textures used, texture loads, shaders in use, paging, etc. With smart handling the overhead would be negligible. A few extra megabytes of RAM would be sufficient to track hours of gameplay.

Then, with the dataset at hand, all kinds of interesting statistical analysis could be performed. Perfect framerate graphs could be taken for granted. It would be easy to determine the performance cost of special effects, or illuminate the underlying causes of hitching, or see the exact benefit of additional memory. In short it would be an amazing profiling tool with the potential to give precise pointers to the best way to optimize performance.
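A quick back-of-the-envelope check on the memory claim, as a sketch. It assumes an average of 60 FPS and one 4-byte frametime stored per frame, both of which are illustrative assumptions; logging the extra per-frame details (vertex counts, texture loads and so on) would multiply the figure accordingly.

```python
FPS = 60               # assumed average framerate
BYTES_PER_SAMPLE = 4   # assumed: one 32-bit frametime per frame

frames_per_hour = FPS * 60 * 60
bytes_per_hour = frames_per_hour * BYTES_PER_SAMPLE

print(f"{frames_per_hour:,} frames/hour, "
      f"{bytes_per_hour / 2**20:.2f} MiB/hour")
```

Under those assumptions the time-to-completion data alone stays under one MiB per hour of play.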

Mussels
02-01-2008, 07:44 AM
The problem with building it into a game engine... we cant! we dont have access to that kinda thing.

i still think our own engine (like 3dmark has its own) is a good method. As long as we make different tests 'levels' and have a D3D9, D3D10 and OGL version, we have everything covered.

To begin we only need one level in each - make the 'level editor' open to all, and let the community make tests. If people want a raw shaders test, they can make one. massive textures? they can make one!

The advantage to this is that everyone can download these tests, and it doesn't need people to have the game. I mean, what about Crysis - how will these tests work if the game gets patched? You'd need to re-do most of the coding, and old results become useless. In the example I'm giving, it doesn't matter - you can re-make it and give it a different name, without invalidating old results.

Graogrim
02-01-2008, 08:10 AM
Ah, but there are some perfectly good game engines that have conveniently been made open-source, e.g. Quake 3.

Admittedly, I was thinking from the perspective of convincing a developer to include performance profiling as a game feature. I'd think that new game studio Futuremark is putting together would be a good candidate.

Palit_Guy
02-01-2008, 03:18 PM
So fraps will produce all the numbers you need. Place a tick in the frametimes box and it outputs the data on every frame to a csv. Every frame.

So if you play a game for 60 minutes with an average of 60 FPS you should have 216,000 frames. That's a lot of frames and, at least on FPS games, should cover a couple maps. Rather than trying to build something into the engine, I think it would be fine to just specify which maps to play, how long, number of players etc. It would be nice if there was a way to automate all the players without adding an additional load to the CPU but I don't see how to do that.

Anyway, with that large of a data set I'm guessing that most of the anomalies will get worked out. The only way to tell is to do that a few times and compare the results of several tests on the same card to see what kind of a range you get.

So if anyone would like to help out, here's what we need to do. Pick your favorite game and run fraps with the frametimes box checked. Play for one hour. Open the csv and remove the frames that shouldn't count, like a string of 0s or a string of really high ones that happen during map changes and such. Then delete however many rows you need to from the beginning or end so that the total number of rows represents 3,600 seconds of play time.

This part is a bit tricky. The csv shows the amount of time between frames in milliseconds. There are 3,600,000 ms in 3,600 seconds so you need to autosum the frametime column. Then you can delete rows until the total of the frametime column is 3,600,000. This will leave you with the total number of frames you actually rendered during one hour.
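For anyone who would rather not do the spreadsheet steps by hand, they could be sketched along these lines. This assumes, as described above, a CSV with one row per frame holding the time between frames in milliseconds; if your log holds cumulative timestamps instead, you would need to take the differences between rows first. The column name `frametime_ms`, the function name, and the 1000 ms cutoff for "really high" frames are all made up for illustration.

```python
import csv
import io

HOUR_MS = 3_600_000

def frames_in_window(csv_text, column="frametime_ms", window_ms=HOUR_MS,
                     max_frametime_ms=1000.0):
    """Return the frametimes (ms) that fit inside the first window_ms of play.

    Rows with a frametime of 0 or above max_frametime_ms are dropped first,
    as a stand-in for the map-change junk described above.
    """
    kept, elapsed = [], 0.0
    for row in csv.DictReader(io.StringIO(csv_text)):
        t = float(row[column])
        if t <= 0 or t > max_frametime_ms:
            continue  # loading screens, map changes and similar
        if elapsed + t > window_ms:
            break  # the window is full
        kept.append(t)
        elapsed += t
    return kept

# Tiny made-up log with a 100 ms window, just to show the behaviour.
log = "frametime_ms\n20\n0\n30\n5000\n40\n25\n"
frames = frames_in_window(log, window_ms=100)
print(len(frames), sum(frames))
```

With the default one-hour window, `len(frames)` is the number of frames actually rendered during the hour, which is the figure to compare between runs.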

Theoretically speaking, two identical systems should render the same number of frames over the course of an hour on the same game at the same settings. There's only one way to find out if that's true or not and that's to run the tests.

So I'm going to start doing this on some of my systems and I'll let you know what I get. If any of you want to do the same thing I would really like to see your results as well.

Graogrim
02-01-2008, 05:16 PM
Fraps would be good for generating a framerate histogram of a really long standard demo, but if everyone did this for regular play I suspect the averages would vary a bit.

Palit_Guy
02-01-2008, 06:28 PM
That's what I'm interested in finding out. I think it's obvious that if you only run the game for 10 minutes or 5 minutes there would be a great deal of variance. So, in general, the less time you monitor framerates, the less reliable they are.

However, the longer you monitor them, the more reliable they become as you afford every tester the chance to experience the same in-game elements. I don't think it's possible to make the test period long enough to get identical results every time because what goes on during the game is never exactly the same.

What I would like to see is how much it really does vary. From that we can establish the variance and could express the findings with a +- degree of certainty.

So if you run this test ten times and get ten different results but they all seem to be within .01% of each other, that's a fairly reasonable degree of repeatability. For an hour at an average of 60 FPS (216,000 frames), .01% is about 22 frames.
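One way to put a number on that repeatability is sketched below. Treating the spread as (max - min) / mean of the per-run frame counts is just one reasonable choice, not the only one, and the run values are made up for illustration.

```python
def relative_spread(frame_counts):
    """Spread of the runs as a percentage of their mean: (max - min) / mean."""
    avg = sum(frame_counts) / len(frame_counts)
    return 100.0 * (max(frame_counts) - min(frame_counts)) / avg

# What .01% amounts to for an hour at an average of 60 FPS.
frames_per_hour = 60 * 60 * 60           # 216,000 frames
print(frames_per_hour * 0.01 / 100)      # roughly 21.6 frames

# Made-up frame counts from three one-hour runs, for illustration only.
runs = [216_000, 215_990, 216_010]
print(f"{relative_spread(runs):.4f}%")
```

Note that .01% of an hour at 60 FPS is only about 22 frames, so that is a very tight target for regular play.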

The only way to know how it's going to come out is to run the tests.

So I'm a little behind schedule on some stuff here. My sewer line is broken so I'm actually down in my basement with a jack hammer breaking up the concrete so I can fix it. We just moved here and this weekend is the last weekend we have to get completely moved out of the old house so I have to get that done as well. So it probably won't be until Monday or Tuesday until I can do some test runs.

So while I'm in the basement playing with 20 years of old poo, maybe some of you guys can help me out and run this test a few times. If you don't want to mess with the spread sheet, send it to me and I'll work with it.

W1zzard
02-01-2008, 07:58 PM
i'm looking into automating this using rivatuner fps logging and tpubench .. the required changes in rivatuner will take a few weeks though because unwinder is currently in a feature freeze for 2.07

Palit_Guy
02-01-2008, 08:29 PM
It's a nice idea but I don't know if it would fly. Closest comparison is the way [H] does their graphics tests/comparisons vs. just about every other hardware site. Some will swear by their method, others say it's total BS, and I think, if something like what you're proposing were brought in, it would be the same type of deal where one or two sites would adopt it and arguments would go back and forth among the "fanboys" as to the results.

Anyway, I may be way off base here, but that's my .02

The "community" is the largest and most powerful group of people involved with the PC industry not counting Tier 1 manufacturers like Dell or HP etc. If the community gets together and says collectively that they want something, they will get it.

As for this test, if the results show that it makes sense it's very easy to get reviewers to use it. If every person on this board makes a single post at [H], anandtech etc. saying they want to see this test being run, I don't think they will say no.

If they were to say no, well, at least you would know what those sites think about you.

W1zzard
02-02-2008, 12:17 PM
it also matters to site owners how much time they have to invest into their benchmarks. it will be hard doing like a week of non stop benchmarking when you get the cards just a few days before launch

Graogrim
02-03-2008, 05:35 PM
However, the longer you monitor them, the more reliable they become as you afford every tester the chance to experience the same in-game elements.
The thing that concerns me is differences in playstyle. When I'm looking over someone's shoulder as they play...oh say Unreal Tournament for example, I'm also thinking about how I would be playing in their stead. I'd look a different way, focus in different places and on different targets, and maybe prioritize things like controlling key map elements differently.

So even over the course of an hour, on identical hardware I could potentially place a different load signature on the system than someone else. This is why I'm not convinced it's ok to just let people play and figure that things will even out.

imperialreign
02-03-2008, 09:30 PM
I kinda agree with that - I tend to go into sniper or stealth mode in a lot of games, even multiplayer, meaning that I like to slowly move through a mission being as thorough as possible . . . but I tend to find one spot I can sit and draw enemies towards me so that I can pick them off one by one instead of going hell-or-high-water and blazing through a mission John Wayne style.

What I'm trying to get at is that the slower approach doesn't put as much stress on frame rates . . .

Mussels
02-04-2008, 02:48 AM
and i'm the gung ho who lures the enemy back into ambushes if it gets tough.

This is why i think looped demos are better, as you can always make a new loop later. making it a unique engine/design also makes it harder for people to cheat.

Palit_Guy
02-04-2008, 01:05 PM
Hmmmm. These are all very good points. I'll be very interested to see where the first tests come out.

I got moved out of the old house over the weekend but my sewer is FUBAR. A sewer guy is coming over today to take a look at it so I'm still kinda tied up.

Did anyone turn on fraps while they were playing this weekend?

Mussels
02-04-2008, 01:07 PM
what do you want us to do with fraps? i game a lot, but rarely use fraps.

Hawk1
02-04-2008, 01:12 PM
Yes, unfortunately, I was working all hours this weekend and will be swamped this week, but would like to take a stab at it next weekend, but need some guidance as to what you need from me as far as gaming/fraps etc.

warhammer
02-13-2008, 12:13 PM
It all sounds good. If I can help with benching some of the games, let me know.
As far as rivatuner goes I have not been able to get it to work, but I will reinstall fraps and give it a go for an hour or two of online action (ETQW).

Other games: HALF LIFE 2, COD4, Q4 and CRYSIS

CH33T03S
02-13-2008, 04:27 PM
Ok I just played COD4 for an hour on 5 different maps and ran fraps for the whole session and here is what I came up with.

Settings at max with AA at 4X and 1280 X 1024 res.

Frames [345487] Time (ms) [4114852] Min [2] Max [103] Avg [83.961]
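
As a quick sanity check on those numbers, here is a small sketch that assumes the Avg figure is simply total frames divided by elapsed seconds. The function name is a placeholder; the two inputs are the Frames and Time (ms) values from the line above.

```python
def average_fps(frames, time_ms):
    """Average FPS over a session: total frames / elapsed seconds."""
    return frames / (time_ms / 1000.0)

frames = 345487      # Frames value reported above
time_ms = 4114852    # Time (ms) value reported above

avg = average_fps(frames, time_ms)
minutes = time_ms / 60000.0
print(f"{avg:.3f} fps over {minutes:.1f} minutes")
```

Under that assumption the result matches the reported 83.961, and the session works out to a bit under 69 minutes. The Min of 2 and Max of 103 are each a single reading out of that whole stretch.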

System spec as follows:

Gigabyte GA-K8N Ultra SLI
Palit 8800 GT Super + 1GB
AMD Athlon 64 X2 3800+
2 GB AData DDR 400
Sound Blaster Audigy 2ZS Gamer
Western Digital SATA 160GB

I still have the raw excel files if anyone would like to see them.