Hardware usage drops in crowded zones along with fps

in Account & Technical Support

Posted by: Noktrin.8051

So, I've been looking at this for a while but never could figure anything out. I usually get very good fps, but I get massive drops in WvW. That much seems normal; the issue is that while my graphics cards usually sit at 60-75% usage in normal environments, the more crowded the area gets, the lower the usage becomes! I've been in WvW fights where I'm getting 20 fps while my GPUs sit under 40% usage and my CPU at about the same. I attached a screenshot showing this as well.

I thought it might be a RAM issue, but the hit rate is perfect and the number of page faults is very low (<10), which makes sense since I have plenty of memory (8GB). I'm running out of ideas.

Specs:

i5 OC'd to 4.6GHz
2x GTX 460 in SLI
8GB RAM
Win 7 64-bit

Newest drivers

Attachments:

Posted by: deltaconnected.4058

http://i.imgur.com/uiBK8.jpg, from https://forum-en.gw2archive.eu/forum/support/tech/Does-this-make-any-sense-CPU-usage/first#post144758

GW2.exe process: 40% usage.
The rendering/dispatch thread: 23% usage.

If they haven’t found a way to simplify the calculations or move stuff out of the main thread by now, there’s a very good chance this is how it’ll be. Every MMO’s been like this, even WoW at launch.

Posted by: Noktrin.8051

I noticed that as well; the issue is this. If, say, the main thread was doing too much, I would expect to see the core that thread is running on maxed out, with the other threads hanging while waiting for it. That doesn't seem to be the case, though. It seems to me rather to be an issue with how objects, players and effects are implemented in the engine and how they are rendered.

This is pure speculation, as I have no knowledge of the implementation, but there clearly seems to be a bottleneck in the code in how it renders and how much "parallelism" it has when dealing with many objects. In other words, the renderer falls behind and starts playing catch-up with the other threads while waiting on locks or whatever the case may be, and this prevents it from taking full advantage of the hardware.

The best model I can come up with is, say, 10 worker threads having to do parts of the render. While not swamped, say in PvE, these can manage fine; once they finish their jobs, the main thread can take the data and use it. In WvW, where many things are happening (although I noticed the game starts to show fewer effects and reduce details and players on screen), there is much more that needs to be done, so the queue for the worker threads grows while the pool doesn't scale in size; what ends up happening is that the main thread has to wait too long. This would explain why hardware usage goes down: I would imagine the main thread makes the bulk of the requests to the GPU, but if it's waiting on other threads to finish their jobs and those take too long, it's just spinning through useless cycles on the CPU while waiting, and thus not using the GPU.
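The theory above can be written down as a toy cost model (all numbers and function names here are invented for illustration; this is not the game's actual implementation). The main thread cannot present a frame until a fixed worker pool drains the per-frame job queue, so once jobs far outnumber workers, frame time is dominated by worker time and the main thread, and hence the GPU, sits mostly idle:

```python
import math

def frame_time_ms(jobs, workers, job_ms, main_ms):
    """Frame time when the main thread must wait for a fixed pool of
    `workers` to finish `jobs` tasks of `job_ms` each; `main_ms` is the
    main thread's own per-frame work."""
    worker_ms = math.ceil(jobs / workers) * job_ms  # sequential passes over the pool
    return max(worker_ms, main_ms)

def fps(jobs, workers=10, job_ms=2, main_ms=10):
    return 1000 / frame_time_ms(jobs, workers, job_ms, main_ms)
```

With these made-up numbers, 20 jobs (quiet PvE) give 100 fps with the main thread fully busy, while 300 jobs (a WvW blob) give about 17 fps with the main thread busy only 10 ms out of every 60, i.e. low fps *and* low CPU/GPU usage at the same time, which matches the symptom in the opening post.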

That’s my kitteny theory.

(edited by Noktrin.8051)

Posted by: Noktrin.8051

As an addendum: if I remember correctly from that one class I took way back when that covered GPU vs. CPU architecture, GPUs are much more powerful in terms of parallelism. Rather than having a few powerful cores, they have many weaker ones, so they can handle far more things in parallel, which is why they're great for graphics rather than raw serial processing. If the engine was not implemented to take advantage of that for certain work and instead leaves the CPU to do it, there's your bottleneck. This would also become extremely apparent in large battles: graphics hardware usage going down would be the sign of it, as the GPU has to wait for the CPU.

If there are 20 things that need to be done in parallel and must all complete before the next step can start, a GPU could potentially get them done in a single pass without any context switches from the kernel, while the CPU would need many more. That increases the overhead and lowers throughput. CMPT 300, baby.
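The arithmetic behind that claim is simple to sketch (core counts below are illustrative, not the actual specs of any device):

```python
import math

def passes_needed(tasks, cores):
    """Sequential passes a device needs to run `tasks` independent jobs
    when it can execute at most `cores` of them simultaneously."""
    return math.ceil(tasks / cores)

# 20 independent jobs:
#   a 4-core CPU needs 5 passes (plus scheduler context switches between them),
#   a GPU with hundreds of lanes finishes them in 1 pass.
```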

(edited by Noktrin.8051)

Posted by: deltaconnected.4058

I noticed that as well; the issue is this. If, say, the main thread was doing too much, I would expect to see the core that thread is running on maxed out, with the other threads hanging while waiting for it. That doesn't seem to be the case, though. It seems to me rather to be an issue with how objects, players and effects are implemented in the engine and how they are rendered.

That's just how Windows' scheduler works – distributed load. A single thread will still be limited to a single core at any one point in time. How much a context switch costs in performance, I have no idea, as I haven't been able to find anything in Windows to change that behavior.
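This is why a saturated thread doesn't show up as one maxed-out core: the scheduler migrates it, spreading its 100%-of-one-core load across all cores in the usage graph. A quick back-of-the-envelope helper (a sketch, assuming ideal scheduling) shows the ceiling:

```python
def max_process_usage(busy_threads, logical_cores):
    """Upper bound on whole-process CPU% when only `busy_threads`
    threads have runnable work, on `logical_cores` logical cores.
    One saturated thread on a quad core can never exceed 25%."""
    return min(busy_threads, logical_cores) / logical_cores * 100
```

One fully-loaded render thread on a 4-core machine caps out at 25% process usage, which lines up with the ~23% rendering/dispatch-thread figure quoted earlier in the thread.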

The other threads don't need locks, as good design makes these channels uni-directional. E.g., the thread that pulls player character location/model data from the server only writes the data to shared memory, while the main thread only reads it. Same with sound: the main thread requests a sound event and the sound thread queues/plays it as needed.
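That uni-directional, single-writer/single-reader pattern can be sketched as a ring buffer (a minimal illustration; a real C++ engine would use atomic indices with acquire/release ordering, and the class name here is invented):

```python
class SpscChannel:
    """Single-producer/single-consumer ring buffer. Exactly one thread
    ever writes `head` and exactly one ever writes `tail`, so neither
    side needs to take a lock to stay consistent."""
    def __init__(self, capacity=64):
        self.buf = [None] * capacity
        self.head = 0  # written only by the producer (e.g. network thread)
        self.tail = 0  # written only by the consumer (e.g. main thread)

    def push(self, item):              # producer side only
        nxt = (self.head + 1) % len(self.buf)
        if nxt == self.tail:
            return False               # full; producer may retry next tick
        self.buf[self.head] = item
        self.head = nxt                # publish after the slot is written
        return True

    def pop(self):                     # consumer side only
        if self.tail == self.head:
            return None                # empty
        item = self.buf[self.tail]
        self.tail = (self.tail + 1) % len(self.buf)
        return item
```

A network thread would `push` position updates and the main thread would `pop` them each frame, never touching each other's index.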

The best model I can come up with is, say, 10 worker threads having to do parts of the render. While not swamped, say in PvE, these can manage fine; once they finish their jobs, the main thread can take the data and use it. In WvW, where many things are happening, although I noticed the game starts to show fewer effects and reduce details and players on screen…

The result of rotating something around a world axis and then moving it away from the center point is not the same as moving it first and then rotating. As soon as that becomes a possibility, the operations have to be done in series. Splitting that up would be very complex, to say the least (maybe it could be done by terrain triangle plane/player/static objects, but I can't see this working in the cities).
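The order-dependence is easy to verify in a couple of lines (a minimal 2D sketch; the engine of course works with full 3D matrices):

```python
def rotate90(p):
    """Rotate a 2D point 90 degrees counter-clockwise about the origin."""
    x, y = p
    return (-y, x)

def translate(p, dx, dy):
    x, y = p
    return (x + dx, y + dy)

p = (1.0, 0.0)
a = translate(rotate90(p), 5, 0)   # rotate, then move
b = rotate90(translate(p, 5, 0))   # move, then rotate
# a == (5.0, 1.0) while b == (0.0, 6.0): the two operations don't
# commute, so they can't be handed to independent threads and merged.
```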
Can’t comment on the effects reduction as I haven’t seen it happen, but there’s nothing preventing it from reducing particle count if previous frame render time > x milliseconds.
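Such a frame-time-driven reduction is only a few lines; here is one plausible shape for it (purely hypothetical: the function, thresholds, and scaling factors are invented, not anything from the actual engine):

```python
def next_particle_budget(budget, last_frame_ms, target_ms=16.7,
                         min_budget=100, max_budget=10_000):
    """Cut the particle budget when the previous frame ran long,
    and creep it back up when there is headroom."""
    if last_frame_ms > target_ms:
        budget = int(budget * 0.8)   # back off hard when behind
    else:
        budget = int(budget * 1.05)  # recover slowly when ahead
    return max(min_budget, min(max_budget, budget))
```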

If the engine was not implemented to take advantage of that for certain work and instead leaves the CPU to do it, there's your bottleneck.

I'm not a game dev so I can't say for sure, but with what little DirectX I know, the pipeline is designed in such a way that it can't be done: BeginScene, call your routines that draw a bunch of triangles, EndScene. There's no way for the CPU to interact directly with the display.

If there are 20 things that need to be done in parallel and must all complete before the next step can start, a GPU could potentially get them done in a single pass without any context switches from the kernel, while the CPU would need many more.

Until those 20 things do FP32 math. Then they'd never get done.

Posted by: starlinvf.1358

I'm wondering if this has to do with the data feed handling and interpolation, since with more activity it takes longer to send the information needed for one scene update.

I've been noticing this same trend in a couple of other recent games I was beta testing, and the idea that the main thread starts taking longer and having to wait on more data as the number of actors increases would make sense.
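The interpolation starlinvf mentions is commonly done client-side between two server snapshots; a minimal sketch (hypothetical data layout, not the game's actual wire format) of how a renderer would blend actor positions between updates:

```python
def interpolate(snap_a, snap_b, t):
    """Linearly blend actor positions between two server snapshots.
    `t` in [0, 1] is how far render time has advanced from snapshot A
    toward snapshot B; actors missing from B simply hold their A position."""
    out = {}
    for actor, (xa, ya) in snap_a.items():
        xb, yb = snap_b.get(actor, (xa, ya))
        out[actor] = (xa + (xb - xa) * t, ya + (yb - ya) * t)
    return out
```

With more actors per snapshot, each update is bigger and arrives later, so the client spends more frames extrapolating or waiting, which fits the "main thread waiting on data" idea.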