Balancing by Software
“maybe one day we can do a podcast about that and I can explain how that works.”
I’m so looking forward to that.
No, in that sentence ‘kill’ cannot easily be replaced with ‘play’.
What he’s saying is that their QA department did play the event, and they could see how well balanced it was from how much damage they did, how far along they got before the timer ran out, etc. They just didn’t succeed in killing it. Which isn’t surprising. I doubt they employ 150 testers, so they wouldn’t be able to get a full map, which is what’s required to kill Tequatl.
Not playing it at all would be completely different.
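To make that concrete, here is a rough Python sketch of how a partial run by a small test team could be extrapolated to a full map. Nothing here is ArenaNet’s actual method: `players_needed` and every number in it are made up for illustration, and it assumes damage scales linearly with player count, which a real fight with scaling mechanics probably wouldn’t.

```python
import math


def players_needed(test_players, damage_dealt, boss_health):
    """Estimate how many players it takes to empty the health bar in time.

    Assumption: total damage grows linearly with the number of players,
    so every extra player adds the same damage the average tester managed.
    """
    if test_players <= 0 or damage_dealt <= 0:
        raise ValueError("need a positive player count and positive damage")
    damage_per_player = damage_dealt / test_players
    return math.ceil(boss_health / damage_per_player)


# Hypothetical numbers: 20 testers deal 4,000,000 damage to a boss with
# 30,000,000 health before the timer runs out.
estimate = players_needed(test_players=20, damage_dealt=4_000_000,
                          boss_health=30_000_000)
print(estimate)
```

An estimate like this only says whether the numbers add up. It says nothing about whether the fight feels fair or fun.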
“Life’s a journey, not a destination.”
That’s still a really bad design philosophy. I’ve assisted in balancing things for GW1, and I can attest to the value of actually testing and beating content with your testers. Sometimes something that seems alright on paper is quite frustrating in practice and needs to be toned down. You don’t do that after releasing the content, you do it before.
Case in point: The Reiko boss battle
http://wiki.guildwars.com/wiki/The_Final_Confrontation
After careful testing, it had to be scaled down a bit to make it more friendly to the average player. And still people struggled with it, which meant the difficulty was just right. But you can only figure that out by testing it with a lot of real people first.
(https://www.youtube.com/watch?v=D-On3Ya0_4Y)
Still, I’d like to see how testing happens behind the scenes.
That isn’t new by any means. They collect data about pretty much everything we do: where we are in which zone, how many fps we have, how much memory gw2.exe is using, which graphics settings we use, and also how much damage we do. This is a good way to track the consequences of changes, but they won’t be able to determine the causes of these consequences from it. They have to find the causes themselves, which they often fail to do. Or they aren’t able to fix the issues.
All in all, their datamining isn’t bad, in fact, it helps to determine bugs or malfunctions.
But if they aren’t able to reliably determine the cause of those bugs, which they currently aren’t, at least in my mind, all their datamining won’t mean much.
Still, I’d like to see how testing happens behind the scenes.
So would I.
I know a bit about software development and testing and I’ve alpha and beta tested 3rd party add-ons for other games, but I’d like to know more about how Anet does it specifically.
Hopefully that blog post will happen one day soon.
“Life’s a journey, not a destination.”
It is actually possible for balance by software to work out in theory. It establishes a good place to start, which can then be modified based upon player feedback. I would love the podcast/video/ready up to show how they do it. Who knows, maybe players have feedback on the system (besides “that is a horrible way to balance”).
That isn’t new by any means. They collect data about pretty much everything we do: where we are in which zone, how many fps we have, how much memory gw2.exe is using, which graphics settings we use, and also how much damage we do. This is a good way to track the consequences of changes, but they won’t be able to determine the causes of these consequences from it. They have to find the causes themselves, which they often fail to do. Or they aren’t able to fix the issues.
All in all, their datamining isn’t bad, in fact, it helps to determine bugs or malfunctions.
But if they aren’t able to reliably determine the cause of those bugs, which they currently aren’t, at least in my mind, all their datamining won’t mean much.
I personally think that the timekeeping problem is the crux of many of the game bugs that exist. They started rewriting that code in the different sections of the codebase last year, when the Mists Bonuses weren’t being received by a large portion of the player base. Since then they have regularly mentioned timekeeping in many threads related to bugs.
As far as bugs being reintroduced into the game after being fixed, well, that shows a lack of QC oversight. No bug should ever be reintroduced after being fixed. They have not learned from their mistakes, so they keep making them.
They still use a dedicated server to try to replicate bugs that players see in the live servers, and they can only replicate the minor bugs. They have not learned that they need live server conditions to see the bugs we see.
Just my humble opinion.
Software modeling and simulation are good tools for roughly dialing things in. For fine tuning, you need to do real testing. This explains quite a bit. Also, they probably don’t have a full 150 person test team, either.
So what you have is a small group of people who know exactly what they’re supposed to do and who have perfect communication and cooperation. They’re either all in one big room, or have headsets. They’re all there to do their job, which is to run the event as specified in their test parameters. They’ve been fully briefed on the event mechanics, Do’s & Don’ts, etc…
There are no players sitting there mindlessly spamming 1, and there is no one elsewhere on the map gathering or doing another event. Everyone there is focused, well informed, and fully participating. And they don’t have enough people to carry through to completion. So of course the results can be flawed. Sometimes very flawed. But they can counter a lot of that by listening and making changes based on observation and player feedback. But only if they do it quickly enough that the event doesn’t become well known as something that isn’t fun/doesn’t work well. Otherwise they have to hype the fix in an update. The key is getting feedback from players (observation and player posts) as early as possible and acting on it quickly.
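Here is a minimal Python sketch of that gap between a focused test team and a live map, done as a tiny Monte Carlo simulation. It is only a toy model under my own assumptions, not anything Anet has described: `simulate_kill_rate`, the 20% damage spread, and all the numbers are hypothetical, and a player who is AFK, gathering, or off doing another event is simply counted as dealing zero damage.

```python
import random


def simulate_kill_rate(players, boss_health, time_limit_s, mean_dps,
                       active_fraction, trials=1000, seed=0):
    """Return the fraction of simulated attempts that kill the boss in time.

    Toy assumptions: each player is either fully active or contributes
    nothing, and an active player's DPS is drawn from a normal distribution
    with a 20% spread around mean_dps (clamped at zero).
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    kills = 0
    for _ in range(trials):
        total_damage = 0.0
        for _ in range(players):
            if rng.random() < active_fraction:
                dps = max(0.0, rng.gauss(mean_dps, 0.2 * mean_dps))
                total_damage += dps * time_limit_s
        if total_damage >= boss_health:
            kills += 1
    return kills / trials


# Hypothetical event: 80 players, 15 minute timer, 60,000,000 boss health.
focused_rate = simulate_kill_rate(80, 60_000_000, 900, 1000,
                                  active_fraction=1.0, trials=200)
live_rate = simulate_kill_rate(80, 60_000_000, 900, 1000,
                               active_fraction=0.6, trials=200)
print("focused testers:", focused_rate)
print("live map:", live_rate)
```

Even a model this crude shows why numbers tuned against a fully focused group can look fine internally and still fall apart on a live map, which is where observation and player feedback have to take over.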
This explains sooo much… Everything finally makes sense now… (not about teq)
“Maybe I was the illusion all along!”
(edited by Daishi.6027)
I don’t know how many of you read the quite long interview with Chris Whiteside, ArenaNet Studio Design Director, which was published recently on relicsoforr. I did, and I found a statement there that is imho worth a thread and a discussion.
Chris: (…) I don’t know how well known this is, but we didn’t kill Tequatl before Tequatl went live internally. That’s not how we do our reviews or our balance testing. We don’t need to kill it to know how well balanced it is, which is interesting and maybe one day we can do a podcast about that and I can explain how that works. (…)
The word kill easily could be replaced by play as killing is a major part of playing this game. Then it reads: “We don’t need to play it to know how well balanced it is”.
Because the software tells us whether it works or not. Now this explains pretty much everything, from ridiculous (but “working” ofc) trait requirements that nobody seemed to have noticed before the feature patch went live, to the growing gap between developers and players regarding the sense of equity, the relation between time spent and reward, the effort and its outcome, and so on.
Apparently balancing in the case of Guild Wars 2 is not the expression of the will of designers or a well-thought-out result based on the experiences of human testers, but the result of a brainless and emotionless piece of software or process, which just says go or no-go but cannot simulate human sentiment, perception and gameplay experience.
This leads to the assumption that we will see a lot more things that we don’t agree with in the future because, to overstate it, it’s a bot that decides what’s right or wrong, what’s fun or not :-/
Not sure what you’re getting at. Teq is currently well balanced for the hardcore groups. Every now and then we’ll lose the fight. The only group of people that doesn’t agree is the group that Teq wasn’t made for.
@Smooth Penguin – It was my understanding (possibly wrong) that Tequatl, the Triple Wurm, Karka Queen, etc… were intended to be hardER content, requiring good communication and coordination, rather than hard CORE content, doable by only a very small percentage of players. Hard core players are a very small percentage of the player base.
But you miss the point. This thread is about how effective Anet’s testing is, not how capable you and your 5up3r l33t buds are.