Guild Wars 2 Forum

A serious examination of S2 MMR(Math Inside)

Glicko does not give consistent results when used to rate participants in an unbalanced pool.

For a simple example:
Let’s consider a round robin chess tournament played infinitely many times between 2 average players and 18 grandmasters. An average player will never win against a grandmaster and will win half the time against the other average player. (Hint: This is exactly like being in the bottom 10% or so of a gem tier.) For an accurate result, the average players’ ratings would vary slightly but stay in the same general area. Instead, their ratings will trend down because every loss carries a minimum rating deduction. Eventually both average players would have a rating equivalent to zero.
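To make the drift concrete, here is a minimal sketch of the tournament from the average player's side. It uses the plain Elo expected-score formula rather than full Glicko, holds the grandmasters at a fixed rating, and treats the game against the other average player as a wash on average. The function names, the K-factor, the starting ratings, and the `min_loss` floor are all assumptions for illustration, not values from any real rating system.

```python
def expected_score(rating, opponent):
    """Standard Elo expected score for `rating` against `opponent`."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def average_player_rating(rounds, min_loss=1.0, k=32.0,
                          start=1500.0, gm_rating=2500.0, n_gm=18):
    """Rating of one average player after `rounds` round robins.

    Assumptions (hypothetical, for illustration only):
    - the player loses every game against each of the `n_gm` grandmasters;
    - every loss costs at least `min_loss` rating points;
    - the game against the other average player nets out to zero on average,
      so it is left out;
    - grandmaster ratings stay fixed, and ratings cannot go below zero.
    """
    rating = start
    for _ in range(rounds):
        for _ in range(n_gm):
            loss = max(k * expected_score(rating, gm_rating), min_loss)
            rating = max(0.0, rating - loss)
    return rating


if __name__ == "__main__":
    for rounds in (1, 10, 100):
        print(rounds, round(average_player_rating(rounds), 1))
```

Setting `min_loss=0.0` shows the same downward drift without the floor, only much slower, since each loss to a far stronger opponent costs a small but nonzero amount.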

ELO and GLICKO, when used in other games, serve as both a skill rating and a matchmaking tool. This generates consistent results.

As many forum posters have noted, other games successfully use ELO/GLICKO, and some have even tried to argue that these systems are invalidated by the team or class based balance of GW2.

If we look at the previous example under rating-based matchmaking, the average players would never play against the grandmasters and would maintain an appropriate rating.

Decoupling MMR and matchmaking generates this unbalanced pool. (Other games do not decouple the two.)

In a fair system for player rating (Note: I didn’t say fair matchmaking, just fair player rating) you cannot have an unbalanced pool of potential matches; otherwise player ratings will trend to the extremes. As in the previous example, the grandmasters would trend upward and the average players downward. This would give every player in the tournament an inaccurate rating.

Amber is irrelevant, and with sufficient play sapphire and emerald will become irrelevant as well. This groups a large majority of the player base in ruby.

Even with a very small win chance, the lack of pip loss in amber means that with enough games played a player will inevitably leave amber. The tier loss in emerald and sapphire mitigates this slightly but is undermined by the stop loss pip and the streak pip. It only takes three wins to complete a 5 pip tier (3 win pips, 1 stop loss pip, 1 streak pip). That tier cannot be lost, and even with a low win chance 3 wins is far more statistically likely than 5.
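The "3 wins versus 5 wins" comparison can be checked with a simple binomial calculation. This is a rough sketch under stated assumptions: games are treated as independent with a fixed win chance, and the exact conditions for earning the stop loss pip and the streak pip are ignored, so it only compares how likely it is to collect a given number of wins within a fixed number of games. The function name and the sample numbers are hypothetical.

```python
from math import comb


def prob_at_least(wins_needed, games, win_chance):
    """Probability of at least `wins_needed` wins in `games` independent games.

    Assumes a fixed `win_chance` per game (a simplification: real win chance
    shifts with matchmaking, and pip rules depend on the order of results).
    """
    return sum(
        comb(games, k) * win_chance ** k * (1.0 - win_chance) ** (games - k)
        for k in range(wins_needed, games + 1)
    )


if __name__ == "__main__":
    win_chance = 0.3  # hypothetical low win chance
    for games in (10, 20, 50):
        p3 = prob_at_least(3, games, win_chance)
        p5 = prob_at_least(5, games, win_chance)
        print(games, round(p3, 3), round(p5, 3))
```

The same function also illustrates the amber case: with no pip loss, the chance of eventually collecting any fixed number of wins approaches 1 as `games` grows, as long as `win_chance` is above zero.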

Players with a very low win chance may remain in emerald or sapphire for a long period of time but the odds of this decrease with games played. So given enough time all but the unluckiest/worst win chance players will leave these tiers.

This doesn’t give us a ruby tier that average players reach and stay in. It makes ruby the inevitable destination of every player with sufficient time to play.

…continued….

in PvP

Posted by: Tenebria.7239

Part 2 – Why ELO Hell isn’t real, but Ruby Hell is

Many players believe (correctly or incorrectly) that their MMR doesn’t accurately represent the ‘true’ MMR of their skill. (AKA ELO Hell)
In a game using MMR for matchmaking you have two numbers:
x - the average rating of that player’s team.
y - the average rating of the other team.
So if z is the actual skill of that player, then he should generally win games where x + (z - x)/5 > y. (The player is one of five on the team, so the gap between his true skill and the team average only moves the team’s real strength by about a fifth of that gap.)
So if x and y are held close, eventually that player should win enough that his/her rating moves closer to z. (Hint: most other competitive online games try to hold x and y close) (Hint: this math also works if the player is worse than they think)

The issue occurs when x and y are not held close. As noted above, you wind up reducing the difference between a player’s ‘true MMR’ and ‘system MMR’ by a factor of 5 or so (this changes depending on what type of system is used to generate the average, but 5 suffices for an example). If z is reasonably average or slightly better than average for the tier (what happens if it isn’t I’ll cover later) and x is in the bottom percentage of a tier, then x + (z - x)/5 is also likely in the bottom half of the tier and quite possibly in a still lower fraction of that tier.
So that player is still going to lose a majority of his matches, and his system MMR will continue to decline even if his true MMR was higher. (Note: this wouldn’t happen if the player could exit the matchmaking tier, but he can’t because of safeguards.)
As that system MMR declines, teammate quality will also decline under the current system.
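The inequality x + (z - x)/5 > y can be worked through with a few numbers. The sketch below is only an illustration: the function names and the sample ratings are made up, and it assumes a five-player team where the player's true skill z replaces one average slot.

```python
def effective_team_rating(x, z, team_size=5):
    """Team strength once one player's true skill `z` is accounted for.

    `x` is the team's average system rating. Assumes (as in the post) that
    the player's own system rating is roughly equal to `x`.
    """
    return x + (z - x) / team_size


def should_generally_win(x, y, z, team_size=5):
    """True when the adjusted team strength exceeds the enemy average `y`."""
    return effective_team_rating(x, z, team_size) > y


if __name__ == "__main__":
    # Hypothetical numbers: team average 1000, player's true skill 1500.
    print(effective_team_rating(1000, 1500))       # 500-point gap shrinks to 100
    print(should_generally_win(1000, 1050, 1500))  # enemy held close to x
    print(should_generally_win(1000, 1200, 1500))  # enemy not held close to x
```

With the enemy average held close to x, the 500-point skill gap is enough to tip the match. Once y sits 200 points above x, the same player is expected to lose despite being far better than his rating.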

Why ‘get gud’ or ‘I have three legendaries on ftp accounts’ works
If we look at the ‘true MMR’ z, obviously there is a point where z is high enough that x+(z-x)/5 is above the average for a tier. So a player can be good enough to avoid/counter the ‘death spiral’ but a slightly better than average player won’t be able to.
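Rearranging x + (z - x)/5 > y gives z > x + 5(y - x), which shows how quickly the required skill grows with the rating gap. A small sketch of that rearrangement follows; the function name and the example ratings are hypothetical.

```python
def required_true_skill(x, y, team_size=5):
    """True skill `z` at which x + (z - x)/team_size equals `y`.

    A player needs to be above this value to be expected to win against
    an enemy team averaging `y` while on a team averaging `x`.
    """
    return x + team_size * (y - x)


if __name__ == "__main__":
    # Hypothetical: own team averages 1000, enemy teams average 1100.
    print(required_true_skill(1000, 1100))
```

Every point of average rating gap between the teams has to be covered five times over by the one player's true skill, which is why a slightly better than average player cannot counter the spiral but a much stronger one can.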

Part 3 – Why the volatility won’t go away
Somewhat opinion, somewhat fact

We have a 15 pip range for matchmaking and the very best of the best are somewhere near the top. (Note: ignoring the fact that some of them are throwing matches to be at low legendary for faster queues)
The top is artificially capped because prestige legendary does not count for matchmaking. This creates downward pressure that pushes more players into the diamond/ruby range. I’m not saying that’s a bad thing, or that legendary shouldn’t be prestigious, but it does mean a majority of players are grouped into the limited number of pips available in diamond and ruby and chunked together by 15 pip ranges.
This lacks the granularity and accuracy of an actual rating system even if only pips were used to match.

-Tene

TLDR: Matchmaking and rating must be coupled in order for MMR to be accurate.
TLDR2: ‘GET GUD’ enough and you can battle out of it, but you’ll have to be far better than you would have to be if it were accurate.
P.S. I personally blame this on a reward system that tries to combine skill based and grind based progression and does it poorly.

(edited 2016-03-18 18:15:20 by Tenebria.7239)

Posted by: Ben Ken Jamin.9728

I agree. They need to find a way better balance.

Posted by: Raek.8504

Yep, finally someone who is not a potato. As for “GET GUD”, in some cases it’s the equivalent of being “slightly” better than an ESL player in terms of carrying.

(edited 2016-03-20 18:25:04 by Raek.8504)

Posted by: Deniara Devious.3948

Makes a lot of sense. ArenaNet also made a big mistake by changing the pip gain/loss system from Season 1. I think the final score should always matter. In other words, if you manage a pretty close loss, say 400-500, against a higher MMR team, you should not lose a pip and might even gain one, while that high MMR team loses a pip.

The current system is bad because 1-500 and 499-500 losses are treated equally when it comes to pip gain/loss. Too many players stop playing once their team is losing by a large margin, so even more games become blowouts. The abundance of blowout games drives off a large percentage of competitive gamers.

Deniara / Ayna – I want the original WvWvW maps back – Desolation [EU]