Statistical evaluation of matchmaking

in PvP

Posted by: capitalizm.2046

I’m throwing some ideas out for testing match-making policies without putting thinking, feeling human beings through them. Maybe these ideas help someone at ArenaNet, maybe not, maybe that someone knows them already. In any case, here they are.

Player experience in structured PvP can be quantified by statistical metrics. For example, the average experience is a balanced, neutral mood if

  • losing streaks are short,
  • the win-rate of most players is close to 50%, and
  • the variance of win-rate is small.

Conversely, if win-rate fluctuates a lot and most losing streaks are long, then people perceive match-making to be unfair. To evaluate a previously implemented match-making policy, ArenaNet can compute the relevant metrics from matches in the past. But it is hard to gather the metrics for new policies without trying them out. Anecdotal evidence suggests that the matchmaking policy of season 2 is not performing too well on variance of win-rate across time and on the average length of losing streaks.
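To make the metrics above concrete, here is a minimal sketch of how they could be computed from per-player match histories. Everything in it is an assumption for illustration: the function names, the data layout (a list of win/loss booleans per player), and the example data are made up, and the variance is taken across players rather than across time.

```python
from statistics import mean, pvariance

def longest_losing_streak(results):
    """results: booleans in chronological order, True = win."""
    longest = current = 0
    for won in results:
        current = 0 if won else current + 1
        longest = max(longest, current)
    return longest

def win_rate(results):
    """Fraction of matches won."""
    return sum(results) / len(results)

def summarize(histories):
    """histories: dict mapping a player name to that player's results."""
    rates = [win_rate(r) for r in histories.values()]
    streaks = [longest_losing_streak(r) for r in histories.values()]
    return {
        "mean_win_rate": mean(rates),
        "win_rate_variance": pvariance(rates),  # spread across players
        "mean_longest_losing_streak": mean(streaks),
    }

# Made-up example data, not real match history.
histories = {
    "alice": [True, False, True, False],
    "bob": [False, False, False, True],
}
print(summarize(histories))
```

The same `summarize` call would work on simulated histories, which is what makes it possible to compare an old policy and a candidate policy on equal terms.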

The general idea is: Estimate the metrics of new policies by simulation.

  1. Build a model of player skill that predicts the outcomes of matches with reasonable accuracy.
  2. Simulate matches with the player-skill model and some candidate match-making policy.
  3. Compute the metrics from the simulation result.
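The three steps above can be sketched roughly as follows. This is only a toy under stated assumptions: the logistic skill model, the two candidate policies (`random_policy`, `alternating_policy`), and the population of 40 players are all invented for illustration, and the policies get to see true skill, which a real matchmaker cannot.

```python
import math
import random
from statistics import pvariance

def win_probability(team_a, team_b, skill):
    """Step 1 (assumed model): logistic curve on the gap in mean team skill."""
    gap = (sum(skill[p] for p in team_a) / len(team_a)
           - sum(skill[p] for p in team_b) / len(team_b))
    return 1.0 / (1.0 + math.exp(-gap))

def random_policy(queue, skill, rng):
    """Candidate policy A: shuffle the queue and split it in half."""
    pool = list(queue)
    rng.shuffle(pool)
    half = len(pool) // 2
    return pool[:half], pool[half:]

def alternating_policy(queue, skill, rng):
    """Candidate policy B: rank by skill and deal players out alternately."""
    ranked = sorted(queue, key=lambda p: skill[p], reverse=True)
    return ranked[0::2], ranked[1::2]

def simulate(policy, skill, n_matches, team_size, seed=0):
    """Step 2: draw queues, form teams with the policy, roll the outcomes."""
    rng = random.Random(seed)
    names = list(skill)
    histories = {p: [] for p in names}
    for _ in range(n_matches):
        queue = rng.sample(names, 2 * team_size)
        team_a, team_b = policy(queue, skill, rng)
        a_wins = rng.random() < win_probability(team_a, team_b, skill)
        for p in team_a:
            histories[p].append(a_wins)
        for p in team_b:
            histories[p].append(not a_wins)
    return histories

def win_rate_variance(histories):
    """Step 3: one example metric, the spread of win-rate across players."""
    rates = [sum(h) / len(h) for h in histories.values() if h]
    return pvariance(rates)

rng = random.Random(42)
skill = {f"player{i}": rng.gauss(0.0, 1.0) for i in range(40)}  # made-up population
for policy in (random_policy, alternating_policy):
    histories = simulate(policy, skill, n_matches=2000, team_size=5)
    print(policy.__name__, round(win_rate_variance(histories), 4))
```

Swapping in a better skill model or a real policy only means replacing `win_probability` or the policy function; the simulation loop and the metrics stay the same.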

(1) is the most difficult step because player skill is hard for the game to observe directly. Here’s a simple model: the team with the higher average win-rate is predicted to win. One can definitely build more sophisticated models using tools like hidden Markov models or Markov decision processes. Without the real data, it’s hard to say how accurate a model is; perhaps the average win-rate (equivalently the sum, when both teams are the same size) is a good enough predictor, who knows. To measure accuracy, one could build the model from half of the season 1 data and test it on the other half.
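Here is a minimal sketch of that simple model and the half-and-half accuracy check. The match format (two tuples of player names plus a flag saying whether the first team won) and the tiny data set are assumptions made up for the example; real season data would obviously look different.

```python
from collections import defaultdict

def fit_win_rates(matches):
    """Estimate each player's win-rate from (team_a, team_b, a_won) records."""
    wins = defaultdict(int)
    games = defaultdict(int)
    for team_a, team_b, a_won in matches:
        for p in team_a:
            games[p] += 1
            wins[p] += a_won
        for p in team_b:
            games[p] += 1
            wins[p] += not a_won
    return {p: wins[p] / games[p] for p in games}

def predict_a_wins(team_a, team_b, rates, default=0.5):
    """Predict a win for team A if its average win-rate is strictly higher.
    Unseen players get the default rate; ties count as a win for team B."""
    def average(team):
        return sum(rates.get(p, default) for p in team) / len(team)
    return average(team_a) > average(team_b)

def accuracy(matches, rates):
    """Fraction of matches whose outcome the model predicts correctly."""
    correct = sum(predict_a_wins(a, b, rates) == a_won for a, b, a_won in matches)
    return correct / len(matches)

# Made-up data: the first half trains the model, the second half tests it.
train = [
    (("a", "b"), ("c", "d"), True),
    (("a", "c"), ("b", "d"), True),
]
test = [
    (("a", "d"), ("b", "c"), True),
    (("d", "b"), ("a", "c"), False),
]
rates = fit_win_rates(train)
print(rates)
print(accuracy(test, rates))
```

With real data the split should probably be by time (earlier matches for fitting, later ones for testing), so the model is never scored on matches it has already seen.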

Parties queueing in together is a tricky part of (2). One can handle it with various degrees of sophistication.

  • Use previously formed parties.
  • Build a statistical model for parties based on previously formed parties.
  • Build a statistical model for parties based on statistical models for friends, guilds, time zones and so on.

Players gaming the system present a confounding factor. Again, one would have to look at the actual data to know whether these players make a statistically significant impact. If they do, assumptions must be changed to account for such behavior. For example, instead of “players play to win all the time”, the skill model may assume “players play to win only if their MMR is low enough”. (That is not the case for season 2, of course; the current match-making policy seems especially designed to make intentional losing unprofitable.)

Posted by: Barzhal.2640

While I think what you propose would be great, I think you seriously overestimate Anet’s ability and inclination to do such a thing.

Posted by: Sorel.4870

Player experience in structured PvP can be quantified by statistical metrics. For example, the average experience is a balanced, neutral mood if

  • losing streaks are short,
  • the win-rate of most players is close to 50%, and
  • the variance of win-rate is small.

Conversely, if win-rate fluctuates a lot and most losing streaks are long, then people perceive match-making to be unfair.

Unfortunately, even this simple claim is not true. I tend to prefer the type of match-making you describe, and I’m sure everyone complaining on the PvP subforum these days prefers it as well, but history has shown us that the player base is split on the issue.

In season 2, the match-maker tries to give you team mates based on your MMR, and foes based on your division. Which means if your MMR is higher than the average in your division, you’ll be on a winning streak. Conversely, if your MMR is lower than average, you’ll be on a losing streak.

Last season, Anet used the Glicko-2 algorithm in order to give fair matches to everyone. And, to a certain point, it worked. But since the league is a progression system, you can’t have fair games from the beginning: how could you reach higher divisions then? That’s when people start abusing the system. You had dozens of players complaining in these very forums about being stuck with a 50% win ratio.

Other games with league ladders, Hearthstone for example, use an unfair match-making algorithm so that better players win a lot more than 50% of their games and climb faster. Both systems have merit, but you will always find people to complain.

One possible way to improve the current system would be to not have everyone starting from amber again in the next season, but rather drop only a division. This way, it would reduce the “traffic jam” effect of early season.

Posted by: Torafugu.1087

It’s possible to run a computer simulation of the matchmaking system to estimate its capabilities.

First, create models of players with hidden, randomly generated true skill ratings and variance.
Run these models through a series of simulated “matches” where the chance of winning is estimated using the true ratings + random numbers according to true variance.
Compare the rating estimated by the MM with the true values and determine the accuracy of the matchmaking system.
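A rough sketch of that procedure, under loud assumptions: matches are 1v1, opponents are paired at random rather than by the matchmaker being tested, and a plain Elo update stands in for the real rating system (this is not Glicko-2 and not whatever Anet actually runs). All the numbers (1500 mean, K of 32, and so on) are placeholders.

```python
import random

def expected_score(r_a, r_b):
    """Standard Elo expected score for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def pearson(xs, ys):
    """Pearson correlation, written out to avoid depending on newer stdlib."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def run(n_players=50, n_matches=5000, k=32.0, seed=1):
    rng = random.Random(seed)
    # Hidden ground truth: a true rating and a personal performance spread.
    true_skill = [rng.gauss(1500, 200) for _ in range(n_players)]
    true_sd = [rng.uniform(50, 150) for _ in range(n_players)]
    estimate = [1500.0] * n_players  # what the rating system believes
    for _ in range(n_matches):
        a, b = rng.sample(range(n_players), 2)
        # Each player's performance today = true rating + personal noise.
        perf_a = rng.gauss(true_skill[a], true_sd[a])
        perf_b = rng.gauss(true_skill[b], true_sd[b])
        score_a = 1.0 if perf_a > perf_b else 0.0
        delta = k * (score_a - expected_score(estimate[a], estimate[b]))
        estimate[a] += delta
        estimate[b] -= delta
    return true_skill, estimate

true_skill, estimate = run()
print("correlation between true and estimated rating:",
      round(pearson(true_skill, estimate), 3))
```

A correlation close to 1 would mean the rating system recovers the hidden ordering well; rerunning with fewer matches per player shows how long it takes to get there.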

Posted by: Crinn.7864

While I think what you propose would be great, I think you seriously overestimate Anet’s ability and inclination to do such a thing.

I feel confident in saying that Anet has likely already done modeling similar to what the OP is talking about. Heck, Anet probably has far more advanced modeling and metrics than what is outlined above.

Players seriously underestimate the amount of metrics developers have.

Also, simulating player skill, especially at high levels of play, isn’t actually as complex as people make it out to be. Figuring out the optimal play for a given scenario is fairly straightforward. Most of player skill is actually in a person’s ability to process and react to the situation.

The problem with metrics and simulations is that they only give you the typical/average result. However, the average result is not the same as the live server result.

Example: Let’s say we have a hypothetical class that half of the time completely dominates its opponent, and the other half of the time gets destroyed in seconds.
Simulations and metrics would say that the class is balanced, since it has roughly 50/50 success.
However, a competitive player would consider the class to be terrible, because having a 50/50 chance to fall on your face is unacceptable for serious competition. It’s simply too risky.
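To put numbers on the example above: two classes can share the same average and still feel completely different, which only shows up if the spread is reported next to the mean. The score margins below are made up purely for illustration.

```python
from statistics import mean, pstdev

# Hypothetical score margins per match (positive = won by that much).
swingy_class = [500, -500, 500, -500]  # dominates or gets destroyed
steady_class = [50, -50, 50, -50]      # always a close game

for name, margins in (("swingy", swingy_class), ("steady", steady_class)):
    wins = sum(m > 0 for m in margins) / len(margins)
    print(name, "win-rate:", wins, "mean:", mean(margins), "spread:", pstdev(margins))
```

Both classes come out at a 50% win-rate and a mean margin of zero, but the spread of the swingy one is ten times larger.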

Sanity is for the weak minded.

Posted by: Shikago.4915

I think the idea to split up the pips was brilliant. While I did not make it too far last season, this season I am just going to play WoW until April. I just lost my 10th PvP match in a row. That has never happened before. My time is worth more than the aggravation that has now been plopped on my plate. Just annoyed as all get out.

Posted by: Jourdelune.7456

Any skill-based system is better than a grind-based system for serious PvP players.

It’s important to cater to both the grinders and the serious PvP players. Actually, I think Season 2 alienates both camps.

So, we need RANK based on skill and REWARDS based on GRIND.

Merging BOTH is stupid and is alienating BOTH camps.

Remember that working on your MMR was somewhat GRINDY. You are not rewarded for playing 2k games, but your MMR should be higher after 2k games than it was after 15. But the grind didn’t give you anything.

The LEGENDARY ITEM GRIND is NEEDED, but it must not compromise the SKILL-based rank system (and match-making). Everyone wants FAIR matches for their LEVEL of play.

Dal Aï Lhama (Tempest), Dal Lahu Akbar (DH), Lord Dhal of Dharma (Scrapper) 12k+ spvp games.
Former Team Captain of ggwp (ESL weekly), GLHF (AG), MIST[CORE] spvp alliance guild.
https://www.reddit.com/r/GuildWars2PvPTeams/

Posted by: Aenesthesia.1697

What you people don’t seem to understand is that your vision of a fair, competitive league is anything but fair.

If I am a pro, it’s VERY unfair that I have to face only pros to get out of the trash divisions. If I am better than the average player in my division, I should win most of the matches, because, heck, I am better! And if I am worse than the average player in my division, then I will lose most of the matches, because… am I worse or not?

What we need is better match-making for unranked. That way, people who want fair matches at every level can play unranked and leave the ranked people alone. And no, unranked shouldn’t have access to the same rewards as ranked. I will never get the legendary PvP wings and I couldn’t care less. But there has to be a reward for better-than-average players.

Posted by: Torafugu.1087

1v1 best of 5 would be the fairest and most skill-dependent game mode.
Plus it’s the most manly. Two men enter, one leaves.