Potential Matchups 6-21

in Match-ups

Posted by: Xerol.1578

I had my computer roll a bunch of virtual dice and these came out. They’re based off the current scores for this week about an hour before this post, so they may slide around a bit before reset, but it’s late enough in the week that the score ratios likely won’t change much.

Some notes on the image and methodology:
-Doesn’t predict what color anyone will be. So every permutation of colors for 3 servers on the same tier is grouped together (the alternative had me getting 1999983 unique matchups out of 2000000 rolls)
-The rankings/tiers given are the result of the randomized matchups. The graphs show the likelihood of a server being “ranked” at a given rank after randomization.
-Right side shows the 27 most likely matchups. I have every matchup that came out saved in a file, broken down by server, if anyone wants to see the results for their server let me know.
-Two million rolls is about as much as I can do in a single run, and it’s not currently set up to easily aggregate multiple runs. I may, in the future, do this, so the results can become more averaged.
-NA Only for now, I can do EU too if there’s demand.

Some general notes on matchups:
-The average deviation is decreasing. After last week, the average was 183.3531; this week it’s 177.2590 (estimated). This makes servers less likely to get matchups far outside their ranking.
-The decreased deviation is making the T4-T5 gap more significant. Basically there are two supertiers on NA right now, 1-12 and 13-24. This is most significant for servers close to the gap: Ehmry Bay (est. Rank 12 after this week) has less than a 10% chance of being placed in a matchup with T5 and lower servers, and Borlis Pass (est. Rank 13) has about a 10% chance of matching with T4 and above.
-Running the numbers from the end of last week, the T3 matchup for this week had a 0.4% chance of happening. The exact same matchup could’ve happened as T2 with a 0.8% chance, or even as T4 with a 0.002% chance. But, if the dice are feeling saucy, basically anything can happen.
-Well, not anything. Servers are still limited by their volatility in how far they can “move” via randomization. Sanctum of Rall (est. T1 after this week, ranking 2194.2, deviation 176.4) could not possibly roll lower than 2017.7, while Sea of Sorrows couldn’t roll higher than 1982.6 (est. T7, 1797.5, 185.1) so they could never actually match up.

Edit: Updated with corrected (+40) deviations and both NA and EU graphs!

Edit 2: I think I’ve worked out all the bugs. Fixed charts attached.

Attachments:

(edited by Xerol.1578)

Posted by: Nubu.6148

Pls Eu too ! ^^

Nubú -Engie -Asura-
BNF-Bitte nicht füttern-
Smallscale <3 !

Posted by: warboss.1362

EU please
thank you

Posted by: Dayra.7405

Sanctum of Rall (est. T1 after this week, ranking 2194.2, deviation 176.4) could not possibly roll lower than 2017.7, while Sea of Sorrows couldn’t roll higher than 1982.6 (est. T7, 1797.5, 185.1) so they could never actually match up.

That’s wrong and it’s in several of your possible matches as well, e.g.
SoR – BG – SoS 0.37%
SoR – TC – SoS 0.09%
SoR – DB – SoS 0.05%
SoR – JQ – SoS 0.04%

Here is a simple example showing how it can happen:
https://forum-en.gw2archive.eu/forum/wuv/wuv/Server-Match-up-is-TERRIBLE/2245792

The basic idea is: they do not have to overlap; it just has to be possible that at most one server is left in between them (and everyone else is able to “roll out”). And of course the match alignment must fit.

Ceterum censeo SFR esse delendam!

(edited by Dayra.7405)

Posted by: Le Rooster.8715

Sanctum of Rall (est. T1 after this week, ranking 2194.2, deviation 176.4) could not possibly roll lower than 2017.7, while Sea of Sorrows couldn’t roll higher than 1982.6 (est. T7, 1797.5, 185.1) so they could never actually match up.

That’s wrong and it’s in several of your possible matches as well, e.g.
SoR – BG – SoS 0.37%
SoR – TC – SoS 0.09%
SoR – DB – SoS 0.05%
SoR – JQ – SoS 0.04%

With Sea of Sorrows’s luck SoR BG SoS will happen.

Roosters Inc-Team Shatter [TS] Commander
Sea of Sorrows http://www.gw2sos.com/index.php

Posted by: Xerol.1578

Just realised why that’s wrong. SoS couldn’t get randranked higher than SoR, but they could be in the same “bucket”. Actually figuring out which matchups can’t happen is a bit of a pain since it also depends on what the other 22 servers roll. Still, it’s highly unlikely.

Working on EU right now.

Posted by: jaimy.4108

Thanks for getting one for EU! Very curious =)

VoTF

Posted by: The Holy Eldar.3624

Working on EU right now.

Awesome!

Acan Stoneheart
Immortal Kingdom [KING] – Officer
Second Law [Scnd] Filthy Casual

Posted by: Grieve.1432

So does anyone remember when ANet added 600 rating (300 each) to two of the tier 8 NA servers as a “temporary” fix? I wonder: if they were to take it out from the top tier NA, would that shuffle the ratings down enough to help close the gap between tier 4 and tier 5, making those matchups more likely, as they probably should be?

South of Heaven [SoH] – Crystal Desert
Grieve Logdan (Human War) | Rifte Torin (Charr Thief)
Feylicia Logdan (Human Mes) | Elias Foralli (Asura Guard)

Posted by: Xerol.1578

Here’s EU. Enjoy.

Attachments:

Posted by: Feed Me Change.6528

Sanctum of Rall (est. T1 after this week, ranking 2194.2, deviation 176.4) could not possibly roll lower than 2017.7, while Sea of Sorrows couldn’t roll higher than 1982.6 (est. T7, 1797.5, 185.1) so they could never actually match up.

That’s wrong and it’s in several of your possible matches as well, e.g.
SoR – BG – SoS 0.37%
SoR – TC – SoS 0.09%
SoR – DB – SoS 0.05%
SoR – JQ – SoS 0.04%

With Sea of Sorrows’s luck SoR BG SoS will happen.

When we roll that.. I’m going to kick you out of SoS..

NSP>ET>SoS>BG>ET>SoS>JQ>SoS>Mag>JQ
My fun laughs at your server pride.

Posted by: Snowreap.5174

moar data! I simulated 100 million matchups for NA and EU based on the anticipated ratings that would result from the current scores.

for each region there are 3 files. #1 shows the probability of getting a particular server as one of your two opponents. #2 shows the probability of getting any particular pair of opponents (with different color assignments considered the same matchup). #3 shows the probability of getting any particular matchup (with different color assignments considered different matchups).

if you’d like to run these numbers yourself for larger or smaller sample sizes, the software used has been posted here:
https://forum-en.gw2archive.eu/forum/community/api/Simple-C-Example-Rating-Calculation/2154273

-ken

The Purge [PURG] – Ehmry Bay

Posted by: Fuzzion.2504

Awesome job mate!

Fuzzionx [SF]
Guest member of [LOVE]
JQ official Prime Minister

Posted by: renmei.3102

Am I the only person who scrolled down to SoS? Hopefully they finally get an even match-up next week :p

Posted by: Snowreap.5174

Xerol, my numbers are very different from yours. would you mind posting the code you used to calculate base ratings and the code you used to calculate matchup ratings?

I’m thinking that either your code or my code (or possibly both) are wrong and I’d like to track down why so that we both produce similar results.

attached are the base ratings I used for my matchup calculations; they are predicted ratings based on live scores from a couple of hours ago.

-ken

The Purge [PURG] – Ehmry Bay

(edited by Snowreap.5174)

Posted by: Feed Me Change.6528

Am I the only person who scrolled down to SoS? Hopefully they finally get an even match-up next week :p

Nope. Seeing TC/DB as our top match-up makes me wanna cry, a full 1.2% higher than any other.

EDIT: Anyone else find it weird that on the NA side, #1v#2v#3 has almost a 40% chance of happening, while EU (same matchup) is only at 9%? Is this because BBay moved up 3 spots this week? And 5 servers (#3 to #7) are within 100 rating of each other?

NSP>ET>SoS>BG>ET>SoS>JQ>SoS>Mag>JQ
My fun laughs at your server pride.

(edited by Feed Me Change.6528)

Posted by: Xerol.1578

Xerol, my numbers are very different from yours. would you mind posting the code you used to calculate base ratings and the code you used to calculate matchup ratings?

I’m thinking that either your code or my code (or possibly both) are wrong and I’d like to track down why so that we both produce similar results.

attached are the base ratings I used for my matchup calculations; they are predicted ratings based on live scores from a couple of hours ago.

-ken

My code is terribly written and barely even readable by myself so I’ll just explain the process.

I used numbers from millenium with my single match calculator (which shows the new deviation/volatility scores) to get the new ratings and deviation for the next week. The ratings I got agreed with what millenium had within 0.001, and I’ve checked this tool against posted ratings with past scores to verify that it is accurate. For the record, here’s the resulting data I had from the NA servers as of 7am ET today:


Rank,Server,Rating,Deviation,Volatility,Current tier
1,Sanctum of Rall,2194.217,176.434,0.741,1
2,Blackgate,2181.708,170.258,0.736,1
3,Jade Quarry,2117.046,170.594,0.735,1
4,Tarnished Coast,2013.035,182.921,0.739,2
5,Dragonbrand,1956.766,178.986,0.741,3
6,Fort Aspenwood,1875.687,178.834,0.738,3
7,Sea of Sorrows,1797.509,185.074,0.756,2
8,Maguuma,1769.354,182.97,0.752,3
9,Yak’s Bend,1690.237,170.607,0.737,4
10,Kaineng,1688.495,172.55,0.767,4
11,Crystal Desert,1681.059,179.546,0.747,2
12,Ehmry Bay,1655.469,181.671,0.764,4
13,Borlis Pass,1395.871,179.659,0.742,5
14,Stormbluff Isle,1394.844,186.433,0.764,5
15,Anvil Rock,1350.646,176.75,0.737,5
16,Darkhaven,1232.168,172.042,0.743,6
17,Isle of Janthir,1217.504,179.222,0.763,6
18,Gate of Madness,1159.356,175.038,0.74,7
19,Northern Shiverpeaks,1155.672,173.548,0.735,6
20,Sorrow’s Furnace,1113.432,175.11,0.765,7
21,Henge of Denravi,1087.719,177.439,0.74,8
22,Devona’s Rest,1080.806,176.863,0.746,7
23,Ferguson’s Crossing,894.601,174.808,0.749,8
24,Eredon Terrace,851.438,176.86,0.747,8

Then, for 2 million iterations, my program generates a random number from -1 to 1, multiplies it by the (new!) deviation, and adds it to the server’s (new!) rating. (And this may be where I’m wrong, I don’t think they’ve divulged the exact method of randomizing ratings; they only gave this method as an example.) This gives me 2 million lists of servers, which are then each sorted and then I start to derive tiers and matchups from those.
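The roll-and-sort step described above might look roughly like the sketch below. This is not the original program: the function names, the arbitrary seed, and the use of only the top six rows of the table are assumptions made for illustration, and it follows the rule as stated here (a uniform roll in [-1, 1] times the deviation, with nothing added).

```python
import random

# (name, rating, deviation) for the top six NA rows of the table above
SERVERS = [
    ("Sanctum of Rall", 2194.217, 176.434),
    ("Blackgate", 2181.708, 170.258),
    ("Jade Quarry", 2117.046, 170.594),
    ("Tarnished Coast", 2013.035, 182.921),
    ("Dragonbrand", 1956.766, 178.986),
    ("Fort Aspenwood", 1875.687, 178.834),
]

def roll_once(servers, rng):
    """Return server names ordered by randomized rating, best first."""
    rolled = [(rating + rng.uniform(-1.0, 1.0) * dev, name)
              for name, rating, dev in servers]
    rolled.sort(reverse=True)
    return [name for _, name in rolled]

def tiers(order, size=3):
    """Split a ranked list into consecutive tiers of `size` servers."""
    return [order[i:i + size] for i in range(0, len(order), size)]

rng = random.Random(621)  # arbitrary seed, only so a run is repeatable
for tier_no, members in enumerate(tiers(roll_once(SERVERS, rng)), start=1):
    print(tier_no, members)
```

Repeating `roll_once` a few million times and counting the resulting tiers gives the kind of frequencies shown in the charts.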

For each roll, and for each server, I determine which other servers are in the same tier. For memory efficiency I just use a bitfield for this, so if a particular roll comes up with maguuma (rank 8) in 7th, DB (rank 6) in 8th, and kaineng (rank 10) in 9th (after the random roll), the bitfield would look like 000000000000001010100000_2 (672 decimal). A matchup where maguuma rolled 8th and DB rolled 7th would look EXACTLY THE SAME, so it’s color-agnostic. This gives me a unique number for a particular matchup on a particular tier, which is why my results will have the same matchup showing up in different tiers.
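The bitfield encoding can be reproduced in a few lines. This is a minimal sketch assuming each server’s pre-roll rank r sets bit r-1, which is what the 672 example implies; `matchup_mask` is a name made up here.

```python
def matchup_mask(ranks):
    """Encode a set of 1-based base ranks as a bitfield (rank r -> bit r-1)."""
    mask = 0
    for r in ranks:
        mask |= 1 << (r - 1)
    return mask

mask = matchup_mask([8, 6, 10])   # Maguuma, Dragonbrand, Kaineng
print(mask)                       # 672
print(format(mask, "024b"))       # 000000000000001010100000
```

Because OR-ing bits is order-independent, any permutation of the same three servers gives the same number, which is what makes the encoding color-agnostic.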

For each roll, these numbers are stored, and later counted up to get the most common matchups for each server. To easily pull out the matchups by server for making the graphs, each server stores its matchup bitfield for each roll, which means a bit of duplication, which is where my application gets memory-hungry and why I can only do about 2.4 million rolls at most at a time. I suppose I could discard most of the data after doing a particular roll but I’d probably have to rewrite the program from scratch to do this, since it’s organized in the stepwise fashion detailed above.

The variation could just be because we started with different numbers, or because we used different randomization methods, or because my sample size is much smaller.

(edited by Xerol.1578)

Posted by: Murderous Clown.9723

Are you adding 40 to the deviation before multiplying it by the random number?

Jimibabob – Valkyries of Dwayna [VoD]
Piken Square

Posted by: Xerol.1578

Somehow I completely overlooked that in their explanation post. Rerunning things now.

Of course there’s no indication they’re actually using 1.0 and 40 as the parameters, they just gave those as an example. If a dev could confirm or deny it would be very helpful.

edit: Fixed & Updated. See first post.

(edited by Xerol.1578)

Posted by: Rain.8253

Xerol, my numbers are very different from yours. would you mind posting the code you used to calculate base ratings and the code you used to calculate matchup ratings?

I’m thinking that either your code or my code (or possibly both) are wrong and I’d like to track down why so that we both produce similar results.

attached are the base ratings I used for my matchup calculations; they are predicted ratings based on live scores from a couple of hours ago.

-ken

Ken,

My numbers match yours. Just in case you were worried about a mistake in your program.

~Rain

Posted by: Snowreap.5174

I’ll just explain the process.

I used numbers from millenium with my single match calculator (which shows the new deviation/volatility scores) to get the new ratings and deviation for the next week.

oh, http://xerol.org/gw2/what-if.html is yours? that’s an excellent tool and I used it to debug my own rating calculator (the detailed breakdown really helped me a lot). since your calculated ratings match mine (and we both match mos.millenium.org) I’m pretty sure we’re all doing that part right.

ArenaNet posted the new matchup algorithm here:
https://www.guildwars2.com/en/news/big-changes-coming-to-wvw-matchups/

In that post they say that the ‘random factor’ is 40.0 + 1.0 * deviation. those parameters (40.0, 1.0) are what I’m using since I haven’t seen a post from them saying they’ve changed them.

I plugged in the values you used (0.0, 1.0) into my program and I got very different results, so I think that probably explains some of the difference, but not all of it.

Fundamentally, the idea of using bit fields sounds fine to me and I don’t see any problem with that approach. however, I suspect you may have another minor error hiding somewhere in there. the situation with Vabbi and FoW in EU is a good example.

Vabbi and FoW both have very low ratings compared to the other EU servers. they are so low, in fact, that I don’t think there’s any possible way for Vabbi and FoW to avoid being in the same matchup together. In order for Vabbi to avoid facing FoW, FoW needs a really high roll and at least 2 servers above FoW need a really low roll. if 2 servers can come in ‘under’ FoW, then FoW will play in EU T8 and Vabbi will play those 2 other servers in EU T9.

If FoW rolls a +1, their match rating will be 690.284 + 1.0 × (40 + 1.0 × 219.942) = 950.226.

If Blacktide rolls a -1, their match rating will be 1104.519 – 1.0 × (40 + 1.0 × 231.990) = 832.529.

If Whiteside Ridge rolls a -1, their match rating will be 1190.501 – 1.0 × (40 + 1.0 × 184.883) = 965.618.

and there’s the problem. it’s possible for Blacktide to roll below FoW, but it’s not possible for Whiteside Ridge. 965.618 > 950.226. even using (0.0, 1.0) to determine matchups as you were doing before would yield:
FoW: 690.284 + 219.942 = 910.226
WsR: 1190.501 – 184.883 = 1005.618
and 1005.618 > 910.226, so again WsR cannot come in ‘under’ FoW.

what this tells me is that both Vabbi and FoW must always play in tier 9 — there is no way for either of them to ever get a tier 8 match. for FoW to get a tier 8 match, it must roll high enough, and 2 other servers must roll low enough, that those other 2 end up in tier 9 pushing FoW up to tier 8. with the current ratings, it’s possible for 1 server to undercut FoW, but it’s not possible for 2 servers to do so. and if FoW can’t get into tier 8, there’s certainly no way Vabbi can either.
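The bound check above is easy to script. A minimal sketch, assuming the (40.0, 1.0) parameters quoted from the ArenaNet post and a roll limited to [-1, 1]; `match_rating` is a hypothetical helper, and only the two servers whose numbers appear in this post are included.

```python
BASE = 40.0   # parameters quoted from the ArenaNet post
SCALE = 1.0

def match_rating(rating, deviation, roll):
    """Randomized match rating for a roll in [-1, 1]."""
    return rating + roll * (BASE + SCALE * deviation)

# Best case for FoW: the highest roll possible.
fow_max = match_rating(690.284, 219.942, +1.0)

# Worst case for each candidate: the lowest roll possible.
others = {
    "Blacktide": (1104.519, 231.990),
    "Whiteside Ridge": (1190.501, 184.883),
}
can_undercut = [name for name, (r, d) in others.items()
                if match_rating(r, d, -1.0) < fow_max]

print(round(fow_max, 3), can_undercut)
```

Only one name ends up in `can_undercut`, and FoW needs two servers below it to reach tier 8.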

since your results show FoW playing in tier 8 sometimes, I think you have an error hiding in there somewhere. I think your basic structure and reasoning are sound so it should simply be a matter of finding the error and fixing it — there should be no need to start from scratch (unless you want to).

incidentally, the program I use doesn’t use much memory, I simply keep a three-dimensional array with dimensions (24,24,24) for NA and (27,27,27) for EU and each array element (i,j,k) is a 64-bit integer that tallies up the number of matches where servers i, j and k played each other. this means that the number of runs is limited only by how long I’m willing to wait — 10 billion runs (or more) are reasonable if I don’t mind waiting overnight (2 million NA runs takes about 10 seconds). the penalty of doing it this way is that I don’t track rankings (Mag+DB+Kaineng playing in tier 3 would be recorded exactly the same way as Mag+DB+Kaineng playing in tier 4, making my method tier agnostic) so there’s no way for me to produce the graphs you can.
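A toy version of that tally array, for illustration only. It is not the posted program: the six-server size, the shuffle standing in for the real rating roll, and the choice to sort the three indices so each matchup lands in one cell are all assumptions.

```python
import random

N = 6  # servers in this toy example (24 for NA, 27 for EU)
counts = [[[0] * N for _ in range(N)] for _ in range(N)]

def record(tier_members):
    """Tally one matchup; sorting the indices makes it color-agnostic."""
    i, j, k = sorted(tier_members)
    counts[i][j][k] += 1

rng = random.Random(0)
RUNS = 1000
for _ in range(RUNS):
    order = list(range(N))
    rng.shuffle(order)  # stand-in for sorting by randomized match rating
    for t in range(0, N, 3):
        record(order[t:t + 3])

total = sum(counts[i][j][k]
            for i in range(N) for j in range(N) for k in range(N))
print(total)  # RUNS * (N // 3) matchups recorded
```

Memory use depends only on the number of servers, not on the number of runs, which is why the run count is limited by time alone.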

-ken

The Purge [PURG] – Ehmry Bay

(edited by Snowreap.5174)

Posted by: Murderous Clown.9723

Could Arborstone not sneak in under FoW since its deviation is so much higher than WSR’s?

Jimibabob – Valkyries of Dwayna [VoD]
Piken Square

Posted by: Snowreap.5174

Arborstone has a deviation that’s +20 over WSR’s, but their rating is +34 higher so that cancels out. Arborstone can’t get a match rating any lower than 981.401, which is still higher than 950.226.

-ken

The Purge [PURG] – Ehmry Bay

Posted by: Murderous Clown.9723

I have no reason to contest your calculations since I have none of my own. That said, I’m surprised there’s only a difference of 20 in the deviation coming off a gap of over 50 last week.

Jimibabob – Valkyries of Dwayna [VoD]
Piken Square

Posted by: Xerol.1578

I’ll just explain the process.

I used numbers from millenium with my single match calculator (which shows the new deviation/volatility scores) to get the new ratings and deviation for the next week.

oh, http://xerol.org/gw2/what-if.html is yours? that’s an excellent tool and I used it to debug my own rating calculator (the detailed breakdown really helped me a lot). since your calculated ratings match mine (and we both match mos.millenium.org) I’m pretty sure we’re all doing that part right.

I think my tool might actually have a bug in it: the 3rd server is always off by a little bit compared to millenium, but it’s usually less than 0.01 so I haven’t worried about it. I need to update my other site to account for randomized matchups; right now you can’t plug-and-play scores since it assumes the tiers are unrandomized.

FoW actually has a small (~2.5%) chance of ending up in T8, so they’re not always paired up with Vabbi. What Vabbi rolls doesn’t matter at all for matchups (although they may flip between red and blue on occasion). FoW’s performance this week actually has an effect. Based off my earlier calculations (from scores ~9am ET today) they’ll have a rating around 690 and a deviation about 217. For them to not be matched with Vabbi, they need to roll higher than two other servers. The two most likely candidates for that are Blacktide (~1103 rating, 229 dev) and Whiteside Ridge (1190, 182).

Assume FoW rolls at the top of the roll range. This will add 257 to their randomized rating, putting them around 948. Blacktide needs a roll below about -0.58 (the lowest ~21% of a uniform [-1, 1] range) to get below 948, and Whiteside needs to roll…well, when I plug the numbers in, they can’t roll lower than ~967. Nor can Arborstone roll low enough. So I have to wonder where those ~2.5% of rolls that came out with FoW at the 24 seed actually came from. Maybe some anomaly from not starting with a sorted server list, although I don’t know why that would affect it. Time to run more tests…

Edit: Yep, that was it. Generating corrected results again…

(edited by Xerol.1578)

Posted by: Snowreap.5174

I was never able to get my calculated ratings to exactly match mos.millenium.org or yours either, but I assume it’s due to rounding errors (but when I was off by a lot, I knew something was wrong).

I think a big part of the problem is that ArenaNet doesn’t publish exact ratings and deviations; the numbers they publish at https://leaderboards.guildwars2.com/en/eu/wvw are all rounded to 4 decimal places and I think that rounding accounts for the differences we see. when I run trial calculations using adjusted values with 5 or 6 decimal places (all of which round to the same value ArenaNet publishes) I get very different outputs, so clearly those lost decimal places make a difference.

-ken

The Purge [PURG] – Ehmry Bay

Posted by: Xerol.1578

I was never able to get my calculated ratings to match mos.millenium.org or yours either, but I assume it’s due to rounding errors. I think a big part of the problem is that ArenaNet doesn’t publish exact ratings and deviations; the numbers they publish at https://leaderboards.guildwars2.com/en/eu/wvw are all rounded to 4 decimal places and I think that rounding accounts for the differences we see.

-ken

Mine were matching anet’s numbers to within 0.005 when they only had the 3 decimal places posted, and 0.0005 with 4. Millenium, prior to the leaderboards/API going up, actually had accumulated quite some error for a while by recycling their own numbers week after week and not correlating them with the officially posted numbers. I also checked my numbers against some confirmed working general glicko calculators, and they came out within floating point error.

Most important thing is making sure you have the order of operations correct, that caused me a ton of problems when first putting mine together, which is why I broke it down into so many steps. It could be broken down even further, although combining or breaking up different operations might affect precision on the very low end, but when you’re doing as many operations as glicko2 requires, those errors add up quick.

Posted by: wads.5730

you’d get much better confidence in errors if you used a markov chain method

Posted by: Hematuria.4051

Anet needs to let ET and FC beat on each other and give some random server a win/bye.

Posted by: Xerol.1578

I will run the numbers for EU again about 1 hour before reset, and NA about 2 hours before reset.

Posted by: Snowreap.5174

there is still some disparity in the numbers I’m trying to figure out.

also, I’ve noticed in many of the graphs a ‘notch’ where a specific central ranking is low but the ranks to both sides are high. the rank just to the right is often especially high. for example, on the graph for Kaineng, I’m trying to understand why rank 10 is so unlikely for them compared to rank 9 or (especially) rank 11. right now I suspect that this notch is an artifact caused by the fact that rankings aren’t independent — the ranking you get depends a great deal on what rankings other servers get, and the rankings that other servers get aren’t evenly distributed; they are skewed based on rating differences. but I won’t know for sure until I’m able to run these kinds of simulations myself, and see if I get the same results.

-ken

The Purge [PURG] – Ehmry Bay

Posted by: Snowreap.5174

So, I was able to run some ranking simulations myself. Yesterday, I was having trouble figuring out how to format the data so that I could easily import it into Excel to generate graphs, but today after a good night’s sleep, the solution was self-evident. I’ve attached the numeric results (I’m not going to make images to post; it’s too much work). these are based on 10 million trials each for EU and NA.

What I’ve noticed is that I’m not getting nearly as many ‘notches’ in the data. In most cases the rank probabilities rise monotonically to a peak, then decline. The exceptions are:

EU Fort Ranik [FR] has a dip spanning ranks 17 through 19
EU Underworld has a dip spanning ranks 17 through 19
EU Gunnar’s Hold has a dip spanning ranks 17 through 19
EU Ruins of Surmia has a dip at rank 18
NA Sea of Sorrows has a dip at rank 8
NA Maguuma has a dip spanning ranks 8 and 9
NA Darkhaven has a dip spanning ranks 17 and 18
NA Isle of Janthir has a dip spanning ranks 17 and 18

In particular, I don’t show any dip at all for Kaineng at rank 10. Ehmry Bay is another example where my graph shows a significant difference from yours.

I don’t think our results should differ by this much.

-ken

The Purge [PURG] – Ehmry Bay

Posted by: Xerol.1578

I had a typo in one of my sort functions that was causing the sorting (and thus estimated ranking) to be off by 1 in about half of the cases. Pretty sure I’ve got all the bugs worked out now, doing new EU guesstimates now.

Posted by: Xerol.1578

These are looking a lot more reasonable. I’ll re-run NA about an hour before NA reset.

Attachments:

Posted by: Xerol.1578

Here is a corrected NA, using fairly recent numbers. I don’t think they’ll move much between now and reset (and I’d rather be playing before reset than crunching numbers anyway).

Attachments:

Posted by: Shads.9468

DB at the highest % chance of meeting again… aww man.. I really must have cheesed someone off in a previous life

Posted by: Snowreap.5174

these new graphs are very illuminating. they explain why we are seeing more complaints about lopsided matchups from EU servers, as compared to NA servers.

in NA, the likely rankings for each server are concentrated in smaller spans. in EU, the middle servers have very wide spreads of possible rankings, so they are more likely to find themselves all over the place in potential matchups.

-ken

The Purge [PURG] – Ehmry Bay

Posted by: Xerol.1578

I think, by this point, you can’t really compare ratings between NA and EU – each is its own closed system, so a 200 point spread in NA might be a “tier-level” difference in skill and/or coverage, while 100 points in EU might represent the same. EU servers have higher deviation overall as well, meaning it contributes even more to the spread of potential matchups.

That’s not to say NA can’t have its share of extremely lopsided matchups. Here are two that came up for Maguuma:

0.30036% – 8260/2750000
Tier 1
Sanctum of Rall
Blackgate
Maguuma

0.15062% – 4142/2750000
Tier 5
Maguuma
Borlis Pass
Anvil Rock

Also don’t take the “top match” too seriously, Maguuma’s most likely match is still only sitting at a 4% chance, so there’s a 96% chance we DON’T get that matchup.

Posted by: Shads.9468

Also don’t take the “top match” too seriously, Maguuma’s most likely match is still only sitting at a 4% chance, so there’s a 96% chance we DON’T get that matchup.

The last 2 matchups have proven that the dice are out to get me.

Posted by: Xerol.1578

So far I’ve been using a flat (uniform) distribution for random numbers. I’m assuming this is what ArenaNet is using, although they haven’t specified.

What if they used a normal distribution, though? I ran two tests, one where the server deviation represents 3 Standard Deviations of difference, and one where it represents 2 SDs. The 3 SD case actually represents a lower spread: basically 99.7% of the random rolls stay within +/- deviation, whereas with the 2 SD case 95% stay within +/- deviation. The main difference is that about 68% of the rolls are within 1/3 or 1/2 of the server deviation, respectively – much more centrally clustered. Taking the server deviation as 4 or 5 SDs would make it even more centralized, with the downside of a lot less variation in matchups (in extreme cases, especially as server deviation trends downward, almost all matchups being exactly what they would be before randomization).
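The coverage figures for a normal roll can be checked directly from the error function. This is only a quick sketch; `coverage` is a name made up here, and it assumes an ideal standard normal distribution rather than any particular random number generator.

```python
import math

def coverage(k):
    """P(|Z| <= k) for a standard normal Z."""
    return math.erf(k / math.sqrt(2.0))

# Share of rolls that stay within k standard deviations of the rating.
for k in (1, 2, 3, 4):
    print(k, round(100 * coverage(k), 2))
```

If the server deviation is taken as k SDs, `coverage(k)` is the share of rolls that stay inside +/- deviation, and `coverage(1)` is the share inside 1/k of it.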

The other thing about doing it this way is that, in theory, any server can roll any number, but it’s very very unlikely. (Edit: Made a 4 SD example as well.)

I think taking the server deviation as either 3 or 4 Standard Deviations (or somewhere in between, it could be fine-tuned and doesn’t need to be an integer) and using a normally distributed random variable would work a lot better. The change on the server side would simply be swapping the random function from [-1..1] for one that produces a normally distributed number in units of standard deviations.
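As a rough illustration of that swap, here is a sketch of a normally distributed roll. Everything in it is an assumption: `normal_roll` is a hypothetical function, the +40 base is carried over from the uniform rule discussed earlier in the thread, and Maguuma’s estimated numbers are used only as sample input.

```python
import random

def normal_roll(rating, deviation, sds, rng, base=40.0):
    """Match rating where (base + deviation) spans `sds` standard deviations."""
    sigma = (base + deviation) / sds
    return rating + rng.gauss(0.0, sigma)

rng = random.Random(2013)  # arbitrary seed
rating, deviation = 1769.354, 182.97  # Maguuma's estimated values
rolls = [normal_roll(rating, deviation, 3.0, rng) for _ in range(100_000)]

# Fraction of rolls that stay inside the old uniform range.
inside = sum(abs(x - rating) <= 40.0 + deviation for x in rolls) / len(rolls)
print(round(inside, 3))
```

With `sds=3.0` nearly every roll stays inside the old range, but unlike the uniform roll there is no hard cutoff, so rare extreme matchups remain possible.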

Attachments:

Posted by: Chris.3290

ArenaNet, if you are reading, please manually adjust the numbers or turn the thing off for a week.

On behalf of EVERYBODY (who’s not on EB :P )

Posted by: saiyr.3071

DB at the highest % chance of meeting again… aww man.. I really must have cheesed someone off in a previous life

Oh Bonnie, don’t be silly.

You probably cheesed everyone off in your previous life.

[DERP] Saiyr, “bff” of Sgt Killjoy

Posted by: Xerol.1578

Reading up on glicko again, the deviation is supposed to represent ONE standard deviation. Here’s what that looks like:

Attachments: