Does this invalidate the entire experiment? Well, read on and decide. Annoyed, I logged into an account I hadn't played TF2 on in years. This account had absolutely zero games in matchmaking; I played on it before TF2 had any matchmaking whatsoever. Thus, Glicko had nothing to work with.

Game 1 was entirely normal. We had 2 people leave on each team, resulting in a 6v6 that felt fair, even if I happened to lose. Notice the positive K:D.

Game 2 looked like this.

Not exactly a stomp, but I was shocked at how quickly the very next game resembled the unbalanced ones I queued into on my main. My K:D is still positive, though nowhere near as good as in Game 1.

I might have managed to get my K:D up in the most underhanded way possible (for the good of the experiment, I promise), but the players I was expected to carry became noticeably worse. The disparity in skill was massive, far more than in Game 2.

Game 4 is where things came to a head.

With a very high K:D in Game 3 (despite losing), matchmaking placed me against excellent, coordinated players. Yet I wasn't the best player in Glicko's eyes; rather, it was an ESEA pro's alt, who was clearly expected to carry my useless ass alongside everybody else. Needless to say, it was a one-sided, unfun stomp.

Game 5 was interesting. Half of the players on my team appeared to be roughly the same skill level, yet the opposing team still massively outplayed us. I theorize my loss streak convinced the Glicko algorithm to try to "balance" things out. It failed, but it tried.

Game 6 continued my impressive loss streak, where 3 competent players were pitted against me. Any semblance of teammates equal to me went out the window.

Game 7 was a stomp that ended about 60 seconds after I joined. Perhaps as a direct result of my prior 6-loss streak, the game decided I needed a win at any cost.

My findings from this initial phase can be summarized as follows:

Account 1 played 5 games. One felt balanced and enjoyable. Four felt like I was expected to carry.

Account 2 played 7 games. One felt like a better player was forced to carry me. Finally, one felt like I was handed a free win.

The third part of the experiment set out to answer another question: what does this matchmaking look like from the perspective of a totally hopeless player?

To answer this, I made a brand new account with which I would strive to play as badly as possible.

I played on it using a VPN (with reasonable in-game ping) and a virtual machine to keep it from being associated with my other accounts. It also, obviously, had zero experience with TF2 matchmaking or its ranking methodology. In-game, I fired around players, engaging them and sometimes hitting them, but never getting kills if I could help it. I also didn't just run into pits and die; instead, I allowed other players to kill me.

The goal was to act like a simply bad player, not to actively kill myself or grief.

Game 1 had me serve as backfill in a game we won in about thirty seconds. I got zero kills, I didn't die, I didn't even fire a weapon. Nonetheless, my new account's winrate is now 1:

Game 2 placed me in a relatively balanced game, even if it (like Game 1) favored my side. This was the very first match where I played like my worst nightmare.

Game 3 confirmed my biases by placing me on a team where a good player was expected to carry me.

I remained worthless in-game but, nonetheless, we won. My W:L ratio is now 3:

Game 4 queued me into a match about to be won. I hate to be redundant, so I'll let that big pink text say all that needs to be said.

Game 5 placed me into a game that had just begun, thankfully. This one resembled Game 3, except the player charged with carrying me sensibly refused the duty and we lost horribly. I'm still useless, and my W:L ratio is now 4:

Game 6 was a little more balanced, but not by much.

One player was still expected to carry less-skilled ones, and I was placed on a team where the likelihood of me losing was very small. Note my horrible K:D, and my W:L ratio of 5:

Game 7 seemed to take my W:L ratio into account, despite my total lack of in-game ability. I lost, leaving me with a 5:

What a great first impression for new players.

Am I a bad scientist?

Game 9 resembled Game 6 and Game 7; I was placed on a team where the likelihood of me losing was remarkably low, at the expense of another player clearly expected to carry his teammates alone. Oh, and I'm 7:

Game 10 recognized I'm a terrible player, but also that my W:L was simply too high. I was thus placed on a team where 1 man was expected to carry, all while facing off against a crew of similarly-skilled players.

We lost in 2 minutes.

My main account had a W: It's important to note that this one has the most games by a large margin, perhaps explaining the more even result.

My secondary account had a W: D was more or less similar to my main account.

My third "dummy" account, on which I intentionally played like an imbecile, had a W: D so laughably bad it was practically 1:

Without getting too deep into math and chasing readers away, Glicko was designed to estimate the strength of chess players from their wins and losses.
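For anyone who does want a peek at the math, here is a minimal sketch of a rating update in the original Glicko system as Mark Glickman published it. The function names (`g`, `expected_score`, `glicko_update`) are my own, and I'm assuming the first-generation formulas; I can't confirm which variant or parameters TF2 actually runs, so treat this as an illustration of the idea rather than Valve's implementation.

```python
import math

Q = math.log(10) / 400  # Glicko scaling constant


def g(rd):
    """Shrink an opponent's influence the less certain their rating is."""
    return 1 / math.sqrt(1 + 3 * Q**2 * rd**2 / math.pi**2)


def expected_score(r, r_opp, rd_opp):
    """Expected result (0..1) for a player rated r against one opponent."""
    return 1 / (1 + 10 ** (-g(rd_opp) * (r - r_opp) / 400))


def glicko_update(r, rd, results):
    """Return (new_rating, new_rd) after one rating period.

    results is a list of (opp_rating, opp_rd, score) tuples,
    where score is 1 for a win, 0.5 for a draw, 0 for a loss.
    """
    d2_inv = 0.0  # accumulates 1 / d^2
    delta = 0.0   # accumulates sum of g * (score - expected)
    for r_opp, rd_opp, score in results:
        g_j = g(rd_opp)
        e_j = expected_score(r, r_opp, rd_opp)
        d2_inv += Q**2 * g_j**2 * e_j * (1 - e_j)
        delta += g_j * (score - e_j)
    denom = 1 / rd**2 + d2_inv
    return r + (Q / denom) * delta, math.sqrt(1 / denom)


# Worked example from Glickman's description: a 1500 / RD 200 player
# beats a 1400 player, then loses to a 1550 and a 1700 player.
example = [(1400, 30, 1), (1550, 100, 0), (1700, 300, 0)]
new_rating, new_rd = glicko_update(1500, 200, example)
print(round(new_rating), round(new_rd, 1))
```

Notice what the inputs are: one player's rating, one opponent's rating, and a score. Nothing in the update knows about teammates, which is worth keeping in mind for what follows.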

This algorithm works excellently in 1v1 games like Street Fighter, StarCraft or, obviously, chess. It does not work in a team setting, where twelve people compete against another twelve people. It can't possibly work when winning or losing is even partially dependent on how well your teammates support you, or how well your enemies support each other. Granted, the algorithm is utilized in Formula 1 racing, where there are technically "teams" of people behind each car, but the pit crew isn't rated based on how fast they do their jobs and queued accordingly.

It's strictly "where did the single entry car, racer, etc.

Thus it can be surmised that what CS: L ratio, and perhaps even K: D but placed on teams according to

Upon launching the game after the July 7th Patch, players were able to pick a side in the main menu. Scoring points as any class in either Casual Mode or Competitive Mode would count towards the player's "vote", with the accumulated total shown as a percentage for each side.

Over a year later, the Jungle Inferno Update released the new Pyro class pack.

