I have the results of matches between players for a game (like chess, or go, or MTG). How do I turn that into Elo ratings?

One popular tool for this is bayeselo. It's tuned for chess, but it works for other games as well: the AlphaGo Zero paper uses it to compute ratings for its computer agents, as do people in the Starcraft AI community.

Unfortunately, getting it to work for my use case was somewhat painful. I’m surprised there isn’t more documentation on the internet about how to use bayeselo. So this short post serves as a guide. I hope it’s helpful!

A minimal PGN file

The bayeselo program only reads Portable Game Notation (PGN), a text-based file format designed to store chess games. As a result, PGN carries a lot of chess-specific elements (move lists, event metadata) that you don't need if you only care about computing ratings from match results.

With a little bit of trial and error, I figured out the minimal PGN file that can be parsed by bayeselo.
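As a rough sketch of what such a file can look like, the Python snippet below writes one PGN record per match using only the White, Black, and Result tags, followed by the result token that normally ends a PGN game. This is my assumption about what a stripped-down file contains, not a verified copy of the exact file that worked; the helper name `to_pgn` and the player names are made up for illustration. "White" here simply means whoever moved first.

```python
from pathlib import Path

# Result strings defined by the PGN format: first player wins, second wins, draw.
VALID_RESULTS = {"1-0", "0-1", "1/2-1/2"}


def to_pgn(games):
    """Build PGN text from (first_player, second_player, result) tuples.

    Assumption: bayeselo only needs the White, Black and Result tags plus
    the terminating result token, so no moves or other tags are written.
    """
    records = []
    for first, second, result in games:
        if result not in VALID_RESULTS:
            raise ValueError(f"unknown result: {result!r}")
        records.append(
            f'[White "{first}"]\n'
            f'[Black "{second}"]\n'
            f'[Result "{result}"]\n'
            f"\n"
            f"{result}\n"
        )
    # A blank line separates consecutive games.
    return "\n".join(records)


# Two hypothetical matches between hypothetical players p1 and p2.
games = [("p1", "p2", "1-0"), ("p2", "p1", "0-1")]
pgn_text = to_pgn(games)
Path("minimal.pgn").write_text(pgn_text)
```

If bayeselo rejects a file produced this way, adding more of the standard PGN tags (Event, Site, Date, Round) is the first thing I would try.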

To read this using the bayeselo program, do something like this:

(base) ➜  BayesElo ./bayeselo
version 0057, Copyright (C) 1997-2010 Remi Coulom.
compiled Apr  6 2019 16:49:26.
This program comes with ABSOLUTELY NO WARRANTY.
This is free software, and you are welcome to redistribute it
under the terms and conditions of the GNU General Public License.
See http://www.gnu.org/copyleft/gpl.html for details.
ResultSet>readpgn minimal.pgn
2 game(s) loaded, 0 game(s) with unknown result ignored.
ResultSet>

First mover’s advantage

In many turn-based games (like chess and go), the first player to move has an advantage. The size of the advantage varies based on the game, and for some games the advantage might be zero (for example, there is no advantage in Pokémon because both players select their moves simultaneously).

bayeselo models this by assigning an Elo value to the first-mover advantage. Similarly, it assigns an Elo value to draws. Both can be set explicitly; the session below sets them to zero before fitting the ratings:

ResultSet>readpgn out.pgn
4000 game(s) loaded, 0 game(s) with unknown result ignored.
ResultSet>elo
ResultSet-EloRating>advantage 0 ; set elo for first mover to zero
0
ResultSet-EloRating>drawelo 0   ; set elo for draw to zero
0
ResultSet-EloRating>mm
00:00:00,00
ResultSet-EloRating>ratings
Rank Name   Elo    +    - games score oppo. draws
   1 p1     423   29   29  1000   83%   150    0%
   2 p2     150   19   19  3000   65%   -50    0%
   3 p3     -87   22   22  2000   56%  -168    0%
   4 p4    -487   33   33  2000    6%    32    0%
ResultSet-EloRating>
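For intuition about what the advantage and drawelo values do, here is a small Python sketch of the outcome model as I understand it from the bayeselo documentation: each win probability is a logistic function of the rating difference, shifted by the advantage and by the draw Elo. The function names are my own, and the formula is my reading of the model rather than code taken from bayeselo itself.

```python
def f(delta):
    """Logistic curve on the usual 400-point Elo scale."""
    return 1.0 / (1.0 + 10.0 ** (delta / 400.0))


def outcome_probabilities(elo_first, elo_second, advantage=0.0, drawelo=0.0):
    """Return (P(first mover wins), P(second mover wins), P(draw)).

    Assumed model: the advantage is added to the first mover's rating,
    and drawelo lowers both win probabilities, leaving room for draws.
    """
    p_first = f(elo_second - elo_first - advantage + drawelo)
    p_second = f(elo_first - elo_second + advantage + drawelo)
    p_draw = 1.0 - p_first - p_second
    return p_first, p_second, p_draw


# Ratings of p1 and p2 from the table above, with both parameters at zero.
p_first, p_second, p_draw = outcome_probabilities(423, 150)
print(f"first wins: {p_first:.3f}, second wins: {p_second:.3f}, draw: {p_draw:.3f}")
```

With both parameters at zero the draw probability vanishes and the model reduces to the ordinary Elo expected score. For p1 against p2 that comes out to roughly 0.83, which lines up with p1's 83% score against an average opponent rating of 150 in the table.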

Outro

If there’s anything else I learn about using bayeselo, I’ll add it here. Feel free to reach out if you have any questions or suggestions!