20 November 2021

An Engine Iceberg

In the previous post, CCC C960 Blitz Championship (October 2021), I wrote,
Given that engines' evaluations for every move are available in the event's PGN game scores, perhaps there is something to be learned about the 960 different start positions. That investigation would make a good follow-up post.

Make that two good follow-up posts. The first post was on my main blog, Evaluating the Evaluations (November 2021), where I concluded,

Now that I have a tool for rapidly evaluating the engine evaluations, what can I do with it? The first task will be to put it to work on the 960 start positions used in chess960.

The second post is this one. I had already downloaded a few PGN files from recent engine vs. engine events, so the first question was which one to use. I decided to continue with the games from an event that I covered earlier this year in another post on this blog, TCEC C960 FRC3 (March 2021). At that time I noted,

Except for an occasional CCRL game, I can't remember ever looking at an engine vs. engine chess960 game. Is there anything to be learned from such an exercise, or is the play of the engines beyond comprehension?

TCEC FRC3 was a 50-game match won by KomodoDragon over Stockfish by a final score of +2-1=47. The seven mandatory tags in the PGN header for the first game look like this:

[Event "TCEC Season 20 - FRC3 Final"]
[Site "https://tcec-chess.com"]
[Date "2021.03.14"]
[Round "1.1"]
[White "KomodoDragon 2671.00"]
[Black "Stockfish 20210226"]
[Result "1/2-1/2"]
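For anyone who wants to read these tags programmatically, here is a minimal sketch in Python. It assumes each tag sits on its own line in the usual `[Name "value"]` form and ignores escaped quotes inside values; the function name `parse_tags` is made up for illustration and is not the tool used for this post.

```python
import re

# One PGN tag pair per line: [Name "value"]
TAG_RE = re.compile(r'^\[(\w+)\s+"(.*)"\]$')

def parse_tags(pgn_text):
    """Collect the tag pairs at the top of a single PGN game into a dict."""
    tags = {}
    for line in pgn_text.splitlines():
        line = line.strip()
        match = TAG_RE.match(line)
        if match:
            tags[match.group(1)] = match.group(2)
        elif tags and not line:
            break  # blank line after the tag section: movetext follows
    return tags

header = """[Event "TCEC Season 20 - FRC3 Final"]
[Site "https://tcec-chess.com"]
[Date "2021.03.14"]
[Round "1.1"]
[White "KomodoDragon 2671.00"]
[Black "Stockfish 20210226"]
[Result "1/2-1/2"]
"""

tags = parse_tags(header)
print(tags["White"], "vs.", tags["Black"], tags["Result"])
```

The same function would also pick up any optional tags that follow the seven mandatory ones, since it reads every tag line until the first blank line.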

I loaded the file into my database, added the concept of SP, and produced the following chart. It covers the first 22 games of the match. Each start position (SP) was played twice, with colors reversed; KomodoDragon always had White in the odd-numbered games. In a match between humans, this pattern would risk giving an advantage to one of the players, but in games between engines, it's harmless.
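One way to attach an SP to a game is to compute the start position number from White's back rank. The sketch below assumes that 'SP' means the conventional chess960 numbering (0 to 959, where the traditional setup RNBQKBNR is SP518) and that the back rank has already been read from the game's start position, for example from a FEN tag. The function name `sp_number` is hypothetical.

```python
# Knight placements on the five squares left after removing bishops and queen,
# in the order used by the conventional chess960 numbering.
KNIGHT_TABLE = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2),
                (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]

def sp_number(back_rank):
    """Return the start position number (0-959) for White's back rank, a-file first."""
    bishops = [i for i, piece in enumerate(back_rank) if piece == "B"]
    # a1 is a dark square, so odd file indexes (b, d, f, h) are light squares.
    light = next(i for i in bishops if i % 2 == 1)
    dark = next(i for i in bishops if i % 2 == 0)

    without_bishops = [p for p in back_rank if p != "B"]
    queen = without_bishops.index("Q")          # 0-5 among the six remaining squares

    without_queen = [p for p in without_bishops if p != "Q"]
    knights = tuple(i for i, p in enumerate(without_queen) if p == "N")
    knight_code = KNIGHT_TABLE.index(knights)   # 0-9

    return (light // 2) + 4 * (dark // 2) + 16 * queen + 96 * knight_code

print(sp_number("RNBQKBNR"))
```

The function does no validation, so it expects a legal chess960 back rank (bishops on opposite colors, king between the rooks).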

The last two columns show the first move, as chosen by White, and the value ('wv') calculated by the engine for that move. I could have also shown the principal variation ('pv') calculated by White, but that wouldn't add much to an initial understanding of the data. The same data is available from the PGN file for all moves by both sides in a game.
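To illustrate how the first move and its 'wv' value might be pulled from the movetext, here is a rough Python sketch. It assumes that each move is followed by a brace comment holding comma-separated `key=value` fields; the sample line and its numbers are invented for the example and are not taken from the FRC3 file. The function name `first_move_eval` is also hypothetical.

```python
import re

# Optional move number, then the move in SAN, then a {...} comment.
MOVE_COMMENT_RE = re.compile(r'(?:\d+\.(?:\.\.)?\s*)?(\S+)\s*\{([^}]*)\}')

def first_move_eval(movetext):
    """Return (first move, 'wv' value as text) or None if no commented move is found."""
    match = MOVE_COMMENT_RE.search(movetext)
    if match is None:
        return None
    san, comment = match.group(1), match.group(2)
    fields = {}
    for part in comment.split(","):
        key, sep, value = part.strip().partition("=")
        if sep:
            fields[key] = value
    return san, fields.get("wv")

# Invented sample in the assumed comment format.
sample = "1. d4 {d=30, wv=0.25, pv=d4 Nf6 c4,} Nf6 {d=31, wv=0.22,}"
print(first_move_eval(sample))
```

The value is kept as text rather than converted to a number, because an engine may report something other than a plain decimal, such as a forced mate.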

Since the data covers only the first move of 25 SPs (50 games) out of the full set of 960 SPs, it's obviously just scratching the surface. Suppose we had data for the first few moves of all 960 SPs from many different engines played over a long period of time. What might we learn from this? I would want an answer to that question before spending too much effort collecting more data.
