18 March 2023

Myth No.6 - 'Forced Wins for White'

Upon encountering chess960 for the first time, one of the first questions a new player asks is, 'Are all 960 positions fair?' I included a statement of this concern in Top 10 Myths About Chess960 (May 2012), where one bullet said,
Some start positions are forced wins for White

Remember that I wrote this more than ten years ago, at a time when I wasn't absolutely 100% sure that such unfair positions didn't exist. My standard response to the statement was, 'Which positions are forced wins? Please provide a specific example.' I never received a single example. Ten years later I can say with more confidence -- although still not 'absolutely 100% sure' -- that while some positions are difficult for Black to play, none of the 960 positions is lost before a single move is made.

In January, a new study appeared: Analyzing Chess960 Data | Alex Molas | Towards Data Science (towardsdatascience.com). Its subtitle announced,

Using more than 14 million chess960 games to find if there’s a variation that's better than the others.

There is considerable technical material in the study and I don't pretend to understand all of it. I might well need several posts to unravel its subtleties, so I'll start by summarizing its references. In the following discussion, '>>>' marks a direct quote from 'Analyzing Chess960 Data'.

>>> 'The original post was published here...'

[NB: I'll come back to this reference later; see '(A)' below. First I need to point out that there's an important issue with terminology. When chess players use the term 'variation', they mean a sequence of play arising from a specific position; e.g. 'In this position I had two variations and I had to work out which variation was better for me.' • In the Molas study, I'm convinced that the word 'variation' refers to one of the well-defined 960 start positions that are legal for chess960. I read the subtitle of the towardsdatascience.com article as 'to find if there’s a *start position* that’s better than the others' and the title of the amolas.dev post as saying 'Discovering the best chess960 *start position*'. I won't repeat this caveat each time, but it's important and helps to understand the discussion.]

>>> 'Ryan Wiley wrote this blog post where he analyzes some data from lichess...'

>>> 'There’s also this repo with the statistics for 4.5 millions [sic] games (~4500 games per variation)...'

[NB: There's an issue with the word 'variant' here, but it's not as important as the previous 'NB'. Chess960 purists will know what I'm talking about.]

>>> 'In this spreadsheet there’s the Stockfish evaluation at depth ~40 for all the starting positions...'

>>> 'There’s also this database with Chess960 games between different computer engines. However, I’m currently only interested in analyzing human games, so I’ll not put a lot of attention to this type of games...'

>>> 'Lichess -- the greatest chess platform out -- maintains a database with all the games that have been played in their platform...'

>>> 'To do the analysis, I downloaded ALL the available Chess960 data (up until 31–12–2022). For all the games played I extracted the variation, the players Elo and the final result...'

>>> 'The scripts and notebooks to donwload [sic] and process the data are available on this repo...'

At this point the article launches into its core sections: 'Mathematical framework', 'Bayesian A/B testing', [...]. This, of course, is the essence of the study, and I won't go any further into it in this current post beyond the following rough sketch.

>>> 'This post got some attention in Reddit...'

I could end the post here, but I need to make an admittedly subjective observation. There are two examples of bias in the above references.

The first bias is 'I’m currently only interested in analyzing human games'. Huge caveat here. In my not-so-humble opinion, the CCRL is the best source of chess960 opening theory. Period. Full stop. The CCRL engines are rated at least 1000 points higher than most human players on Lichess. The engines don't make simple tactical errors and they calculate deeper into every position than any human can. If there is an unfair chess960 start position, the engines will find it, just like they find errors in most games played between humans.

I can understand ignoring the engines because humans grapple with different challenges in chess960 openings, but the purpose of the study was 'to find if there’s a *start position* that’s better than the others'. Ignoring the experience of the best players on the planet is severely limiting.

The second bias is 'Lichess -- the greatest chess platform out'. The main alternative here is Chess.com. Why ignore games played on the world's largest chess platform? Maybe there's a good reason, but I can't think of one. On a personal note, last year I investigated which of the two sites would be the better place to continue my own chess960 correspondence play. I determined that Chess.com was more serious about eliminating human players who cheat by using engines in games with other humans. Since my goal was playing no-engine games, I went with Chess.com. How much of the Lichess data involves concealed engine use?

Biases notwithstanding, the Molas study is an important step in evaluating the fairness of all 960 positions in chess960/FRC. I'm looking forward to understanding it in more depth.
