Douglas Modern Chess, Season 2 Semifinal

Herewith the results of the semifinal, which was between Stockfish and three derivatives. I was actually expecting the Raubfisch variants to come out on top, based on their performance in the heats, but it was not to be. Perhaps the longer time control affected things.

The results revealed a flaw in my own scoring system, which was supposed to prevent confusion about results, as shown below. First and fourth places are mostly clear, but second and third are more problematic. So I need to rethink my scoring before deciding who makes it into the final.

Summary:

Games: 12; Draws: 9, DrawPercentage: 75 %
Whitewins: 0; Blackwins: 3, Draws: 9

A longer time control and stronger engines mean more draws. Curious that White was unable to win a single game.

Conventional scoring:

Raubfisch X41d3._sl         : 4
Zeus 4.1.7 M                : 3
Stockfish 11                : 3
Raubfisch_ME262_GTZ20d3._sl : 2

Results table (figures in brackets are [as White/as Black]):

Engine                 Win     Draw    Lose
Raubfisch X41d3._sl    2 [0/2] 4 [3/1] 0 [0/0]
Stockfish 11           1 [0/1] 4 [2/2] 1 [1/0]
Zeus 4.1.7 M           0 [0/0] 6 [3/3] 0 [0/0]
Raubfisch_ME262_GTZ2   0 [0/0] 4 [1/3] 2 [2/0]
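
As a cross-check, the conventional scores and the summary figures can be recomputed from this table. The snippet below is only a sketch: it assumes the usual 1 point per win and half a point per draw, and the variable and function names are my own.

```python
# Results copied from the table above: (wins, draws, losses) per engine.
results = {
    "Raubfisch X41d3._sl": (2, 4, 0),
    "Stockfish 11": (1, 4, 1),
    "Zeus 4.1.7 M": (0, 6, 0),
    "Raubfisch_ME262_GTZ20d3._sl": (0, 4, 2),
}


def conventional_score(wins, draws, losses):
    """Assumed convention: 1 point per win, 0.5 per draw, 0 per loss."""
    return wins + 0.5 * draws


scores = {name: conventional_score(*r) for name, r in results.items()}

# Every game appears once for each of its two players, so halve the totals.
total_games = sum(sum(r) for r in results.values()) // 2
total_draws = sum(r[1] for r in results.values()) // 2
draw_percentage = 100 * total_draws / total_games

for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name:<28}: {score:g}")
print(f"Games: {total_games}; Draws: {total_draws}; DrawPercentage: {draw_percentage:g} %")
```

Stockfish 11 and Zeus 4.1.7 M both come out at 3, which is exactly the tie that needs breaking.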

My scoring system, which takes black/white and number of moves into account:

Zeus 4.1.7 M                : 14.82
Raubfisch X41d3._sl         : 11.61
Raubfisch_ME262_GTZ20d3._sl : 6.07
Stockfish 11                : 5.94

The problem with these scores is that an engine that never lost but also never won (Zeus) should not rank higher than an engine that won twice and never lost (Raubfisch X41d3._sl). Hence I need to rethink.

My points-based scoring system, which takes black/white into account:

Engine                      : Points : Percentage
Raubfisch X41d3._sl         : 202    : 67.33 %
Stockfish 11                : 152    : 50.67 %
Zeus 4.1.7 M                : 150    : 50 %
Raubfisch_ME262_GTZ20d3._sl : 102    : 34 %

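These scores are better: Raubfisch X41d3._sl, with two wins and no losses, is back on top, and Stockfish and Zeus are separated, if only by two points.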
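
For anyone who wants to check the arithmetic: the points table is consistent with the per-game values in the sketch below. These values are inferred from the published numbers, not a statement of the actual formula, and they rest on the assumption that a loss scores zero (without that assumption the table alone does not pin them down). No game was won by White in this round, so the value of a White win cannot be inferred at all. The 300-point maximum is implied by 202 points being listed as 67.33 %.

```python
# Per-game values INFERRED from the points table (an assumption, not the
# published formula). A loss is ASSUMED to score 0.
INFERRED_POINTS = {
    ("win", "black"): 52,
    ("draw", "white"): 24,
    ("draw", "black"): 26,
    ("loss", "white"): 0,
}

# Implied by the table: 202 points is listed as 67.33 %.
MAX_POINTS = 300

# Game counts per engine, taken from the results table.
counts = {
    "Raubfisch X41d3._sl": {("win", "black"): 2, ("draw", "white"): 3, ("draw", "black"): 1},
    "Stockfish 11": {("win", "black"): 1, ("draw", "white"): 2,
                     ("draw", "black"): 2, ("loss", "white"): 1},
    "Zeus 4.1.7 M": {("draw", "white"): 3, ("draw", "black"): 3},
    "Raubfisch_ME262_GTZ20d3._sl": {("draw", "white"): 1, ("draw", "black"): 3,
                                    ("loss", "white"): 2},
}


def points(engine_counts):
    """Sum the inferred per-game values over one engine's games."""
    return sum(INFERRED_POINTS[kind] * n for kind, n in engine_counts.items())


totals = {name: points(c) for name, c in counts.items()}

for name, total in totals.items():
    print(f"{name:<28}: {total} : {100 * total / MAX_POINTS:.2f} %")
```

Under these inferred values a draw with Black is worth slightly more than a draw with White, which is what lifts Stockfish (one win, one loss) just above Zeus (six draws).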

Cutechess scoring:

Rank Name                         Elo +/- Games Score Draws
1    Raubfisch X41d3._sl          120 162 6     66.7% 66.7%
2    Zeus 4.1.7 M                   0   0 6     50.0% 100.0%
3    Stockfish 11                   0 173 6     50.0% 66.7%
4    Raubfisch_ME262_GTZ20d3._sl -120 162 6     33.3% 66.7%

Cutechess also appears to rank Zeus above Stockfish, but since both scored 50.0% it may just be sorting by score and then ordering the tie by name or some other arbitrary rule, without taking anything else into account.
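
The Elo column in the Cutechess table looks like the standard logistic conversion from score fraction to rating difference. The sketch below assumes that formula (I have not checked it against the Cutechess source) and makes no attempt at the +/- error margins. The function name is my own.

```python
import math


def elo_difference(score_fraction):
    """Rating difference implied by a score fraction strictly between 0 and 1,
    using the standard logistic Elo model (assumed, not confirmed)."""
    return -400 * math.log10(1 / score_fraction - 1)


# (engine, conventional points, games played), from the tables above.
standings = [
    ("Raubfisch X41d3._sl", 4, 6),
    ("Zeus 4.1.7 M", 3, 6),
    ("Stockfish 11", 3, 6),
    ("Raubfisch_ME262_GTZ20d3._sl", 2, 6),
]

elo = {name: round(elo_difference(pts / games)) for name, pts, games in standings}

for name, value in elo.items():
    print(f"{name:<28} {value:>5}")
```

Because the conversion depends only on the score fraction, two engines on 50.0% will always get the same Elo figure, so this column cannot separate Zeus and Stockfish either.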

So you can see the different scoring systems produce conflicting results, which I need to resolve before running the final.

Here are the games themselves. Time control was 20 minutes plus 20 seconds per move.

The only mate was between Raubfisch X41d3._sl and Zeus 4.1.7 M; the rest were decided by adjudication.

Stockfish 11 vs Zeus 4.1.7 M

Zeus 4.1.7 M vs Stockfish 11

Raubfisch X41d3._sl vs Raubfisch_ME262_GTZ20d3._sl

Raubfisch_ME262_GTZ20d3._sl vs Raubfisch X41d3._sl

Stockfish 11 vs Raubfisch_ME262_GTZ20d3._sl

Raubfisch_ME262_GTZ20d3._sl vs Stockfish 11

Zeus 4.1.7 M vs Raubfisch X41d3._sl

Raubfisch X41d3._sl vs Zeus 4.1.7 M

Stockfish 11 vs Raubfisch X41d3._sl

Raubfisch X41d3._sl vs Stockfish 11

Raubfisch_ME262_GTZ20d3._sl vs Zeus 4.1.7 M

Zeus 4.1.7 M vs Raubfisch_ME262_GTZ20d3._sl
