Random Posts

Saturday, May 2, 2026

Comparing Evaluations

 
    
My curiosity has been piqued by the Weighted Error Value (WEV) and Blunder Category in the Tactical Analysis summary in Fritz 19 and 20. I have been unable to locate any information on the Blunder Category, so I have no idea what distinguishes one category from another. 
    The Weighted Error Value measures a player's accuracy by comparing the moves played to the engine’s preferred moves, and it’s claimed to be a more accurate way of measuring performance. 
    The value represents the average number of centipawns lost per move compared to the engine's top choice. The WEV ignores forced recaptures, moves out of check and moves played in the opening, which prevents those moves from inflating a player's accuracy. That makes sense: if a move is forced, you shouldn’t get credit for making a move you had to play. The WEV also weights errors so that a single massive blunder (like losing a Queen) counts as more significant than several minor inaccuracies. The lower the WEV, the more accurate the player’s moves were; a value of 0.00 means the player’s moves matched the engine’s. 
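    Since I could not find Fritz's actual formula, here is only a rough sketch of how a WEV-style number could be computed from the description above. Everything in it is an assumption for illustration: the MoveRecord fields, the weighted_error_value function and the exponent of 1.5 used to make big errors count for more are hypothetical and are not taken from Fritz.

```python
from dataclasses import dataclass


@dataclass
class MoveRecord:
    """One played move (hypothetical structure, not a Fritz data format)."""
    cp_loss: float            # centipawns lost versus the engine's top choice (>= 0)
    forced: bool = False      # forced recapture or a move out of check
    in_opening: bool = False  # move played in the opening


def weighted_error_value(moves, exponent=1.5):
    """WEV-style score: mean weighted loss, in pawns, over the scored moves.

    This is an illustrative guess, NOT Fritz's formula. The exponent is an
    assumed way of making one large loss outweigh several small ones.
    """
    # Skip the moves the description says are ignored.
    scored = [m for m in moves if not (m.forced or m.in_opening)]
    if not scored:
        return 0.0
    # Convert centipawns to pawns, then weight superlinearly.
    total = sum((m.cp_loss / 100.0) ** exponent for m in scored)
    return total / len(scored)


# Same total loss (900 centipawns), distributed differently.
one_blunder = [MoveRecord(900)] + [MoveRecord(0) for _ in range(8)]
nine_slips = [MoveRecord(100) for _ in range(9)]

print(weighted_error_value(one_blunder))
print(weighted_error_value(nine_slips))
```

    Under these assumptions the game with the single queen-sized blunder scores worse than the game with nine one-pawn slips, even though the total centipawn loss is identical, and a game that matches the engine on every scored move comes out at 0.0.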
    To see how the WEV and Blunder Category work out in actual analysis, I tested six engines: three of the top engines and three lower-rated ones. I expected to see lower WEVs with the weaker engines, thinking that a player’s moves would have a better chance of matching those of a weaker engine. I also added a category that I designated Good Moves less Bad Moves. 
    I am not sure what to make of this little experiment except that, as beautiful as this game is to us humans, the engine evaluations of Anderssen’s play aren’t that impressive! 
    Further, there doesn’t seem to be much difference in the WEV from one engine to the next. As for the Blunder Categories, they are all over the place and seem to be of little practical value. 
    The evaluations spit out by the old Deep Fritz 14 engine (released in late 2013) were just plain wacky; they make it look like Kieseritzky played better except for the two missed wins. 
 
    The game I selected is the Immortal Game between Anderssen and Kieseritzky, an offhand game played in London in 1851 while the international tournament was under way. Anderssen sacrifices all of his heavy pieces and more to deliver mate. The above charts show the WEV and the Blunder scores, and you can draw your own conclusions as to how valuable they might be. 
