Tuesday, January 13, 2015

Mickey Mouse Equipment


What I have
    I have been analyzing several games of late using engines and keep running into positions where different engines give wildly different evaluations. Then, when you actually start playing through the lines, things get even more confusing because the numeric values are all over the place.
     These games have mostly been annotated GM games from books, and when a GM says the position favors one side, or maybe even is winning, but the engines show only a nominal advantage or perhaps even prefer another move, what do you do?
     I was playing over a Zvjaginsev vs. Khalifman game from Moscow 2005, and had Khalifman played 27...f6, things would have been equal, but analysis with several engines proved to be pretty murky.
     I have a couple of correspondence games going that have been very difficult to analyze. In one game as Black I played the King's Indian against a FIDE IM and gradually drifted into what appears to be a lost game without making any obvious mistakes. Of course, it's also pretty well known that engines don't evaluate those types of positions very well, so I should have avoided it.
What I need
     Another game was dead equal for 30 moves, and when my opponent played Stockfish's initially recommended move, it turned out that two (!) moves later the evaluation jumped to a significant 1-1/3 Pawn advantage in my favor.
     In another game, I accidentally played the wrong move at move 4 and the position was evaluated by Komodo 8 as one Pawn in my opponent's favor. After 50 more moves we had a position that the engines claimed was significantly in his favor but which, as confirmed by the Shredder endgame database, was drawn. In yet another game against an ICCF Senior IM the engines said my advantage was nearly a Pawn, but in the end, the game was only a draw.
     How do you explain this? I can't, because I simply am not good enough to know when the engines are wrong. Their evaluation scores are an aggregate of various factors (material, space, two-bishops, open files, weak Pawns, etc.), so all factors considered, the “score” is simply an approximation of the position's real value. It has been pointed out that Stockfish's evaluations can be erratic; in some positions it shows the same score for a long time, then all of a sudden it may make a big jump. It makes good moves, but sometimes the evaluations can be whacked out. That's why you need another engine to keep an eye on SF.
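     To make the "aggregate of various factors" idea concrete, here is a minimal sketch of a score built as a weighted sum. The feature names come from the list above, but the weights, the two "engines" and the sample position are all made up for illustration; no real engine is this simple, and these are not the numbers any actual engine uses.

```python
# Hypothetical weights in Pawn units. Real engines use many more terms
# and tune them differently; these values are invented for illustration.
ENGINE_A = {"material": 1.0, "space": 0.10, "two_bishops": 0.30,
            "open_files": 0.15, "weak_pawns": -0.20}
ENGINE_B = {"material": 1.0, "space": 0.05, "two_bishops": 0.50,
            "open_files": 0.10, "weak_pawns": -0.40}


def evaluate(features, weights):
    """Weighted sum of feature differences (White's count minus Black's)."""
    return sum(weights[name] * value for name, value in features.items())


# A made-up position: material is level, White has more space, the Bishop
# pair and an extra open file, but also one more weak Pawn than Black.
position = {"material": 0, "space": 2, "two_bishops": 1,
            "open_files": 1, "weak_pawns": 1}

print(round(evaluate(position, ENGINE_A), 2))
print(round(evaluate(position, ENGINE_B), 2))
```

     Same position, same features, yet the two sets of weights give different numbers. That is one plausible reason two engines can disagree without either one being "broken."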
     Tactically Houdini (I only have H2) is still the "go-to" engine. I run it for Shootouts when I want a quick opinion on possible outcomes. That is, if move A results in +1 -1 =1 and move B results in +0 -2 =1, I suspect move A is probably the better of the two. Supposedly Rybka is pretty good tactically, but I don't use it.
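     The Shootout comparison above is just arithmetic, and can be sketched as follows. The function name is my own and the scoring (1 for a win, 1/2 for a draw) is the usual tournament convention, not something taken from any engine's GUI. Keep in mind that three games is a tiny sample.

```python
def shootout_score(wins, losses, draws):
    """Fraction of available points scored: win = 1, draw = 0.5, loss = 0."""
    games = wins + losses + draws
    return (wins + 0.5 * draws) / games


move_a = shootout_score(wins=1, losses=1, draws=1)  # +1 -1 =1
move_b = shootout_score(wins=0, losses=2, draws=1)  # +0 -2 =1

print(f"move A: {move_a:.0%}  move B: {move_b:.0%}")
print("prefer A" if move_a > move_b else "prefer B")
```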
     Komodo seems to be middle of the road in that it calculates tactics well, is good in semi-closed positions and plays really well in the ending.
     Anyway, I have also noticed another strange thing...searching at higher depths doesn't necessarily mean a better evaluation because the engine may be evaluating a bad line. Just because the engine says it has searched to a 30-ply depth does NOT mean it has looked at EVERY move. What it has done is ignore branches that didn't look promising. Sometimes one engine may find a win in a line another engine pruned.
     What all this means is that analyzing games for fun and trying to ferret out the best move in a GM game can be challenging, if not impossible. For example, in the Zvjaginsev vs. Khalifman game at move 31, Komodo 8 evaluates 31.Nh6 at 0.82 while 31.Rxe3 (the move played) is evaluated at 0.00.
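     Here is a toy illustration of how throwing away "unpromising" branches can hide the best line. It is a plain minimax over a tiny hand-made tree, with a crude rule that keeps only the best-looking moves at each node. Everything in it (the tree, the numbers, the `keep` parameter) is invented for the example; real engines prune with far more sophisticated methods, so treat this only as a sketch of the general idea.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    static: float  # quick guess at the position, from White's point of view
    children: list = field(default_factory=list)


def search(node, white_to_move, keep=None):
    """Minimax. If `keep` is set, only the `keep` best-looking children
    (judged by the quick static guess) are searched; the rest are never seen."""
    if not node.children:
        return node.static
    kids = sorted(node.children, key=lambda c: c.static, reverse=white_to_move)
    if keep is not None:
        kids = kids[:keep]
    values = [search(c, not white_to_move, keep) for c in kids]
    return max(values) if white_to_move else min(values)


# Move A looks fine at first glance; move B looks like it just loses
# material, but every reply to it actually leaves White much better.
root = Node("root", 0.0, [
    Node("A (quiet move)", 0.3, [Node("A1", 0.2), Node("A2", 0.3)]),
    Node("B (sacrifice)", -0.5, [Node("B1", 3.0), Node("B2", 2.5)]),
])

print("full search:  ", search(root, white_to_move=True))
print("pruned search:", search(root, white_to_move=True, keep=1))
```

     Both searches go to the same depth, but the pruned one never looks at the sacrifice and so never finds out it was good.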

Is it really possible that a 2600+ GM did not see the difference between the two moves, or are the engines missing something? I could probably figure it out IF I had a powerful enough computer and enough gumption to spend hours analyzing, but I don't. That's why I probably would not do too well if I entered the upcoming LSS World Championship Preliminaries, which I was recently informed I qualified for. A piddly little laptop and no gumption won't get you very far in correspondence chess.

