On the TalkChess forum, GM Larry Kaufman recently posted a position with a material imbalance, along with comments on how differently various engines evaluate it. The position arises after the moves 1. d4 Nf6 2. c4 g6 3. Nc3 d5 4. Nf3 Bg7 5. Qb3 dxc4 6. Qxc4 O-O 7. e4 b6 8. e5 Be6 9. exf6 Bxc4 10. fxg7 Kxg7 11. Bxc4.
White has two bishops and a knight for a queen and a pawn. Kaufman observes that 7...b6 is rarely played by grandmasters because the resulting position is considered favorable for White. As he points out, material is deemed even, but White has all the positional pluses. In an interesting experiment, he ran this position for 30 minutes on a quad-core machine and arrived at these assessments:
Deep Rybka 4: +.53
Deep Shredder 12: +.88
Fritz 12: +.47
Hiarcs 13.1: +.48
Komodo 1.3: +.30
Naum 4: +1.05
Stockfish 2.0: +.20
Firebird 1.31: +.09
Rybka 2.3.2a MP: +0.02
Robbolito .085g3: 0.00
Ivanhoe 47 and 49: 0.00
Houdini 1.5: -0.13
Critter 0.90: -0.17
He points out: “Rybka 2.3.2a MP got this quite wrong with a nearly zero score, and Robbolito, which is said to have come from decompiled Rybka 2.3.2a code, also makes the same mistake with a zero score. Of course the scores won't be identical as the searches are different. The engines acknowledged to come from Robbo have of course also a zero or near-zero score. Houdini and Critter actually go negative; it is hard to imagine that a program not starting with the Robbo values would make such a big error in evaluating this position. I don't know much about Critter so I don't mean to start a debate about its status, but this is certainly strange.
As for why Rybka 2.3.2a gets it wrong, all attempts to fix the undervaluation of minor pieces vs. major pieces tested poorly in Rybka, yet seem to test okay in unrelated engines. So any program that makes this same mistake is likely to either have copied Rybka values, or to be so similar to Rybka that testing produced the same anomalous result.”
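To put a number on how far apart the engines are, the evaluations listed above can be summarized with a few lines of Python. This is only a rough sketch: the scores are copied from the list, the helper name `summarize` is my own, and it says nothing about which engine is right.

```python
from statistics import mean, median

# Evaluations in pawns (positive favors White), copied from the list above.
evals = {
    "Deep Rybka 4": 0.53,
    "Deep Shredder 12": 0.88,
    "Fritz 12": 0.47,
    "Hiarcs 13.1": 0.48,
    "Komodo 1.3": 0.30,
    "Naum 4": 1.05,
    "Stockfish 2.0": 0.20,
    "Firebird 1.31": 0.09,
    "Rybka 2.3.2a MP": 0.02,
    "Robbolito .085g3": 0.00,
    "Ivanhoe 47 and 49": 0.00,
    "Houdini 1.5": -0.13,
    "Critter 0.90": -0.17,
}


def summarize(scores):
    """Return the lowest, highest, spread, mean and median of the scores."""
    values = sorted(scores.values())
    return {
        "low": values[0],
        "high": values[-1],
        "spread": values[-1] - values[0],
        "mean": mean(values),
        "median": median(values),
    }


summary = summarize(evals)
for label, value in summary.items():
    print(f"{label:>6}: {value:+.2f}")

# Engines that give White no advantage at all (a score of zero or below).
no_edge = [name for name, score in evals.items() if score <= 0]
print("No edge for White:", ", ".join(no_edge))
```

The gap between the most optimistic and most pessimistic engine is well over a pawn, which is a large disagreement for a position where every engine had the same 30 minutes to think.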
CC GM Marjan Semrl states that a Master will begin his analysis by examining the position and its properties very carefully and will take into consideration the advice of the program, but he will also advise the program and force it to analyze the moves he thinks are best.
Once again, we see that you simply cannot rely on engine analysis to tell you the best move in a given position unless the position is a tactical one. As I have pointed out in previous posts, this seems especially true when there is a material imbalance. In short, don't place blind faith in an engine's numerical evaluation when the position mainly calls for positional judgment.
All that said, any engine move is likely to be better than the one you or I would select, because most of us aren't quite as good as GMs when it comes to positional judgment.