
Tuesday, September 24, 2019

You Need A Games Database

     A database is just a collection of chess games and it is important to know the strength of the players whose games are included in your database.
     A lot of databases (along with their opening statistics) are skewed because they contain games played by, for example, a couple of players rated under 1100 in the Antarctica Under-10 Championship.
     For example, Chess Assistant will give you a report on evaluations, full stats and previously played lines, but before deciding on a particular move you need to take several factors into consideration, such as:

* How does each move fare result-wise?
* In what year was the move played?
* What is the move’s performance rating?
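
     As a rough illustration of those three questions, here is a minimal Python sketch that groups game records by move and reports the score percentage, the most recent year played and a performance rating. The record layout and the function name move_stats are made up for this example, and the performance figure uses a common linear approximation (average opponent rating plus 400 times the win/loss margin per game). Chess Assistant or any other program may well calculate it differently.

```python
from collections import defaultdict

# Hypothetical records: (move, year, opponent_rating, score), where score is
# 1.0 for a win, 0.5 for a draw and 0.0 for a loss from the mover's side.
def move_stats(records):
    """Group records by move and summarise the results for each move."""
    grouped = defaultdict(list)
    for move, year, opp_rating, score in records:
        grouped[move].append((year, opp_rating, score))

    stats = {}
    for move, games in grouped.items():
        n = len(games)
        wins = sum(1 for _, _, s in games if s == 1.0)
        losses = sum(1 for _, _, s in games if s == 0.0)
        avg_opp = sum(r for _, r, _ in games) / n
        stats[move] = {
            "games": n,
            "score_pct": 100.0 * sum(s for _, _, s in games) / n,
            "last_played": max(y for y, _, _ in games),
            # Linear approximation (an assumption of this sketch),
            # not a table-based official calculation.
            "performance": avg_opp + 400.0 * (wins - losses) / n,
        }
    return stats

# Invented sample data, only to show the shape of the output.
sample = [
    ("e4", 2017, 2500, 1.0),
    ("e4", 2018, 2400, 0.5),
    ("e4", 1965, 2300, 0.0),
    ("d4", 2019, 2600, 0.5),
]
result = move_stats(sample)
for move in sorted(result):
    print(move, result[move])
```

     Note how a move with only one game behind it still gets a tidy-looking performance number, which is exactly why the game count belongs next to every statistic.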

     So, that’s why it’s important to know the strength of the players whose games those stats are based on. Also, when studying openings it can be very useful to have the complete game just to give you an idea of how to follow up, and this is something a lot of opening books don't give you. Here too, you want those games to come from strong players rather than lower-rated ones. 
     It is very important to realize that databases don't tell you which move is correct, only which moves have been played and are popular. A line might appear in a hundred games and be successful most of the time, but it may have been refuted and the refutation only appears in a handful of games...maybe even only one. 
     Some things that have to be considered are: 

* Sample Size – the number of games may be too small to draw any serious conclusions. 
* Year Played – theoretical advances in recent years can mean stats are useless if a lot of the games were played back when Bobby Fischer was a kid. 
* Ratings – as already mentioned, it is important to consider the strength of the players. Even if a move is good, if a garden-variety master plays it against a GM the statistics will show up as unfavorable. Or, the database may contain a lot of games with an opening that’s favored by lower-rated players but that a GM wouldn't be caught dead playing. That’ll mean the stats for that opening are meaningless. 
* Size – Big isn’t always better. Better to have a million GM games than 8 million games of which 7 million were played by non-masters. 
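
     The rating and year checks above are easy to automate if your games are in PGN. The following is only a sketch in plain Python: it reads the standard PGN header tags (Event, Date, WhiteElo, BlackElo) and keeps a game only when both players are rated high enough and the game is recent enough. The function names and the sample games are invented for illustration, and the default cut-offs of 2000 and 1990 are just assumptions you would adjust to taste.

```python
import re

# Matches a PGN tag pair such as [WhiteElo "2700"].
TAG_RE = re.compile(r'^\[(\w+)\s+"(.*)"\]\s*$')

def read_headers(pgn_text):
    """Return one dict of header tags per game found in pgn_text."""
    games = []
    current = None
    for line in pgn_text.splitlines():
        match = TAG_RE.match(line.strip())
        if not match:
            continue  # move text and blank lines are ignored
        tag, value = match.groups()
        if tag == "Event":  # the Event tag opens each game's header
            current = {}
            games.append(current)
        if current is not None:
            current[tag] = value
    return games

def to_int(text):
    """Convert a tag value to int, or None if missing or non-numeric."""
    return int(text) if text and text.isdigit() else None

def keep_game(headers, min_elo=2000, min_year=1990):
    """True if both players reach min_elo and the game is recent enough."""
    white = to_int(headers.get("WhiteElo"))
    black = to_int(headers.get("BlackElo"))
    year = to_int(headers.get("Date", "")[:4])
    if white is None or black is None or year is None:
        return False  # unrated or undated games are dropped
    return min(white, black) >= min_elo and year >= min_year

# Invented sample games, only to exercise the filter.
SAMPLE = '''[Event "Hypothetical Open"]
[Date "2018.05.01"]
[White "Player A"]
[Black "Player B"]
[WhiteElo "2700"]
[BlackElo "2650"]

1. e4 e5 1/2-1/2

[Event "Hypothetical Juniors"]
[Date "2019.02.10"]
[White "Player C"]
[Black "Player D"]
[WhiteElo "1050"]
[BlackElo "1080"]

1. d4 d5 1-0

[Event "Hypothetical Classic"]
[Date "1972.07.11"]
[White "Player E"]
[Black "Player F"]
[WhiteElo "2500"]
[BlackElo "2400"]

1. c4 e5 0-1
'''

all_games = read_headers(SAMPLE)
kept = [g for g in all_games if keep_game(g)]
print(len(all_games), "games read,", len(kept), "kept")
```

     After filtering, check how many games are left for the line you care about. If only a handful survive, the sample size problem is back.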

     Keeping a database up to date is a never-ending chore, but it’s something that should be done regularly. It’s nice to have several hundred thousand games played from way back, but it’s more important to have recent ones if you’re a serious player. 
     ChessBase is what many professionals use, but their Mega Database isn't cheap. It’s got 72,000 annotated games and contains more than 7.6 million games from 1560 to 2018. It’s over $170 though. 
     One of my favorites is KingBase. It’s a free games database, updated monthly and you can download games in PGN, SCID or CBV format. The games are mainly collected from the websites of various tournaments and TWIC archives so they have done a lot of the work for you. 

What you get: 
KingBase has no annotated games, no games with fewer than 6 moves, no games before 1990 and no games with players rated below 2000. The files are available via direct download or BitTorrent. There’s also a KingBase Lite with over a million games. 

KingBase 2019 has around 2.2 million modern games (i.e. played after 1990) and monthly updates are available for download...currently over 81,000 games. The only downside is that the updates are in PGN format, so they may have to be converted to another format, depending on what program you are using.
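
     Because PGN is plain text, several update files can simply be joined into one before you import or convert them. Here is a small Python sketch of that idea. The function name merge_pgn and the file names in the comment are hypothetical, and the game count is just a tally of Event tags, so treat it as a sanity check rather than a validation of the files.

```python
def merge_pgn(update_paths, out_path):
    """Concatenate PGN files into out_path and return the number of games."""
    count = 0
    # Work in bytes so the original text encoding is left untouched.
    with open(out_path, "wb") as out:
        for path in update_paths:
            with open(path, "rb") as src:
                data = src.read()
            # Each game's header starts with an Event tag.
            count += sum(
                1 for line in data.splitlines() if line.startswith(b"[Event ")
            )
            # Keep a blank line between the games of consecutive files.
            out.write(data.rstrip() + b"\n\n")
    return count

# Example call with hypothetical file names:
# merge_pgn(["update-01.pgn", "update-02.pgn"], "updates-merged.pgn")
```

     The merged file can then be opened or imported by whatever program you use, which will handle the conversion to its own format.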

     Other valuable sites for games are The Week In Chess and Lars Balzer’s site. Also, if you’re into making an opening book, these games would be a good choice. For more information on making your own opening books see: 

Making a Correspondence Opening Book 
Beware of Chess Engine Opening Books
