Creating the Rustic chess engine
or, the art of Chess Programming in Rust.
Marcel Vanthoor
Latest update: March 28, 2023
Updated chapters:
Ramblings section added
Appendix - The Binary System
Appendix - Bitwise operations
Board Representation - Introduction
Board Representation - Detecting the Bishop Pair
Board Representation - Cannot force mate
Board Representation - Detecting draws by FIDE rules
Preface
Hi there. Nice of you to hang around. I'm Marcel, the author of the chess engine Rustic and the writer of this text. You're about to step into the wonderful and sometimes aggravating world of chess programming. This can be a rewarding endeavour... but maybe addictive is the better word. There will be lots of challenges, but with perseverance you'll be able to overcome them one at a time. When you reach the point where your chess engine is capable of playing its first full game against yourself or another engine, you will have created something to be proud of.
"But... wait", I hear you asking, "hasn't this chess programming thing been done before?" You're right. It has been done before; many times, in fact. If you write your own engine from scratch, you will become part of a long history and tradition which already started at the dawn of the computer age in the early 1950's. Many chess programmers came before and many will come after. Even though hundreds of engines already exist at this time of writing in 2021, there is still merit in writing your own. The reasons why people do this are extensive. Some of those reasons can be:
- Studying computer science concepts.
- Learning a new programming language.
- To have their own engine to play against.
- As a prelude to get into machine learning.
- To compete with other engines and authors.
- And many others.
During the design and writing of a chess engine (or any non-trivial software project for that matter) there is always something new to learn, one more feature to add, or another bug to fix. Some engines have been in development for 25 years or more and they are still improving in one way or another. This does not have to be a project you just start, plan, create and finish like any other; it can be a journey and a hobby that may last for decades, if you want it to.
Welcome on the long road towards writing your very own chess engine.
Good luck and have fun,
Marcel
What this book is
This book is the documentation for the chess engine Rustic. It describes the design, all of the parts, how they work and how they fit together. The chess engine is written in Rust, so this is the programming language which will be used in this book. Code and examples will be taken directly from a working release of the engine. This means the book contains information about the following:
- How to design a chess engine
- Software engineering principles
- Chess engine programming and algorithms
- Code examples taken straight from the engine
- Chess knowledge and how to use this in an engine
- The Rust programming language, where necessary
- Pointers on how to get the code to compile
- Testing methodology
When designing and writing your own engine, either in Rust or a different programming language, this book provides a way (not the way, because there are many ways to create software) on how to accomplish this. All of the information has been gathered throughout the internet:
- Websites
- YouTube video's
- Forum posts
- Newsgroup posts written 25 years ago
- Computer science blogs and articles
- And my own knowledge as a software engineer
Listing all of the sources I used would be quite impossible. This would become a gargantuan list, not even taking into account the fact that internet is volatile. A source found in 2019 may not be available in 2021, or the information may have changed. Some people who are mentioned in the credits of Rustic and on the "Further study" page also provide lots of information on chess programming. You may want to check out their work.
What this book is not
If you are hoping for a step-by-step tutorial, you are probably going to be disappointed. This book is not laid out in such a way that you can just follow it from beginning to end and end up with a perfectly working chess engine. This would maybe be possible if you are using Rust, but when using other programming languages different design decisions may need to be made and implementation details may need to be changed. This is because not all languages support the same features and similar features may not work the same way.
The goal of the book is to consolidate enough information in one place to be able to design and write a chess engine from scratch in any modern programming language. The book should provide you with a good foundation in chess engine design and programming. You will be able to research and implement ideas that are not (yet) described in these pages... or even, try and implement your own ideas.
Because I'm not the most evil guy ever, I'll help you along in case you were looking for a step-by-step tutorial. Please note that the links below were available in November 2021; this may have changed since then.
Step-by-step tutorials:
- Programming a chess engine in C (Richard Allbert / Bluefever
Software)
- Bitboard chess engine in C (Maksim
Korzh)
- Chess engine in Java (Software Architecture &
Design)
- Java chess engine tutorial (Logic Crazy)
How to use this book
The best way to use this book is to read it once from beginning to end, to get a general overview about the topics involved in creating a chess engine. It does not matter if you don't immediately understand everything. After you read the book, you can start designing and writing your own engine. This can be as simple as setting up the main() function, defining two constants for your engine's name and version number and printing those to the screen. At that point your engine has reached the "Hello, world!" stage and you go from there. You can refer back to this book time and again when you reach each new topic.
There is also a "Further study" page in the book, referring to some good material to review. Parts of the tutorials above are no exception; even if you don't want to create a chess engine by following a tutorial, those videos can still be of help. Just find the ones about the topic you are researching in the playlist.
About Rustic
The chess engine discussed in this book is called Rustic. It is written in the Rust programming language. The engine is an original work, but it implements many well-known concepts, both from computer science in general, and chess programming in particular.
Rustic is an open-source engine licensed under GPL v3, so anyone can download the source code and change it to their liking, and compile it for their particular platform or CPU. Creating derivative works is possible, within the rules stated by GPL v3.
The feature-set for Rustic Alpha 3.0.0 is:
- Engine:
- Bitboard board representation
- Fancy Magic bitboard move generator
- Transposition Table
- UCI-protocol
- Search
- Alpha/Beta search
- Quiescence search
- Check extension
- PVS
- Move ordering
- TT Move priority
- MVV-LVA
- Killer moves
- Evaluation
- Material counting
- Piece-Square Tables
Obviously the feature list will grow longer as the engine gets developed further. The book will describe how each feature works and how it is implemented. Example code will be in Rust, coming straight from a working version of the engine, but there will be enough explanation so that each feature can be rewritten in any desired programming language.
The programming language obviously needs the right capabilities, so please note that something like WhiteSpace is probably out. Even if the language you choose hase enough capabilities to write a non-trivial project in it, the language still may not be a good fit. The programming language choice is your own responsibility. Any and all suffering will be of your own making. Choose your language wisely.
Downloads
In the list below you'll be able to download pre-built binaries for each released version of Rustic. These binaries are hosted on Github. If you would like to download the source code or clone the repository for a specific version, you can click on the Github link to visit that version's release page. Versions that have been tested by the CCRL list have a clickable Elo rating, linking to their CCRL page.
Rustic Alpha 3.0.3 | 2.2 | 1.3 | March 28, 2023 | No executable 2 | Github
Rustic Alpha 3.0.2 | 2.1 | 1.2 | June 11, 2022 | No executable 2 | Github
Rustic Alpha 3.0.1 | November 6, 2021 | No executable 1 | Github | 1867 Elo
Rustic Alpha 3 | June 18, 2021 | Download | Github | 1867 Elo
Rustic Alpha 2 | March 17, 2021 | Download | Github | 1815 Elo
Rustic Alpha 1.1 | March 15, 2021 | Download | Github | 1675 Elo
Rustic Alpha 1 | January 24, 2021 | Download | Github | 1675 Elo
1 This update fixes a broken compilation of the optional "extra" module. There have been no changes in the chess playing part of the engine. You can download the executables for version Rustic Alpha 3.0.0.
2 These are maintenance updates for existing versions of Rustic that update a library to the latest version, or change the code to compile cleanly with a newer version of Rust. These versions have no new executable. The existing ones can be used for testing purposes.
Prerequisites
There's a few more things we need to discuss before you can start on this journey. Those are the... dreaded... prerequisites. Yeah, I don't like them either, but they're really important to get right so you can avoid disappointments during this project.
This is gonna take a while
First: Let's set some expectations first. This is not a weekend project (for most people) if you want to do it well. Writing a strong chess engine takes time; not only for writing it, but also researching, debugging, and testing. Be prepared: you're going to be at this for some time before you see the first results. That would be having your engine play a non-trivial game against another engine, and win it.
Not for complete beginners
Second: This is not a project for someone who is just starting out with computers, computer science, or programming. You certainly don't need a Ph.D. to understand the information in this book, but you will need to have strong intermediate knowledge about programming concepts (variables, constants, functions, data structures) to be able to get this project off the ground.
If there are two topics I would recommend to brush up on, then it would be the binary system and bitwise operations. There are two chapters in the appendix that describe them:
The binary system
Bitwise operations
If you are not already well versed in these topics then I highly urge you to study these first, thoroughly. Practice with this until you get it down without having to think too much. Feel free to find more information about this on the internet or YouTube, if the explanations in the above two chapters are not enough.
If you have a strong background in low(er) level programming languages such as C, C++, Rust, or even classic Pascal, this book and the subject of chess programming will be easier for you. If you are coming from the web-development world, used to Javascript, Python or PHP, it will (probably) be harder, but certainly not impossible. (You can write chess engines in Javascript and Python, but those languages wouldn't be my personal choice.) If you just learned to write "Hello World", this probably will not be the project for you... yet. Of course, you're invited to come back later.
Mind your language
Third: If you are not experienced in several different programming languages of different styles and/or low level programming, then it would be best to pick a language you know well. This will spare you the burden of learning chess programming and a new programming language at the same time. Only pick a language you don't know because you want to use this project to learn it, like I did with Rust. Even so, you should have intermediate to early-advanced programming skills in at least one other language, so you don't need to learn all the basics from scratch. Be prepared for a steep learning curve and some rewrites, in case you pick a language you don't know, or don't know well.
Be happy with your environment
Four: Make sure your development environment is set up correctly. This is a topic which is not discussed in this book, apart from the chapter where instructions are given on how to build Rustic for different operating systems. It's impossible to give advice here, because there are so many IDE's (Integrated Development Environment), compilers, debuggers, operating systems, and programming languages to choose from.
Make sure you are able to compile/run "Hello World" in your programming language of choice. Make sure your editor/IDE has a debugger configured correctly, so you can set breakpoints and halt the program at that point. This lets you inspect variables and debug the program. If you haven't got a working development environment already, search the internet to find out how to set one up. My personal preference is to use Visual Studio Code as an IDE, Rust as a programming language (obviously), and the VSCode plugin rust_analyzer, which gives you code completion and linting for Rust in VSCode.
Git those versions under control
Five: This is not strictly necessary per se, but I strongly recommend using the a version control system such as Git, even if you are working alone. Versioning your code makes the development process a lot easier, not to mention safer. Safer? Yes. You can just create a branch to write some experimental code. If it doesn't work, you just throw it away, instead of trying to undo everything you did.
I mean it. Don't skimp on this. Many IDE's and editors have provisions for Git and other version control systems built in, or they provide plugins or extensions. If you don't like built-in version control or your favorite eidtor doesn't support it, then there are many GUI-applications to be found; open-source, freemium and paid. If you are really a contrarian, you can also use choose a different version control system. It doesn't matter, as long as you use something.
Know something about chess
Six: Also not strictly necessary, but it would be very helpful if you know all the chess rules already. (If not, you can download and read the FIDE Laws of Chess Handbook.) As with other programming projects, it is good to know the basics of the domain you're working in. Think about the movement of the pieces, castling, en-passant, and the different draw rules. You could just find all the rules and implement them, but you wouldn't be able to understand what the engine is doing, and finding bugs is going to be harder. It would even be better if you could play some decent chess already. Even beginner level between 1000-1200 Elo should be sufficient. At that point you're probably not going to make any mistakes against the rules anymore, so there's less chance you'll implement the rules the wrong way.
Still here? Great. Now we start. Good luck.
Introduction
Concept
What this book is about, is writing the chess engine, not the chess program. There is a distinct difference. Most people know chess programs in one way or another; many operating systems even include one in their default installation. So what is the difference?
The part of the software that people normally call the "chess program" is only the user interface, which the player uses to interact with the engine. In the past, the user interface and the engine were often the same piece of software, but nowadays this is mostly not the case. The user interface cannot play chess, while the engine has only very limited means to interact with a user, from a user's point of view.
So, they go together, like this:
The chess engine is on the left, the user interface is on the right. They communicate with one another. This is done through strict rules, which are called a "protocol." In the chess programming world, there are two main protocols:
- XBoard: designed by Tim Mann and others in the early 1990's.
- UCI: designed by Stephan Meyer-Kahlen in 2001.
Both protocols are designed to do the same thing: create a standardized way for a user interface and a chess engine to communicate with one another. While there are advantages and disadvantages to either protocol, both can accomplish this task equally well. In the "Communication" chapter, we'll look deeper into both protocols.
Why do it like this? Well, because both protocols are standardized, user interfaces and engines that understand the same protocol, should be compatible. This means that chess programs and engines are interchangeable; one program can run many engines, and one engine can run under many different programs. This way, the user will be able to select the chess program he likes best, and then also use his favorite engine.
Design
As described on the Concept page, we basically have two pieces of software: the chess engine, and the chess program. (Henceforth, the "chess program" will simply be called GUI, which stands for Graphical User Interface.) These two programs communicate with one another using either the UCI or the XBoard protocol.
If a GUI implements one or both of these protocols, we can assume that it does so correctly. This means we can treat all the GUI's the same: we "just" have to write the engine. As we're no longer having to pay any attention to the GUI from this point onward (apart from the commands we're going to send and receive), we can focus on the engine. Let's unpack the Engine component from the concept diagram we saw on the previous page:
Wow. That got complicated fast. Don't panic. Just lets list all the shown components and shortly describe what they do. Let's start with the meaning of the colors:
- Green: These are distinct, running programs.
- Light-blue: These are active objects that contain threads.
- Yellow: These are passive objects. They hold mutate, or provide data.
- Orange: These are interfaces. If a component adheres to an interface, it becomes interchangeable with other components adhering to the same interface.
- Pink: These are important functions within the engine.
The diagram shows all the important components an engine needs to play proper chess.
- Engine: This is the main Engine program / object, that connects everything. It sits in the middle, like a spider in a web. No object has information about any other object, except if the Engine gives it to them. The engine knows all, and controls all.
- MoveGenerator: Generates pseudo-legal moves when requested.
- Board: Holds the current board and game state.
- Search: Searches for the best move in the position.
- Evaluation: Is called by Search, for evaluating positions during search.
- UCI: Implements the UCI-protocol (optional if implementing XBoard).
- XBoard: Implements the XBoard-protocol (optional if implementing UCI).
Two components shown are optional:
- Transposition Table (TT): The transposition table stores moves and related information the engine has found during the search. Not strictly necessary, but it is a huge speed and strength booster.
- IComm (optional): This is a communication interface. The active objects UCI and XBoard implement this interface, thus they become interchangeable. This can be omitted if implementing only one protocol.
There are also some objects in the engine that are not yet visible in the diagram above. They have been omitted because they are either part of one of the other objects, or they are not needed for making the engine playing chess. These objects will be discussed when we get to implementing the features that need them.
Roadmap
A chess engine is quite a complex piece of software. It is not possible to "just" write a chess engine off the bat, except if you've got lots of experience in writing chess engines or similar software. (In that case, you're probably reading the wrong book.) Therefore this chapter will describe a roadmap to get your new engine to a playing state.
First, the engine needs to be designed. If necessary, reread the chapters "concept" and "design", where Rustic's overall structure is explained. In the roadmap below, the big parts of this architecture are going to be unpacked into smaller parts which can be implemented one at a time.
The roadmap will describe how to get the engine to play somewhat decent chess. From that point onward, it "just" becomes a matter of adding features to make the engine stronger. If your overall design and implementation is well done, adding newer features on top of the baseline version is much less difficult than getting to the baseline itself.
Don't worry if you encounter terms in the roadmap you don't yet understand. When we get to the design and implementation, these will be explained.
- Design and write the board representation
- Represent the board
- Represent the pieces
- Read a position from an FEN-string
- Keep the current state of the game
- Keep a history of played moves
- Implement Zobrist hashing
- Create functions to control the board
- Create functions to get information from the board
- Design and write the move generator
- "Teach" the move generator how the pieces move
- Create a move format
- Generate moves for all the pieces when the MG is given a position
- Take special moves into account (such as castling)
- Adds generated moves to a move list and returns this when done
- Can generate captures and silent moves separately
- Can determine if a square is attacked
- Add perft (performance testing)
- Add a perft function that runs on a given position.
- Run through a perft suite containing "tricky" positions.
** First milestone reached: move generator is bug-free **
- Write the search functionality
- Structs (information) needed by the search
- Write the iterative deepening function
- Write the Alpha-Beta function
- Write the Quiescence search
- Make fixed search possible (on ply, nodes, or time)
- Implement time management for non-fixed searches
- Helper functions such as determining if a position is draw
- Write the evaluation function
- Material counting
- Piece Square tables (PST or PSQT)
- Order the moves for more speed
- MVV-LVA (Most Valuable Victim-Least Valuable Attacker)
- Design a communication interface ("IComm" in Rustic)
- Write the UCI-protocol
- Write the XBoard protocol (optional)
- Make sure the engine understands the commands
- Make sure the engine reacts correctly
** Second milestone reached: baseline is done. 1500-1700 Elo **
- Add more advanced functionality
- Transposition table
- TT move ordering
- Principal Variation Search
- Killer moves
- Tapered Evaluation
- Texel tuning
** Third milestone reached: the engine should be >= 2000 Elo **
- Keep adding more features to become stronger
- History heuristics
- Pruning capability
- Null move pruning
- Futility pruning
- Mate pruning
- ... etc ...
- Evaluation
- Mobility
- King safety
- Passed pawns
- General knowledge
- ... etc ...
- Many other features
Do not begin to extend the engine with new features until the baseline as outlined above has proven to be bug-free. It should be able to complete a perft suite without errors and it should be able to play thousands and thousands of test games in a row without crashing, making illegal moves or forfeiting on time.
Go back and read the above paragraph again. This is absolutely essential. If your baseline engine is not bug-free, any and all features you add next will either not bring you the improvement they should, or they won't work at all. This also means you should only move on to the next feature if the previous feature has proven to be working correctly.
Fixing bugs afterwards, in a feature you added weeks or even months ago is difficult in a chess engine; much more so than it is in "normal" software. The reason is that many features are synergetic; this means that the behavior or effectiveness of one feature can be altered by another. If you add a new feature and it doesn't work right, it could be because of a bug in an older piece of code... but you won't know until you check all the code.
After you hit the second milestone, you should have an engine that is on par with Rustic Alpha 1 with regard to functionality. The engine has no chess knowledge apart from PST's and material values. You can use Rustic Alpha 1 versions to get started with engine to engine testing.
If your engine has no bugs, and you're using Rustic's PST's and material values, you can expect an rating of 1500-1700 Elo after the engine is tested in a list such as the CCRL 2m+1s Blitz list. Rustic Alpha 1 has a rating of 1675 Elo.
In case your baseline engine reaches less than 1500 Elo, it's either very slow, buggy, or both. First make sure there are no bugs, then try to optimize for speed. This is the reason why you should choose a fast programming language; if you don't, you could start out with a 200 Elo disadvantage against engines with a similar feature set.
Should you be in the high 1600's, you could possibly hit 1700 by tinkering a lot with the material values, piece square tables, and time management. It's probably not worth it, as all those functions will either be improved or replaced in later versions, so don't spend too much time on this.
(I believe 1700 on the CCRL Blitz list is the about the maximum you could reach with the given baseline version that has only material values, PST's, MVV-LVA move sorting, and no special search or move generator optimizations.)
From this point onward, you can add features in any order you want, but I suggest to follow the roadmap up to the third milestone. There are features that depend on other features. To implement transposition table move ordering, you obviously first need to have a working transposition table.
Let's move on to the first topic: board representation.
Board Representation
Introduction
To be able to play the game, the engine has to know the board layout, where all of the pieces are located throughout the game, and all relevant information about each position. In the case of Western Chess, the board and pieces are the following:
- A board is 8 by 8 squares (64 in total), alternating in a light and dark color. The squares are normally called light/white or dark/black, even if the squares on a real board often have other colors.
- A set of pieces consists of 1 king, 1 queen, 2 rooks, 2 bishops, 2 knights and 8 pawns. The two sets have distinct colors to differentiate them from one another during play. Same as with the squares, these sets are normally called white and black, even if they have other colors.
Credits: Photographed it
myself using a Canon DSLR
Sidenote: Mentioning this explicitly may seem pedantic, but there are lots of different chess-like games: for example, there is Chinese Chess (XiangQi) and its cousin Korean Chess (Janggi), Japan has its own version called Shogi, which has different variants. Of course we have the ancestors of Western Chess, Shatranj and Chaturanga. If you still haven't had enough, there is Thai chess (Makruk) and even Fairy Chess with lots of different board sizes and pieces. A unique chess-like game is Arimaa, which was specifically created to be more difficult for computers than Western Chess. I'm sure I still didn't mention more than a few chess-like games, but I don't want this sidenote to occupy two screens.
And... for all of these games, engines have been written. Some engines even play more than one chess variant. The variant of chess the engine plays can influence the board representation you choose to use. Now you see why it's important to clarify that Rustic plays Western Chess. (In the future it may support one variant of Western Chess: Fischer Random Chess, also called Chess960.)
The nice thing is that, if you understand how to implement one variant, you can implement whatever variant you want... but with some variants, you may have problems finding a user interface to use with your engine.
The data structure used by the engine to track the location of the pieces and other information about the ongoing game is called the board representation. It can be constructed in many different ways. The most obvious and intuitive would be to just give every piece its own number, and put all of them in an 8x8 array, like this (in pseudo-code):
// Empty square
const ES = 0
// White pieces
const WK = 1; // King
const WQ = 2; // Queen
const WR = 3; // Rook
const WB = 4; // Bishop
const WN = 5; // Knight
const WP = 6; // Pawn
// Black pieces
const BK = 7;
const BQ = 8;
const BR = 9;
const BB = 10;
const BN = 11;
const BP = 12;
// Starting position
board = [
[BR, BN, BB, BQ, BK, BB, BN, BR],
[BP, BP, BP, BP, BP, BP, BP, BP],
[ES, ES, ES, ES, ES, ES, ES, ES],
[ES, ES, ES, ES, ES, ES, ES, ES],
[ES, ES, ES, ES, ES, ES, ES, ES],
[ES, ES, ES, ES, ES, ES, ES, ES],
[WP, WP, WP, WP, WP, WP, WP, WP],
[WR, WN, WB, WQ, WK, WB, WN, WR]
];
This is easy to understand and it will work. Lots of board representations have been devised in the last 60 years. See the page about Board Representation on the Chess Programming Wiki for more information.
Every board representation comes with its own strengths and weaknesses. A strength can be as simple as "easy to understand", while a weakness can be "uses lots of memory." So what board representation should we use?
In Rustic the choice has been made to use the bitboard approach (CPW page). In the beginning this is somewhat harder to understand compared to the array approach mentioned above, but once you get the hang of it this representation becomes very natural to work with. Also, current-day computers are 64-bit, and a (Western!) chess board has 64 sqaures... this seems a match made in heaven.
Now that you know what a board representation is and why we need it, we can start looking into how Rustic tackles this part of the chess engine.
Well... after you made sure you thoroughly understand the binary system and bitwise operations Last chance to study these topics before you're going to need them. A lot. Don't say I didn't warn you. I did: in the second prerequisite.)
The next chapters will explain how bitboards work and how the board representation is organized.
Bitboards
So, what is a bitboard? In the chess engine world, this is a 64-bit integer which is a representation of something on the board. "Something" can be a piece on a square, if a square is occupied or not, if a piece can move to that square, and many more things.
In Rustic, the bitboards least significant bit (LSB), is bit 0, and in the case of squares, it represents square A1. When printing a bitboard to the screen however, the square A1 is on the bottom left, so effectively, the bitboard s printed backwards in rows of 8, and upwards. Let's take a look at some concrete examples.
Let's put a knight on e4:
bitboard, LSB on the far right:
00000000 00000000 00000000 00000000 00010000 00000000 00000000 00000000
^e4 ^a4 ^a3 ^a2 ^a1
printed backwards and upwards:
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 1 0 0 0
0 0 0 0 0 0 0 0 <- h3
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 <- h1
^a1
This bitboard would have the value "268_435_456" internally, but we are (almost) never going to use these values directly. If you ever need to see what a bitboard holds, you will need to print it to the screen, because the debugger will just tell you the value. For example, what does this mean?
Reading FEN-strings
Zobrist Hashing
Game State
Game History
Detecting the bishop pair
It is very useful to know if one side or the other has a bishop pair, as having the two bishops can be an advantage in some positions. Some of these advantages include:
- You can checkmate a lone enemy king with the bishop pair, but not with two knights.
- The bishop pair can control two adjacent diagonals over a long distance. The enemy king cannot cross this double diagonal.
- Both bishops together can control any square on the board; if you lose one, that square color will be weakened.
- Bishops are stronger than knights on an open board. As you move towards the endgame, more and more pieces get traded. The board becomes more open, so having the bishop pair can be an advantage.
First we need to know what "having the bishop pair" even means. We can define it as follows:
- One side must have at least two bishops. (Can also be three or more.)
- One bishop must be on a white square, and another on a black square.
If a player has two or more bishops but they are all on the same colored squares, he does not have the bishop pair. This can happen if the player trades the dark_squared bishop (retaining the light-squared bishop) and later promotes a pawn on a light square to a bishop. That would leave the player with two light-squared bishops, which do not form a bishop pair.
Even if you have five bishops on the same colored squares, you don't have the bishop pair. In fact, if you have a king and five same-squared bishops against a bare king, you can't win the game.
The best way to determine if either black or white has a bishop pair is to just ask the board itself, so we should add a function to calculate this. It turns out to be very simple:
pub fn has_bishop_pair(&self, side: Side) -> bool {
let mut bishops = self.get_pieces(side, Pieces::BISHOP);
let mut white_square = 0;
let mut black_square = 0;
if bishops.count_ones() >= 2 {
while bishops > 0 {
let square = bits::next(&mut bishops);
if Board::is_white_square(square) {
white_square += 1;
} else {
black_square += 1;
}
}
}
white_square >= 1 && black_square >= 1
}
That's it. We get the bishops for either white or black. We have to end up with at least one on the white squares, and one on the black squares, so we must count how many of each we have.
If we have at least two bishops, we start looking if we have a pair. With one or zero bishops, we already know we won't have a pair. While we still have bishops, there's still hope. We get the square the current bishop is on, and then we determine if the square is white or black. We increment the appropriate value. At the end the function returns true if both white_square and black_square are at least 1.
The important part is the call to Board::is_white_square(). This function determines if a square is white. There's a nice trick involved there which needs some explanation. This is Board::is_white_square():
pub fn is_white_square(square: Square) -> bool {
let rank = square / 8;
let even_square = (square & 1) == 0;
let even_rank = (rank & 1) == 0;
(even_rank && !even_square) || (!even_rank && even_square)
}
It is very short, but also somewhat cryptic. The function accepts a square number from 0 to 63. First it determines the rank number on which this square is located. This ranges from 0 up to and including 7. So now we have the square number and the rank number. The function determines if those numbers are even.
If the rank number is even, but the square number uneven, we have a white square. Or the other way around: if the rank number is uneven but the square number is even, we also have a white square. To understand this, take a look at the sketch below.

There are two numbers in each square. In the upper left corner is the rank number the square belongs to, which runs from 0-7. In the lower right corner is the square number itself, which runs from 0-63. The blue and orange lins have these meanings:
- The blue lines indicate ranks and squares having an EVEN number.
- The orange lines indicate ranks and squares having an UNEVEN number.
If you take a closer look at the pattern these lines make, you'll see:
- The black squares either have a completely blue, or a completely orange
cross. This means that a square is black if:
- The rank and square numbers are both EVEN (blue cross).
- The rank and square numbers are both UNEVEN (orange cross).
- The white squares have either a blue/orange or an orange/blue cross. This
means that a square is white if:
- The rank number is EVEN and the square number is UNEVEN (blue horizontal line, orange vertical line).
- The rank number is UNEVEN and the square number is EVEN (orange horizontal line, blue vertical line).
As such, that is the result for the Board::is_white_square() function:
(even_rank && !even_square) || (!even_rank && even_square)
So now we can determine if a side has the bishop pair by making the following two function calls:
let white_bishop_pair = board.has_bishop_pair(Sides::WHITE);
let black_bishop_pair = board.has_bishop_pair(Sides::BLACK);
Very nice. This will be very useful in determining if the position is a draw or not. It can also be used as a term in the the evaluation function to take the presence of the bishop pair into account.
Detecting cannot force mate
In the section Detecting draws by FIDE rules the "Draw by insufficient material rule." is explained and implemented. This rule makes it possible to claim a draw due to having insufficient material to win the game. However, that rule states that a draw can only be claimed if there is no way to achieve checkmate by any order of legal moves.
There are some material combinations such as King+Knight vs. King+Knight where a mate can be achieved by legal moves, even though one of the players has to assist in getting mated. In case of these material combinations a draw cannot be claimed under the FIDE rules. When implementing XBoard, the engine has to know about this.
The function to detect this situation is draw_by_insufficient_material_rule().
When searching however, the situation is a bit different. Even though the game cannot be claimed as draw in the King+Knight vs. King+Knight situation, it is, in practice, still a draw because checkmate cannot be forced. We need a different function to detect such conditions, which is then used during the engine's search routine. It can be used to score a position where mate cannot be forced as a draw. The engine will actively try to achieve such a position when it is losing or try to avoid it when it is winning.
This function is sufficient_material_to_force_checkmate().
Checkmate can still be forced if one of the following conditions is true:
- There is still a queen, rook, or pawn on the board. (The pawn can potentially promote to a rook or a queen in the future, which can then be used to force a checkmate.)
- At least one player has the bishop pair. This means that one player has to have two bishops, and they must be on squares of different color. (See: Detecting the bishop pair).
- At least one player has both a bishop and a knight.
- At least one player has at least three knights, which can be achieved by promoting a pawn. Granted, this is a super-rare situation, but with three knights against a lone king, checkmate can be forced.
The following function checks the above conditions. If any of them applies it returns true to indicate that checkmate can still be forced. It can be found in the file draw.rs:
pub fn sufficient_material_to_force_checkmate(&self) -> bool {
let w = self.get_bitboards(Sides::WHITE);
let b = self.get_bitboards(Sides::BLACK);
w[Pieces::QUEEN] > 0
|| b[Pieces::QUEEN] > 0
|| w[Pieces::ROOK] > 0
|| b[Pieces::ROOK] > 0
|| w[Pieces::PAWN] > 0
|| b[Pieces::PAWN] > 0
|| self.has_bishop_pair(Sides::WHITE)
|| self.has_bishop_pair(Sides::BLACK)
|| (w[Pieces::BISHOP] > 0 && w[Pieces::KNIGHT] > 0)
|| (b[Pieces::BISHOP] > 0 && b[Pieces::KNIGHT] > 0)
|| w[Pieces::KNIGHT].count_ones() >= 3
|| b[Pieces::KNIGHT].count_ones() >= 3
}
During the search the engine will call this function after every move. As soon as it returns false, the engine scores the evaluation of the position as a draw and it will be useless to search any deeper.
Detecting draws by FIDE rules
To be able to play chess properly, the engine has to be able to detect draws. In addition, to fully support the XBoard protocol (see the chapter about communication), it has to be able to claim these draws. This chapter describes how to use the board representation to detect them. The functions listed in this chapter can be found in the file draws.rs in the Board-module.
The FIDE Handbook of the Laws of Chess can be downloaded here: download (FIDE-site)
Draw by fifty moves rule
This is stated as follows:
The game is drawn, upon a correct claim by a player having the move, if:
... he writes his move, which cannot be changed, on his scoresheet and declares to the arbiter his intention to make this move which will result in the last 50 moves by each player having been made without the movement of any pawn and without any capture, or 9.3.2 the last 50 moves by each player have been completed without the movement of any pawn and without any capture.
And important part in this rule is "by each player": both players have to make 50 moves without a capture or a pawn move for a draw to be claimed. Many people stop playing after 50 moves in total, which would thus be 25 moves by each player. This is incorrect.
The board keeps a game state, which contains the so-called "half-move clock". This is a counter, which counts the number of ply's (a "ply" is one move by one player) that have been played without capturing a piece or moving a pawn. As soon as a piece is captured or a pawn is moved, this counter is reset to 0. Therefore a draw can be reported very easily:
pub fn draw_by_fifty_move_rule(&self) -> bool {
self.game_state.halfmove_clock >= MAX_MOVE_RULE
}
MAX_MOVE_RULE is set to 100, because each player has to play 50 moves, which is 100 ply in total. When this function is called it returns true if the game, at that point, is drawn by the fifty move rule.
Draw by three-fold repetition rule
This rule is one of the more extensive draw rules, stated as follows:
The game is drawn upon a correct claim by the player having the move, when the same position, for at least the third time (not necessarily by a repetition of moves):
a. is about to appear, if he first writes his move on his scoresheet and declares to the arbiter his intention to make this move, or...
b. has just appeared, and the player claiming the draw has the move.
Positions as in (a) and (b) are considered the same, if the same player has the move, pieces of the same kind and colour occupy the same squares, and the possible moves of all the pieces of both players are the same. Positions are not the same if a pawn that could have been captured en passant can no longer be captured in this manner. When a king or a rook is forced to move, it will lose its castling rights, if any, only after it is moved.
To be able to follow this rule, the engine has to have Zobrist hashing implemented already, so positioins can be uniquely identified. Everything in the position has to be the same. This includes "available moves", "player to move", "en passant", and "castling rights" as mentioned above.
Example: if the white king could castle in a certain position, then moves one square from E1 to D1 and back again from D1 to E1, a position may be reached that looks the same. It isn't, because the white king has lost its castling rights, so the moves available to the white player are not the same anymore. The Zobrist-keys take this and other nuances into account.
What we have to do is compare the position that is on the board at this very moment, to the positions that have been seen in the past. The engine's game state keeps a history array, which contains the Zobrist-keys for every position that has been on the board since the beginning of the game.
To determine if the game is a draw, we walk through this history array backwards. Each time we encounter the same Zobrist-key as the one we have for the current position, we increment a counter.
We can stop searching for repetitions as soon as we encounter a pawn move or a piece capture, because such a move irrevocably changes the position. Therefore our current position can never have been on the board before this point.
This is the function:
pub fn draw_by_repetition_rule(&self) -> u8 {
let mut count = 0;
let mut stop = false;
let mut i = (self.history.len() - 1) as i16;
while i >= 0 && !stop {
let historic = self.history.get_ref(i as usize);
// If the historic zobrist key is equal to the one of the board
// passed into the function, then we found a repetition.
if historic.zobrist_key == self.game_state.zobrist_key {
count += 1;
}
// If the historic half move clock is 0, it indicates that
// this position was created by a capture or pawn move.
stop = historic.halfmove_clock == 0;
// Search backwards
i -= 1;
}
count
}
We can claim a draw by threefold repetition if this function returns at least 2: the current board position has been encountered twice before, so the one we have on the board right now is the third repetition.
Draw by insufficient material rule
Note: If you are only implementing the UCI protocol, then you can omit this function. It is only needed for the XBoard protocol to be able to claim draw by insufficient material under the FIDE rules. The engine also has a function called sufficient_material_to_force_checkmate(), which is used while searching and can score drawn positions, which is sufficient for UCI.
There is a rule in chess which is often misunderstood: draw by insufficient material. It seems simple: if none of the players have enough material on the board to checkmate the king, a draw can be claimed. However, this is not exactly what handbook says. The relevant rule is as follows:
The game is drawn when a position has arisen in which neither player can checkmate the opponent’s king with any series of legal moves.
This is slightly different from the commonly held belief of "not having enough material to checkmate the enemy king can be claimed as a draw." If you have a King+Bishop vs. King, then the position is a draw, because no player can checkmate the other using any series of legal moves. What about King+Bishop vs. King+Bishop, or King+Knight vs. King+Knight? These positions should be drawn, because no player can achieve a mate, right? Wrong...
K+B vs. K+B:
K+N vs. K+N:
Foul play! I hear you cry. Black must have assisted white in achieving these mates, because they cannot be forced! True enough: these mates cannot be forced. Black must assist white by blocking his own king with his last move, so white can deliver mate. However, these positions are achievable through a series of legal moves. This means that a player can not claim a draw on the basis of insufficient material. Of course, both players can agree on a draw because they both know that the other will not assist in getting mated. Even if one of the players refuses to agree on a draw, the game will eventually run into the 50-move rule and be drawn anyway.
Because of the above, the function which implements this rule (draw_by_insufficient_material_rule()) is different from the function used during searching (sufficient_material_to_force_checkmate()). This function determines if it is legal, according to the rules of chess, to claim a draw and end the game. The function used during searching determines if one of the players is still able to force a checkmate, and if not, to score the position as a draw.
Note that no mate can ever be achieved in a King+Bishop vs. King+Bishop position, if both bishops are on squares of the same color. Any position with that material is therefore claimable as a draw.
The function implementing this rule is fairly simple. It just lists all the positions in which a draw by insufficient material can be claimed:
pub fn draw_by_insufficient_material_rule(&self) -> bool {
// Get the piece bitboards for white and black.
let w = self.get_bitboards(Sides::WHITE);
let b = self.get_bitboards(Sides::BLACK);
// Determine if at least one side has either a Queen, a Rook or a
// pawn (qrp). If is the case, a draw by rule is not possible.
let qrp = w[Pieces::QUEEN] != 0
|| w[Pieces::ROOK] != 0
|| w[Pieces::PAWN] != 0
|| b[Pieces::QUEEN] != 0
|| b[Pieces::ROOK] != 0
|| b[Pieces::PAWN] != 0;
// No queens, rooks or pawns. We may have a draw.
if !qrp {
// King vs. King
let kk = w[Pieces::BISHOP] == 0
&& w[Pieces::KNIGHT] == 0
&& b[Pieces::BISHOP] == 0
&& b[Pieces::KNIGHT] == 0;
// King/Bishop vs. King
let kbk = w[Pieces::BISHOP].count_ones() == 1
&& w[Pieces::KNIGHT] == 0
&& b[Pieces::BISHOP] == 0
&& b[Pieces::KNIGHT] == 0;
// King/Knight vs. King
let knk = w[Pieces::BISHOP] == 0
&& w[Pieces::KNIGHT].count_ones() == 1
&& b[Pieces::BISHOP] == 0
&& b[Pieces::KNIGHT] == 0;
// King vs. King/Bishop
let kkb = w[Pieces::BISHOP] == 0
&& w[Pieces::KNIGHT] == 0
&& b[Pieces::BISHOP].count_ones() == 1
&& b[Pieces::KNIGHT] == 0;
// King vs. King/Knight
let kkn = w[Pieces::BISHOP] == 0
&& w[Pieces::KNIGHT] == 0
&& b[Pieces::BISHOP] == 0
&& b[Pieces::KNIGHT].count_ones() == 1;
// King/Bishop vs. King/Bishop
let kbkb = w[Pieces::BISHOP].count_ones() == 1
&& w[Pieces::KNIGHT] == 0
&& b[Pieces::BISHOP].count_ones() == 1
&& b[Pieces::KNIGHT] == 0;
// Both bishops have to be on the same colored square for a
// draw to be claimable.
let same_color_sq = if kbkb {
let wb_sq = w[Pieces::BISHOP].trailing_zeros() as usize;
let bb_sq = b[Pieces::BISHOP].trailing_zeros() as usize;
Board::is_white_square(wb_sq) == Board::is_white_square(bb_sq)
} else {
false
};
if kk || kbk || knk || kkb || kkn || (kbkb && same_color_sq) {
return true;
}
}
// All other cases cannot be claimed as a draw.
false
}
Search
Introduction
Move ordering
The reason
Why would you want to order the moves the engine is going to search? The answer to that may not be immediately obvious. However, if you take a moment to take into account how a human thinks about a position, everything will become clear. Let's review an example that could occur in an actual chess game. Consider the following position, with white to move:

There are a lot of possible moves to be played. The rook and the queen have 22 moves together, and we're not even counting the king, the knight, and the pawns. You may ask yourself: Which moves should I consider first? That is a perfectly valid question, because moves are not of equal value. For example:
- a2-a3? Nah. That move does nothing.
- Nf5xg7, winning a pawn? No. The black king will just recapture.
- Maybe Nf5-e7+, attacking the king and the rook at the same time?
The last move seems to be great: it attacks the black king AND the rook, and after the king moves, the rook can be captured. Great. But wait... there's bigger fish to fry.
Rc2xc8!
At first glance, this move only seems to trade rooks after Qb8xc8, but you have to realize that it's now the black queen on c8, and the move Nf5-e7+ is still on the board: the knight now attacks the black king and the queen, instead of the rook, for an even greater gain.
So, what if black doesn't take the white rook, moving the queen off the back rank with Qb8-a7? White could still play Nf7-e7+, but now that would be a mistake. The better move is:
Rc8xe8#!
Black is checkmated. The black queen is unable to move off the back rank and keep the knight defended at the same time. (Note that Qb8-e5 does not work, because white will just capture with d4xe5.) Qb8xh2+ is possible because it checks the white king, but then white will just recapture, so that's no good either. Worse, the Rc8xe8 checkmate threat is still on the board.
Thus after Rc2xc8, the response Qb8xc8 is the only good move, even though white will follow it up with Nf5-e7+, which wins the black queen and the game. (You could therefore say that Qb8xc8 is the least bad move instead of the best move.)
The takeaway from this example is that most moves on the board are not even considered by humans. If you're in the middle of an attack such as in the position discussed above, it's unlikely you're going to consider moves such as a2-3. The order in which moves are mostly evaluated is the following:
- Captures are evaluated before quiet moves.
- Good captures are evaluated before bad captures.
- Checks come before completely quiet moves.
- Quiet moves will only be evaluated if there are no immediate threats to to make against the opponent's king, or no threats against the king or one's one pieces needs to be addressed.
Using techniques like these, humans narrow down the list of moves to examine. A chess engine has to do the same thing, or it will be examining a lot of useless moves. This slows it down in finding good moves and it will be weaker because of that. If the engine has a way of examining good moves first, it can skip searching huge chunks of the move tree, which will be saving a lot of time in finding the promising variations. It can use this saved time to search deeper. The engine will become stronger as a result.
In this chapter, we'll discuss various techniques on how to order moves during the search. To make the gains of move ordering more tangible, an example of a search with and without move ordering will be provided in the MVV-LVA chapter.
How move ordering works
Move ordering is a very important part of a chess engine, as described in the previous chapter. There are several techniques involved, but if you get right down the basics, what happens is quite easy to understand. Let's take the alpha-beta function, which performs the core part of the search in a chess engine. To keep this simple, all parts not related to move ordering have been removed.
pub fn alpha_beta(
mut depth: i8,
mut alpha: i16,
beta: i16,
pv: &mut Vec<Move>,
refs: &mut SearchRefs,
) -> i16 {
// Early alpha/beta exit conditions are here
...
// Generate all the moves for this position.
refs.mg .generate_moves(refs.board, &mut move_list, MoveType::All);
// Do move scoring, so the best move will be searched first.
Search::score_moves(&mut move_list, tt_move, refs);
// Loop through all the moves, where moves will be picked.
for i in 0..move_list.len() {
// Put the move with the highest sort score at the loop's index.
Search::pick_move(&mut move_list, i);
// Get this move and examine it.
let current_move = move_list.get_move(i);
// Search logic here ...
...
...
}
// We found the value for our best move. Return this.
alpha
}
This is basically it. The alpha-beta function receives everything it needs for its search. These will be explained in detail on the page about the alpha/beta-function. Further down, the function does the following things:
- Generate all the moves
- Call score_moves()
- Pass a reference to the move list that was just generated.
- Pass the transposition table move.
- Pass the collection with references that hold information about the search.
- score_moves() iterates over the move list, giving each move its own sort score (using the SORTSCORE field in the move integer).
- Then alpha/beta starts examining moves, by iterating over the move list itself. The loop starts counting at 0 (first move in the list), and obviously ends with the last legal move in the position.
- pick_move() puts the move with the highest sort score at the index where the move loop currently is.
- Then that move is put into current_move, so we don't have to move_list.get_move(i) over and over again in the loop.
So, only points 2 and 5 are involved in ordering, and then picking the moves. We'll see score_move() in several different versions when we discuss the move ordering techniques, but pick_move() is always the same:
pub fn pick_move(ml: &mut MoveList, start_index: u8) {
for i in (start_index + 1)..ml.len() {
if ml.get_move(i).get_sort_score() > ml.get_move(start_index).get_sort_score() {
ml.swap(start_index as usize, i as usize);
}
}
}
It receives a reference to the move list, and the index it should start at. This is the index where alpha/beta's move loop currently is. Then pick_move() runs through the list, and keeps swapping moves with higher scores into the start_index location. In the end, the move having the highest sort score will end there, so alpha/beta will examine this move next. (This function could be optimized a tiny bit to do only ONE swap, but for some reason, this made Rustic slower each time I've tried it. Maybe this optimization will make it in later.)
Sidenote Why do we assign sort scores to the moves, and then use pick_move() to swap one move to the current index of the move list while alpha/beta iterates over it? Couldn't we just physically sort the list before the move loop starts, and be done with it?
It's possible, but we don't do that because of how alpha/beta works. If alpha/beta encounters a move that is so good that searching further down the move list is no longer required, then it will exit and return the evaluation score of that move. (This is a so-called beta-cutoff.) If you physically sorted all the moves before the move loop starts, you may have sorted lots of moves that may never be examined by alpha/beta. This costs time, and thus it makes the engine slower.
You can hear and read "move ordering" and "move sorting" interchangeably. The difference is that "move ordering" does the "score and pick" approach, and "move sorting" physically sorts the entire move list. The result is the same, but move ordering is the faster approach, as described above.
MVV-LVA
(Implemented since Rustic Alpha 1.)
MVV-LVA stands for Most Valuable Victim, Least Valuable Attacker. This is a move ordering technique that does exactly what it describes: it orders the capture moves, ordered from the strongest to the weakest. The more valuable the captured piece is, and the less valuable the attacker is, the stronger the capture will be, and thus it will be ordered higher in the move list. As a consequence, the alpha-beta function will search the stronger captures first, which causes it to find better moves faster, and thus it can disregard large portions of the search tree.
In Rustic, MVV-LVA has been implemented since the first version, Alpha 1. In version Alpha 2, it still is the only move sorting technique. It is recommended that you implement this function directly after you get iterative deepening and alpha-beta working, because it is instrumental to getting your engine to play decent chess.
The function is implemented as a two-dimensional array:
// MVV_VLA[victim][attacker]
pub const MVV_LVA: [[u8; NrOf::PIECE_TYPES + 1]; NrOf::PIECE_TYPES + 1] = [
[0, 0, 0, 0, 0, 0, 0], // victim K, attacker K, Q, R, B, N, P, None
[50, 51, 52, 53, 54, 55, 0], // victim Q, attacker K, Q, R, B, N, P, None
[40, 41, 42, 43, 44, 45, 0], // victim R, attacker K, Q, R, B, N, P, None
[30, 31, 32, 33, 34, 35, 0], // victim B, attacker K, Q, R, B, N, P, None
[20, 21, 22, 23, 24, 25, 0], // victim N, attacker K, Q, R, B, N, P, None
[10, 11, 12, 13, 14, 15, 0], // victim P, attacker K, Q, R, B, N, P, None
[0, 0, 0, 0, 0, 0, 0], // victim None, attacker K, Q, R, B, N, P, None
];
The two-dimensional array is addressed in the victim-attacker order, just as MVV-LVA says. Each combination has a value. For example:
let value1 = MVV_LVA[Pieces::BISHOP][Pieces::PAWN];
In this case, the victim is a Bishop, while the attacker is a pawn. The value of this capture as provided by the MVV-LVA array, will therefore be 35. Another capture could be:
let value2 = MVV_LVA[PIECES::QUEEN][Pieces::ROOK];
Now, the a Queen can be captured by a Rook. The value of this capture would thus be 52. If the engine has a choice between both captures, the second one will be ordered higher in the move list. The engine will search it first, because capturing the queen using a rook is probably better (5 points of material gain), than capturing the bishop using a pawn (2 points of material gain.)
Sidenote You may be wondering about the values in the table: why did I choose 10, 11, 12.... and then 20, 21, 22... for the next row? The simple answer is that these values are arbitrary. The only thing that counts is to make sure that the values for the better captures are higher. Instead of 10, 11, 12..., I could have chosen 137, 180, 231,... and then for 20, 21, 22,... the values could have been 400, 473, 474. It doesn't matter. Personally, when choosing values like these, I like to keep them as small as possible. Then they are easier to understand, easier to edit if necessary, and they fit into smaller integers.
This is how the ordering is implemented, when MVV_LVA is the only sort scoring functionality in the engine:
pub fn score_moves(ml: &mut MoveList) {
for i in 0..ml.len() {
let m = ml.get_mut_move(i);
let value = MVV_LVA[m.captured()][m.piece()];
m.add_score(value);
}
}
I already said that implementing this optimization would be easy. We just run through the move list, grab the combination of the victim and attacker from the MVV_LVA-array, and we add it to the move's score. (Because this is the only place in the engine where moves are scored for sorting, the member function add_score() is renamed to set_sort_score() in later versions of Rustic, so you may see set_sort_score() in other move sorting chapters.)
An example of the effectiveness of this implementation is given below. In the first run, Rustic Alpha 1 is searching a position without any move ordering in place. In the second run, MVV-LVA has been enabled as the one and only move ordering technique. (For brevity's sake, part of the output string has been truncated, as the extra information is not relevant for this example.)
The FEN of the position is:
r3k2r/p1ppqpb1/bn2pnp1/3PN3/1p2P3/2N2Q1p/PPPBBPPP/R3K2R w KQkq - 0 1
Run without MVV-LVA move sorting:
info score cp 25 depth 1 seldepth 5 time 878 nodes 5161544
info score cp 25 depth 2 seldepth 5 time 6429 nodes 38369245
info score cp 20 depth 3 seldepth 5 time 121942 nodes 722743000
Run with MVV-LVA move sorting:
info score cp 25 depth 1 seldepth 3 time 0 nodes 1598
info score cp 25 depth 2 seldepth 3 time 1 nodes 3196
info score cp 20 depth 3 seldepth 5 time 2 nodes 7315
info score cp 20 depth 4 seldepth 5 time 6 nodes 20260
info score cp 5 depth 5 seldepth 9 time 25 nodes 76603
info score cp 20 depth 6 seldepth 7 time 109 nodes 293985
info score cp 5 depth 7 seldepth 9 time 543 nodes 1333835
info score cp 5 depth 8 seldepth 9 time 2900 nodes 7288058
info score cp -40 depth 9 seldepth 11 time 16933 nodes 39339223
The first run without MVV-LVA took over 2 minutes to reach depth 3 in the given position. The second run, with MVV-LVA enabled, took only 16.9 seconds to reach depth 9. The cause of this is the number of nodes that is being saved in the search. In the first run, the engine searched 722.743.000 (722.7 million) moves to reach depth 3. In the second run, the engine only searched 7.315 nodes to reach the same depth. Even at depth 9, the number of nodes is 'just' 39.339.223 (39.3 million), which is a fraction of the 722 million from the first run.
Granted, MVV-LVA is one of the techniques that brings the greatest improvements. That is the reason why it is recommended to implement it first. On top of this technique, further move ordering enhancements can be implemented which will be discussed in the next chapters.
Sidenote So, what about the 0's in the table? The number of elements for the rows and columns in the table is "NrOf::PIECE_TYPES + 1". There are 6 piece types in chess: King, Queen, Rook, Bishop, Knight, Pawn. Rustic also has a piece type called Pieces::NONE, which is used when there's no piece on a square, or, in the move integer, there is no piece in the Promotion or Captured fields. That is where the +1 comes from.
In MVV_LVA move ordering, Piece::NONE cannot be the victim and it also cannot be the attacker. You can't capture TO an empty square. You also can't capture FROM an empty square. Therefore all the cells that have Piece::NONE as part of the combination are 0. (A move FROM an empty square isn't even possible... supposing your engine has no bugs in the move generator.) The top row is also 0, because the King cannot be captured.
Setting up the MVV_LVA array this way makes the scoring function somewhat simpler because you don't have to check if the FROM and TO square contain pieces and that you're not capturing a king.
Killer move heuristic
(Implemented since Rustic Alpha 3.)
Killer moves are another big time saver when searching. In some positions, they can have an effect as big as ordering on the transposition table move. Because killer moves are much easier to implement than a transposition table, I would advise to implement them before implementing the TT, instead of after, as I did in Rustic. (I just wanted the TT really badly because of its huge Elo-boost, and I felt particularly greedy at the time 😎)
Explanation
First let's take a look at what a killer move even is.
A "killer move" is a quiet move that does not capture anything, but brings a large advantage for the side making that move. The move is so named because it is so strong that the opponent must react to it, and thus it "kills" lots of moves the opponent may have. Let's consider a simple position in which nothing much is happening, but white is a bit better. White is to move, but obviously, black is also thinking.

Assume black is thinking to improve the position of his knight: "My knight is on the rim. I remember from my lessons: a knight on the rim is dim. When it is my move again, I'm going to play it to f6, and then to d5. It will be in the middle of the board, where it is much more active."
However, it is white to move, and he makes the unassuming move f3-f4. If black fails to consider the change in position after the deceptively simple pawn move and blindly goes ahead with his plan of Nh5-f6, white will respond with:
Bg2-c6!

This Bishop move, which came available after f3-f4, attacks both the rook and the queen. This is the "killer move", because it "kills" all the black moves that do not take this threat into account. After f3-f4, this move is available after several moves black may play, such as Nh5-f6, or g7-g6, or a6-a5.
This is obviously a very simple position: most moves black could play are obviously not that great. The point is still the same though: the move Bg2-c6! is a good move after a variety of black moves. This can be defined as follows:
A "killer move" is a quiet, non-capturing move which can cause a beta-cutoff in different branches of the tree at the same ply.
This may not be very intuitive. A diagram will probably help to clarify. See the following diagram below:
Let's assume the search finds the move f3-f4 for white at ply 5 deep, and it plays it on the board. The search will then iterate through moves for black at ply 6, and Nh5-f6 will be among them. Next it's white's turn again, at ply 7, and the search finds the move Bg2-c6! with big material gain.
Because of this big material gain, Nh5-f6 can't be played (it is "killed"), because it doesn't take this threat into account.
The search now assumes: "Black just played Ng5-f6 at ply 6. White's move Bg2-c6 at ply 7 is a great move. Maybe this is also good after some other moves black may play at ply 6? Let's keep this in mind. We (white) try this move first, each time it's my move at ply 7!"
This assumption is correct: the move is a good response against at least two other black moves, so it "kills" them too. In those two branches, the killer move will turn out so strong, that searching deeper is useless, because if black should allow this move without taking care of the threat, black's going to lose big time. This is called a beta cutoff.
(This is the reason why the Bg2-c6 doesn't work against Qd7-e8. After this move, black can escape from the double attack. He has thus taken care of the threat, and the killer move doesn't work. Try to find out later, how black escapes! If you can't see the solution, look at the bottom of this page.)
This is exactly the killer move heuristic does: it saves the Bg2-c6 move in a "killer move slot", so the search can try it first before other moves, in similar positions at the same ply (in this case, at ply 7 for white). Most engines have two slots. (See the sidenote at the bottom for the reason why.) When ordering the move list, the killer moves get a very high score so they are just below the MVV-LVA captures. If the captures turn out to be no good, the killer moves are the first to be tried, which greatly speeds up the search.
Implementation
Now that we know what a killer move is, let's take a look at the implementation.
Rustic has an instance of a struct called SearchInfo, which contains all the information about the currently running search. The killer move slots are part of this struct:
type KillerMoves = [[ShortMove; MAX_DEPTH as usize]; MAX_KILLER_MOVES];
pub struct SearchInfo {
some stuff here...
pub killer_moves: KillerMoves,
and so on...
}
The KillerMoves type is an array, indexed by the number of killer moves (MAX_KILLER_MOVES is 2), and the maximum depth the search can reach in ply (MAX_PLY, which is set at 125 for Alpha 3). This gives us two killer moves at each depth, which can be stored and retrieved by:
let ply = 7;
let k1 = killer_moves[0][ply];
let k2 = killer_moves[1][ply];
The above code retrieves the first and second killer moves for ply 7. See the sidenote at the bottom of this page why we keep two killer moves, instead of one or three or even more.
The killer move is saved like this:
pub fn store_killer_move(current_move: Move, refs: &mut SearchRefs) {
let ply = refs.search_info.ply as usize;
let first_killer = refs.search_info.killer_moves[0][ply];
// First killer must not be the same as the move being stored.
if first_killer.get_move() != current_move.get_move() {
// Shift all the moves one index upward...
for i in (1..MAX_KILLER_MOVES).rev() {
let n = i as usize;
let previous = refs.search_info.killer_moves[n - 1][ply];
refs.search_info.killer_moves[n][ply] = previous;
}
// and add the new killer move in the first spot.
refs.search_info.killer_moves[0][ply] = current_move.to_short_move();
}
}
The function receives the current_move from the alpha-beta search, and the "refs" pointer to the struct that stores all the references to the information the search needs. A reference to the instance of SearchInfo is also in there.
We first get the ply count, and cast it to usize so we can use it to index arrays. Then we get the first killer, just as was shown before in the short code snippet. The if-statement is the part where we make sure the two killers are unique: if the first killer move and the one we are inserting are different, then both moves will always be different.
We loop through the list of killer moves (granted: it's a short loop), and we shift all of them one index upward: so the move on spot 0 is copied to spot 1, and the move on spot 1 is gone. (If we did have three killer moves, this one would go into spot 2.) Then the new killer move is inserted into spot 0 of the killer_moves array.
Ordering moves is a little bit more complicated than ordering on MVV-LVA. The score_move function needs to be extended, to take the killer moves into account. This is the score_moves function, with killer move ordering added:
const MVV_LVA_OFFSET: u32 = u32::MAX - 256;
const KILLER_VALUE: u32 = 10;
pub fn score_moves(ml: &mut MoveList, tt_move: ShortMove, refs: &SearchRefs) {
for i in 0..ml.len() {
let m = ml.get_mut_move(i);
let mut value: u32 = 0;
if m.captured() != Pieces::NONE {
value = MVV_LVA_OFFSET + MVV_LVA[m.captured()][m.piece()] as u32;
} else {
let ply = refs.search_info.ply as usize;
let mut i = 0;
while i < MAX_KILLER_MOVES && value == 0 {
let killer = refs.search_info.killer_moves[i][ply];
if m.get_move() == killer.get_move() {
value = MVV_LVA_OFFSET - ((i as u32 + 1) * KILLER_VALUE);
}
i += 1;
}
}
m.set_sort_score(value);
}
}
There are some new parts in the function. At first, we have these two new constants:
const MVV_LVA_OFFSET: u32 = u32::MAX - 256;
const KILLER_VALUE: u32 = 10;
Ordering the moves is done in this way:
- The transposition table move comes first (to be discussed)
- Then the captures using MVV-LVA
- After that, the two killer moves
- And below the killer moves are the moves sorted by the history heuristic (to be discussed)
To make this work, we use the MVV_LVA_OFFSET. This value is the maximum a 32-bit integer can hold, minus 256. So, we have 256 values available, 'above' MVV_LVA_OFFSET. Now you can see why in the MVV-LVA chapter, I said that I like to keep values as low as possible. As the highest value in the MVV-LVA table is 55, it will fit comfortably in the 256 values which are available above MVV_LVA_OFFSET.
First, we iterate through the move list, so we can order all the moves. This is the first for-loop:
for i in 0..ml.len() { ... }
Inside the loop we now need to distinguish between captures and non-captures. This is what the if-statement does.
if m.captured() != Pieces::NONE { ... }
If the move is a capture, it will be ordered using the MVV-LVA table just like before (see the chapter about MVV-LVA), but now, it shifts everything upward by the MVV_LVA_OFFSET.
So, we get space (values) available 'below' MVV_LVA_OFFSET.
If the move is not a capture, we try to find it in the list of killer moves. We just loop through the list as long as we are not at the end, and as long as the sort value is still 0.
while i < MAX_KILLER_MOVES && value == 0 { ... }
(As long as the sort value is 0, the move is not yet ordered.) As soon as we match the move in the move list against a killer move, its sort value is calculated, using both MVV_LVA_OFFSET, and KILLER_VALUE (which is set to 10).
value = MVV_LVA_OFFSET - ((i as u32 + 1) * KILLER_VALUE);
If we match the first killer at spot 0, the value will be:
value = MVV_LVA_OFFSET - ((0 + 1) * 10)
value = MVV_LVA_OFFSET - (1 * 10)
value = MVV_LVA_OFFSET - 10
Pay attention to the fact that we are sorting killers below MVV_LVA_OFFSET this time, so the killers will be a lower value than any capture. The first killer will be 10 points below MVV_LVA_OFFSET, the second killer will be 20 points below. (I you should manage to make 3 or 4 killers work, they would be 30 and 40 points below respectively.)
Now the killers are ordered below the captures. Killer move ordering is now done, and alpha-beta will use pick_move() to pick a killer before an unordered move.
Sidenote: number of killer moves We keep two killer moves because we don't keep one or three moves. Well, maybe that's not an answer :) The reason why we don't keep three, four or 20 killer moves per ply is because they work best if each killer move is unique: so no slots have the same move. Making sure that two killer moves are unique is very easy and fast, as you saw in the implementation above.
Experience and testing have shown that two unique killer moves per ply are better than only one killer move per ply, and that trying to keep three or more killer moves unique from one another costs more time than the killer move feature saves. Therefore, in most engines, keeping more than two killer moves unique is detrimental to the engine's speed. With the implementation above, you could try to keep three killer moves for each ply, hoping that some of the time there will be no duplicates. In some engines this can work, but you will have to try and test it.
Puzzle solution
Here is the position from the start again:

White plays f3-f4, setting up the double attack threat Bg2-c6, which can be used as a killer move after various black moves. Question: Why does Bg2-c6 not work anymore as a killer move, when black plays Qd7-e8?
Solution After Qd7-e8, Bg2-c6 doesn't work as a killer move because black can escape the double attack by playing Qe8-e3+. White has to get his king out of check, for example Kg1-h1, and then black can move the rook on b5 out of danger.
Warning: Qe8-e3+ is quite a sneaky move. If white tries to to trade the queens by blocking the check with Qb2-f2, the rook on c1 becomes undefended and the black queen can capture it with another check!
TT-move ordering
(Implemented since Rustic Alpha 2.)
TT-move ordering is a technique where a move returned from the transposition table is ordered first in the list, in front of all other moves. The hard part is implementing the TT itself; ordering on the TT-move is very easy.
You should have already implemented the TT to be able to add the TT-move ordering. If you haven't, it's recommended to take a look at the Transposition table chapter. It explains what the TT is exactly, how it works, and how it can be implemented. Then add create the TT first. In that chapter it's also explained that the TT can store moves which cause a beta cutoff, or are part of the principal variation.
It should be very obvious by now why you want to try either of those moves ASAP. If a move causes a beta cutoff, you want to try it as soon as possible, because it saves you having to search large parts of the tree. If a move is a PV-move, you also want to try it as soon as possible, as it makes you find your best move faster. Again, the result is that you have to search less.
This is the alpha/beta function with all the parts removed that have nothing to do with TT-move ordering:
impl Search {
pub fn alpha_beta( ... ) -> i16 {
// ...
// Variables to hold TT value and move if any.
let mut tt_value: Option<i16> = None;
let mut tt_move: ShortMove = ShortMove::new(0);
// Probe the TT for information.
if refs.tt_enabled {
if let Some(data) = refs
.tt
.lock()
.expect(ErrFatal::LOCK)
.probe(refs.board.game_state.zobrist_key)
{
let tt_result = data.get(depth, refs.search_info.ply, alpha, beta);
tt_value = tt_result.0;
tt_move = tt_result.1;
}
}
// ...
// Generate the moves in this position
let mut move_list = MoveList::new();
refs.mg
.generate_moves(refs.board, &mut move_list, MoveType::All);
// Do move scoring, so the best move will be searched first.
Search::score_moves(&mut move_list, tt_move, refs);
// ...
// Iterate over the moves.
for i in 0..move_list.len() {
Search::pick_move(&mut move_list, i);
let current_move = move_list.get_move(i);
// ...
}
// ...
}
}
We have two variables, tt_value and tt_move which contain the value and the move coming from the TT, if there are any. Assume we have a TT-move in the tt_move variable. We generate all moves as we normally do, and then call score_moves(). We pass the tt_move variable into score_moves(). We have seen this function before, but now it has been extended to take the TT-move into account:
const TTMOVE_SORT_VALUE: u32 = 60;
pub fn score_moves(ml: &mut MoveList, tt_move: ShortMove, refs: &SearchRefs) {
for i in 0..ml.len() {
let m = ml.get_mut_move(i);
let mut value: u32 = 0;
// Sort order priority is: TT Move first, then captures, then
// quiet moves that are in the list of killer moves.
if m.get_move() == tt_move.get_move() {
value = MVV_LVA_OFFSET + TTMOVE_SORT_VALUE;
} else if m.captured() != Pieces::NONE {
// Order captures higher than MVV_LVA_OFFSET
value = MVV_LVA_OFFSET + MVV_LVA[m.captured()][m.piece()] as u32;
} else {
let ply = refs.search_info.ply as usize;
let mut n = 0;
while n < MAX_KILLER_MOVES && value == 0 {
let killer = refs.search_info.killer_moves[ply][n];
if m.get_move() == killer.get_move() {
// Order killers below MVV_LVA_OFFSET
value = MVV_LVA_OFFSET - ((i as u32 + 1) * KILLER_VALUE);
}
n += 1;
}
}
m.set_sort_score(value);
}
}
The constant TTMOVE_SORT_VALUE was added, which is higher than the highest value from the MVV-LVA table. (See the chapter about MVV-LVA move ordering, should you have forgotten this.) We then iterate through all the moves, but now we first take a look if the move we are at in the list, is the TT-Move which was just passed into the function. If so, we assign MVV_LVA_OFFSET + TTMOVE_SORT_VALUE to "value", and the TT-move will therefore be ordered higher than any move ordered by MVV-LVA. As a result, this move will be put first by the pick_move() function.
That's all, folks. Told you this was an easy one 😊 (After you implement the TT itself that is, which isn't that easy at all 😱)
Sidenote: what about ordering on the PV-move? There is a technique called PV-move ordering, which orders the best move from the previous iteration in the first spot. Ordering the move works the same was is with the TT-move; the only difference is that you pass the PV-move to the score_move() function instead of the TT-move. This is easier to implement, because you don't need a TT for it. As the TT stores the PV-moves (and the cut-moves), PV-move ordering is inherently built into TT-move ordering. If you have a TT, PV-move ordering becomes superfluous.
There is a chance your the TT-entry holding the PV-move for the position you are in was overwritten with a different entry so you have no PV-move to order on. As far as I know, many engines take this risk for granted and don't implement PV-move ordering is a fallback. It's probably not worth it with regard to Elo gain. Rustic does not implement PV-move ordering.
Evaluation
Understanding evaluation
If we could solve chess perfectly, we would have to know only one of three things: are we checkmated, is the opponent checkmated, or is the game a draw? To be able to determine this, we would need to look through all possible combinations of moves, to reach all possible ending positions, so we can see if either we or the opponent are checkmated, or if the position cannot be won by any player, we have thus drawn the game.
This is obviously not feasible. The number of chess positions are bigger than all the atoms in the universe. It would take until the end of time (and beyond) to calculate all the possible paths to all the possible ending positions.
We do have opening books with some very long lines in them, and endgame tablebases containing all endgames with up to 7 pieces (even though they require a huge amount of storage space). So, the beginning and ending of the game are coming closer together. The middle game, the part just after the opening and before the beginning of the endgame database is still vast enough that this part of the game cannot be bridged by any computer or program (yet).
Because we cannot see the end of the game, we have to try and win by taking small steps, move by move. We therefore have to determine if a move is good. Human chess players use many criteria to determine if a move is good.
If I play this move:
- Will it check the opponent?
- Can I win material with it?
- Do I lose material because of it?
- Does it improve my position?
- Does it make my king (un)safe?
And so on.
We try to come up with an idea, such as: "I want to get my knight to d5", or "I want to open the f-file for an attack." Then we run through the moves in our head to reach the position where the knight is on d5, or the f-file is open. In humans, this is called "thinking ahead"; in computer chess, this is called "searching." (How to search will be explained in a different chapter.)
After we reach the position, we have to determine if this position is good for us. If you got the knight on d5 and won the d5-pawn in the process, the position may be good for you. If you lost the queen in the process, it's probably bad; and you'll have to look for a different way to get the knight on d5, without losing the queen.
Determining if the position at the end of a search is good or not, is called evaluating the position. A chess engine therefore has an evaluation function which has lots and lots of ways to examine a position.
It asks questions, such as: "Are my pieces on good squares?", "Is my king safe?", "What is the material balance?", "Do I have the bishop pair?" These questions are called evaluation terms. The evaluation function runs through all of the terms. Each of these terms carries a certain weight which is expressed as a value. For example: "I have the bishop pair, so I raise my evaluation by 25 points."
The function comes up with final value in which all the evaluation terms have been considered. This value indicates if a position is good or bad. This called the evaluation score. The evaluation score is always determined from the side which is doing the evaluation: if white is evaluating and white it a pawn up, the position is +100. If it was black to move in the same position then he would be a pawn down, and the evaluation is therefore -100.
This chapter discusses how the engine comes up with this evaluation score.
Material counting
Explanation
Counting the material on the board is the most basic way of evaluating a chess position. You can just compare material: "I have one bishop and 2 knights, while my opponent has two bishops and one knight. So light pieces are equal. We both have a queen, and one rook, so that is equal as well. He has four pawns and I have five... so in the end, I'm one pawn up." That is basically it, with counting material.
Somewhat more complicated is the relationships between pieces, if material is not equally balanced. How much pawns is one knight worth? How much bishops + pawns is a rook worth? What do I need to get back for giving up my queen (if it's not the queen of the opponent)? Humans often use this:
- Queen: 9 points
- Rook: 5 points
- Bishop: 3 points
- Knights: 3 points
- Pawn: 1 points
So a knight is worth 3 pawns. A Rook is worth one bishop and two pawns. Two rooks are worth slightly more than a queen. Three light pieces are exactly worth a queen too. A queen is also equivalent to a rook + knight + pawn, and so on.
Some people prefer the values below. For example, Max Euwe (World Champion 1935-1937), who was a well-known chess educator in the middle part of the 20th century:
- Queen: 9.5 points
- Rook: 4.5 points
- Bishop: 3 points
- Knights: 3 points
- Pawn: 1 points
So a bishop and knight are still worth three pawns, but Euwe states that a rook cannot be exactly compensated. A knight + pawn is just a tiny bit too little, while a knight + two pawns is a tiny bit too much. Two rooks fall just short of a queen, but rook + pawn is a bit over; same with 3 light pieces versus a queen.
Which is the best setup? It completely depends on your playing style. I've tested both setups with Rustic Alpha 1 and 2, and with my own hand-written PST's (Piece-Square Tables), the first setup is the better. Versions of Rustic with the first setup are about 30-40 Elo stronger than the ones with the second setup. If I had written my PST's differently, it could have been the other way around, so in your engine, try both.
For more information about PST's, see the chapter about Piece-Square tables. Also take into account that later, the material values will be worked directly into the Piece-Square tables, and these will then be automatically tuned. This is going to be discussed in the "Tapering and Tuning" chapter.
Implementation
Implementation difficulty of counting the material balance is in the easy to lower medium range. We will be discussing two methods of calculating the material balance: counting from scratch, and keeping the value updated incrementally. (Rustic uses the second approach.)
Value per piece type
Rustic has a struct holding all the piece types in its Board definitions:
pub struct Pieces;
impl Pieces {
pub const KING: Piece = 0;
pub const QUEEN: Piece = 1;
pub const ROOK: Piece = 2;
pub const BISHOP: Piece = 3;
pub const KNIGHT: Piece = 4;
pub const PAWN: Piece = 5;
pub const NONE: Piece = 6;
}
Using this, we can create an array which assigns a numerical value to each piece. These are implementations of the values from above:
pub const PIECE_VALUES: [u16; 6] = [0, 900, 500, 320, 310, 100];
// --- or this one ---
pub const PIECE_VALUES: [u16; 6] = [0, 980, 475, 320, 310, 100];
So, this table can be indexed by:
let v = PIECE_VALUES[Pieces::Queen]; // v = 900 here.
let v = PIECE_VALUES[Pieces::Pawn]; // v = 100 here.
There are two differences with the values we saw earlier in the chapter.
First, the evaluation is MUCH faster if we only use integers. If at all possible, we will never use floating point numbers while calculating anything. (Except, maybe, in the very last step before outputting something.) This is the reason why we have multiplied all the values by one hundred. This gives the evaluation a way to think in terms of "parts of a pawn", by using values such as 50 or 10, instead of 0.5 or 0.1.
Second, in most chess engines, it turns out to be better to have the bishop be worth slightly more than the knight, and that a light piece is a bit better to have than three pawns. This causes the engine to not trade a bishop for a knight (all else being equal), and it'll keep the light piece instead of trading it for three pawns.
Again, don't fret too much about these values because in later stages of engine development, they will be merged with the Piece-Square Tables and automatically tuned.
Counting vs. Updating
Counting from scratch
There are two ways you could know the material balance when evaluating the position: you can count the material at that point, or you update the value incrementally.
Counting the material balance at the point where you need to know it is very straightforward. Below is a piece of pseudo-code. This doesn't come from Rustic (and it isn't even Rust), because the engine doesn't do it this way. This implementation is given for completeness' sake, because the difference of "counting vs. updating" will crop up again later, when implementing the PST's. We can then skip this example.
Here's the pseudo-code to count the material balance for use in the evaluation function:
function count_material(position) {
let value_white = 0;
let value_black = 0;
// For example, piece-type == knight
for each piece_type in all_piece_types {
// For all knights on the board
for each piece in piece_type {
if piece is a white piece {
value_white += PIECE_VALUES[piece];
} else if piece is a black piece {
value_black += PIECE_VALUES[piece];
}
}
}
let balance = value_white - value_black;
return balance;
}
That is basically it. Run through all the pieces, and add the piece's value to either the white or black counter. Then subtract black from white, and you have the balance: if positive, white is ahead, if negative, black is ahead. If material balance is exactly equal, then "balance" will obviously be zero.
Incrementally updating
Counting from scratch works perfectly well, but it is slow: each time the engine evaluates a position, it has to calculate the material balance all over again. At most it only has to add 2x 16 values and then subtract the results, but think about the fact that this has to be done millions and millions of times a second during a search run.
When playing a game of chess yourself, you're not going to re-count the pieces from scratch each time the position changes. At the beginning of the game you know the material balance is equal. When you move a piece and there is an exchange, you can then adjust the material balance to the new situation.
This is exactly what Rustic Alpha 1 and 2 do, and it's called an incremental update. To achieve this, Rustic keeps the material in a game state struct, along with some other things it needs to know throughout the game:
pub struct GameState {
pub active_color: u8,
pub castling: u8,
pub halfmove_clock: u8,
pub en_passant: Option<u8>,
pub fullmove_number: u16,
pub zobrist_key: u64,
pub material: [u16; Sides::BOTH],
pub psqt: [i16; Sides::BOTH],
pub next_move: Move,
}
The "material" member keeps the total material value for each side throughout the game. It is initialized in the Board class, when the engine starts:
fn init(&mut self) {
// ...
let material = material::count(&self);
self.game_state.material[Sides::WHITE] = material.0;
self.game_state.material[Sides::BLACK] = material.1;
// ...
}
The function counting the material is an implementation of the pseudo-code given before, in Rust:
pub fn count(board: &Board) -> (u16, u16) {
// Material values for white and black.
let mut white_material: u16 = 0;
let mut black_material: u16 = 0;
// Make a shorthand for the white and black piece bitboards
let bb_w = board.bb_pieces[Sides::WHITE];
let bb_b = board.bb_pieces[Sides::BLACK];
// Iterate over white and black bitboards by piece_type
for (piece_type, (w, b)) in bb_w.iter().zip(bb_b.iter()).enumerate() {
// Get white and black pieces of the type "piece"
let mut white_pieces = *w;
let mut black_pieces = *b;
// Add material values for white (for each piece of this piece type)
while white_pieces > 0 {
white_material += PIECE_VALUES[piece_type];
bits::next(&mut white_pieces);
}
// Add material values for black (for each piece of this piece type)
while black_pieces > 0 {
black_material += PIECE_VALUES[piece_type];
bits::next(&mut black_pieces);
}
}
// Return values
(white_material, black_material)
}
After starting, the engine will know the total material value for both white and black. As soon as a piece is removed from a square (by capture, or by moving it) or put down on a square the material value is updated:
pub fn remove_piece(&mut self, side: Side, piece: Piece, square: Square) {
// ...
self.game_state.material[side] -= PIECE_VALUES[piece];
// ...
}
pub fn put_piece(&mut self, side: Side, piece: Piece, square: Square) {
// ...
self.game_state.material[side] += PIECE_VALUES[piece];
// ...
}
Note that this will even take promotions into account: remove a pawn from the 7th rank and thus subtract the knight's value from the material count, and then put a queen down, adding the queen's value back to the material count.
Obviously, the remove_piece() and put_piece() functions also remove and put the piece, and they keep other incremental scores next to the material, such as as the incrementally updated PST's. That code has been omitted.
With this implementation, the engine knows the white and black piece value count throughout the entire game, and the evaluation function only has to do one thing to use that as a basis:
pub fn evaluate_position(board: &Board) -> i16 {
// ...
let w_material = board.game_state.material[Sides::WHITE] as i16;
let b_material = board.game_state.material[Sides::BLACK] as i16;
// Base evaluation, by counting material.
let mut value = w_material - b_material;
// ...
// Return evaluation value.
value
}
As you can see, the re-counting of the piece value from scratch has been reduced to just getting the white and black values from the board and then subtracting them.
This exact same technique will also be used to implement the Piece-Square tables incrementally, which will be the next chapter. It is exactly the same, only there are 6 arrays instead of 2 (one for each piece type instead of each side), and 64 integers per array instead of 6 (one for each square instead of for each piece type).
Now that we know how we can keep the material count, we can move on to the next part of the evaluation: Piece-Square Tables. These will add a little bit of positional knowledge to the chess engine on top of the material count, so it will have an idea of what to do with a piece.
Piece-Square Tables
Explanation
Piece-Square Tables (henceforth PST, often also called PSQT) are the most fundamental part of an engine's evaluation function. Without PST's it's very hard to get the engine to play a decent game of chess. They have been present since the very first version of the engine. After counting material, this is the first evaluation function to implement.
PST's are exactly what the name implies: they are tables that indicate which piece goes where. A PST's value indicates if a square is a good square for a piece, or it isn't. Because you only have an empty board and the piece itself available to evaluate if a square is good or not, you can only look at the piece's mobility. On top of that, you can encode a tiny bit of positional knowledge into the PST's.
Let's take a look at how it's done. This is a PST for a rook, from white's point of view (a1 is on the lower left, to make the table easy to read and edit):
const ROOK_MG: Psqt = [
0, 0, 0, 0, 0, 0, 0, 0,
15, 15, 15, 20, 20, 15, 15, 15,
0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 10, 10, 10, 0, 0
];
The rook has 14 squares available to move to, from any square on the board. Given only that criterion, it doesn't matter where the rook is. It always has 14 squares available on an empty board. Therefore, most of the values in the table are 0.
Now we also see a tiny bit of the mentioned positional knowledge: we know that the rook is strong on the 7th rank, so all squares on that rank get a small bonus. We also know the rook is often strong on the two middle files, so they get a bonus as well. F1 gets a bonus as an extra encouragement for the king to castle, as the rook is better on f1 than it is on h1.
A somewhat busier PST is the one from the knight:
const KNIGHT_MG: Psqt = [
-20, -10, -10, -10, -10, -10, -10, -20,
-10, -5, -5, -5, -5, -5, -5, -10,
-10, -5, 15, 15, 15, 15, -5, -10,
-10, -5, 15, 15, 15, 15, -5, -10,
-10, -5, 15, 15, 15, 15, -5, -10,
-10, -5, 10, 15, 15, 15, -5, -10,
-10, -5, -5, -5, -5, -5, -5, -10,
-20, 0, -10, -10, -10, -10, 0, -20
];
There are many squares that give negative values for the knight. The closer to the edge and corners it is, the greater the negative impact. In the middle 16 squares, the knight gets a bonus. The reason is that the knight, as opposed to the rook, loses mobility as it is closer to the edge and corners. So, it is better in the middle of the board.
We do not encode any other positional knowledge into the knight's PST; we can't, because what is really the best location for a knight, depends on the placement and interaction of all the other pieces.
We repeat this for all other pieces, so we end up with 6 PST's: one for the king, queen, rook, bishop, knight, and pawn.
Caveats
You may be wondering: but it's not a given that THIS piece always needs to be on THAT square. It changes during the game. In the endgame, the pieces should go on different squares than they were on in the opening and the middle game.
That's true; it's the first caveat. The PST's are only a very rudimentary guideline for the engine. Imagine a beginner, who has just learned the rules. He doesn't have an inkling of where the pieces should be placed. As a general rule, you can tell him: King castled, rooks on the two middle files, knights in the middle, bishops on long distance, targeting the center or the position of the enemy king. The beginner player then has some notion of what he has to do to get the game going. The PST's do the same thing for the chess engine.
In a later chapter we will discuss the so called tapered evaluation, which gives the engine two sets of PST's, one set for the opening/middle-game, and one for the endgame. Then the engine can gradually "glide" from the opening/middle-game PST into the endgame PST as the game progresses. (The values are "tapered", or interpolated, between the opening/middle-game and endgame values.)
After that, we will also write an automatic tuner, which populates the PST's with values that give good results; often better than what you will be able to do by hand. (Not to mention that tweaking 12 PST's by hand is boring and it has to be re-done after you change anything in either the search or the evaluation!)
Tapering and tuning the PST's is a major strength boost for most engines.
(You can see the _MG postfix for "middle-game" in the PST names already, to take into account that the engine is going to have a tapered and tuned evaluation at some point.)
The second caveat is that PST's, by themselves, can't take the dynamics of the game into account. A white knight might generally be great on e5, but there are many reasons to come up with why it would be better on d6, or c4, or any other square, depending on the placement of the other pieces.
For this, we need the evaluation terms that take the dynamics of the game into account, and which modify the evaluation score after the PST's have had their say. The more evaluation terms we have, the less important the PST's become, but together with the material count, they always remain the basis; the starting point to get going.
Implementation
Implementing the PST's is not hard, but it requires a bit of thought.
Communication
Introduction
User-Computer communication through the years
A chess engine is a very simple entity: you give it a position and have it think for some time. When this time is up, the engine returns the best move it has found. We are now faced with the problem of how we are going to get positions and moves into the engine, and how the engine can communicate its reply to us.
The first early commercial chess computers where released in the late 1970's. These computers looked more like a large calculator than a chess board. Indeed, you often had to put in moves in algebraic notation and then wait for the computer to print its response on a display. You had to have a separate normal chess set available to move the pieces, so you could see the position.
In later iterations of this idea the calculator-like interface became smaller, until it could be integrated into a small chess board. This was already starting to look like the chess computers that had their high point in the mid-eighties to the late nineties.
The next step to improve the communication between the player and the chess computer were the so-called sensory boards. You didn't have to type in your moves, but could use the piece you wanted to move to press down on the starting and ending squares. This would let the computer know which move you made. The computer would then indicate the reply, either on a small display, or using LED's to indicate the from and to squares of the piece it wanted to move. You would then go ahead and press those squares for the computer.
The last step would come in the form of some high-end chess computers that were coveted back in the day, and are collector's items now. These computers had reed contacts: you could just move a piece from one square to he other without having to press down on it. In addition, the computer was often hidden inside the board in a pull-out drawer. After you set up the system, you could slide the drawer into the board. The only indication that his was a chess computer would be the LED's on the board which the computer would use to indicate its move. Some of these boards were made out of expensive woods, and the chess computer was built like a module that could be replaced by a newer version, or one with a different program.
In the beginning of the 80's, we also started to see chess programs for personal computers. They started off similar to the tabletop chess computers: in the earliest programs you had to type in moves. In later programs you could click and drag the pieces using a mouse. This is still the standard today.
In the late 1990's, the Dutch company DGT released a large tournament-size chessboard that connects to a computer. You can play moves in the same style as you do on the reed contact tabletop computers: just move a piece, and the move is made. These are the boards you see at top-level tournaments, where the ongoing games are broadcasted over the internet, without someone having to transfer the move from the player's board to an internet-connected computer.
The problem of the early software
In the beginning, resources were limited. Early chess computers came with 2 MHz processors and 1 or 2 kB of useable RAM. In our age, where laptops and desktop computers have speeds of 2000-4500 MHz, and the normal amount of memory is at least 8 GB (8.388.608 kB), the resources available on a computer back in the day seem hilarious. Still, people managed to write chess programs for them, but they had to be very economical about it. Not a single byte could be wasted, lest your program had a few bytes less memory to use, or a few less cpu-cycles available. Tht could make it weaker than competing programs.
The result of this was that the chess engine, and the communication capabilities were wrapped up into a single program. The user interface of the program and the part that calculated the next move, were often intertwined. In a dedicated chess computer that would mean that you'd need to rewrite the user interface part if you wanted to use the chess program in a different computer. On a PC, it would mean that each chess program came with its own user interface. If you liked the user interface of one chess program, but would like to use the playing style of another, you'd be out of luck.
The solution
The obvious solution to this problem is to split the chess program into two parts: the engine, and the user interface / GUI and establish a communication protocol between them. This effort started in the beginning of the 1990's, when XBoard-author Tim Mann devised a way to use XBoard as a GUI for the GNU Chess program. The communication between XBoard and GNU Chess evolved into what we now know as CECP, or the "Chess Engine Communication Protocol". Many people just call it "XBoard" or "WinBoard" (which is the Windows-version of XBoard). In this book and in Rustic, this protocol is also called XBoard.
The XBoard protocol was not designed: it grew as new features were needed. Nowadays, XBoard even supports chess-like games as "variants". Because of this, fully and correctly supporting XBoard can be an involved task.
This led to the creation of a communication protocol that was less work to write. Stefan Meyer-Kahlen, the author of the Shredder chess engine, designed UCI (Universal Chess Interface) in 2000, which was specifically written to allow the chess engine developer to spend as little time as possible on GUI-engine communication. Granted, UCI does not support some things XBoard can, especially in the case of chess variants. (To support those, there are several 'dialects' of UCI, for Korean Chess for example.) Its fast implementation and short specifications did make it the de-facto standard for current-day chess engines.
Even so, both protocols are equally competent regarding standard chess, and choosing one or the other (or both) comes down to philosophical reasons. How you want your chess engine to work is a greater factor in determining the protocol you are going to use than the technical differences. It is fully possible to support both protocols in a chess engine, which is exactly what Rustic does. You obviously have to choose which protocol to use; you can't run the engine with both protocols active at the same time.
Let's first take a look at a high-level description of both protocols, and a comparison between how they expect a chess engine to work.
XBoard vs. UCI
Comparison
Below is a table comparing some of the most important differences between the UCI and XBoard protocols.
XBoard | UCI | |
---|---|---|
Commands | Short | Long |
Commands in | 18 | 7 |
Commands out | 13 | 9 |
Supports features | yes | yes |
Supports openings | yes | yes |
Debug mode | yes | yes |
Messages | yes | yes |
Time management | yes (game) | yes (move) |
Play in terminal | yes | bothersome |
Protocol type | Stateful | Stateless |
Who is the boss? | Engine | GUI |
Can resign | yes | no |
Can offer draw | yes | no |
Can accept draw | yes | no |
Supports variants | yes | dialects |
As you can see int he table above, there are many similarities between the two protocols, but also many differences. The first three rows already indicate a large difference. Rustic implements 7 incoming and 9 outgoing commands for UCI; it implements 18 incoming and 13 outgoing commands for XBoard.
The short commands used by XBoard were an advantage in the past, when parsing commands was done on a character by character basis. Parsing long commands was very tedious and more importantly, very error prone. It was even harder and more tedious to write good error checking routines. However, having more commands also means more stuff to write.
Nowadays many newer programming languages don't require you to parse commands character by character, building tokens as you go. They have splitting functions, iterators, built-in parsing functions (from string to int, for example) including error checks, and so on. Therefore the long commands sent and received by UCI are much less of a hassle with current programming languages.
The other big difference is the fact that XBoard is stateful, where UCI is stateless. An XBoard engine needs to be aware of the state of the game to be able to play. It also needs to be aware of the state the engine is in (observing, waiting, thinking, analyzing). This is not required for UCI, so creating this functionality again requires more code.
In short, UCI is faster to implement by writing less code, as it sends much more information back and forth per command. It also doesn't have to deal with engine and game state. This was one of UCI's design goals. This is the main reason why UCI is now the default protocol for most newer engines.
The difference between "stateless" (UCI) and "stateful" (XBoard) is the one where the philosophical debate is at. I'll try to explain.
Philosophy for XBoard
The XBoard protocol was designed in a time where user interfaces on computers were not as far along developed as they are today. They were very basic: displaying a the board, a clock, the move list, and (maybe) the variation the engine was thinking about. There were two ways of having chess engines play against one another:
- use the user interface as a match/tournament organizer.
- run each engine on its own computer, communicating through a serial interface connecting the two computers.
Because resources and user interface functionality was limited (and sometimes, there even wasn't a user interface involved in an engine vs. engine match), XBoard was designed to make the engine smart.
Smart engine, dumb user interface.
What this means is that the engine functions a lot like a human would.
- At the start of the game, you give it a time control.
- The engine has its own opening book.
- It knows if it is playing black or white.
- It knows it is on move after the opponent has moved.
- It knows all the rules of chess.
- When lost, it can resign.
- It can claim draws by the rules such as three-fold repetition.
- When it thinks the position is equal, it offers a draw.
- When a draw offer comes in, it can be accepted or rejected.
You give both engines the information they need at the start of the game, and then send "go" to the engine playing white. The engine will make a move and send it to the opponent; which can be a human or another engine. If it is another engine, that other engine executes the incoming move. It knows it is expected to move now, so it starts thinking. Depending on the time control (depth, time per move, etc...) it terminates its search at an appropriate moment and returns the best move it found to the first engine.
And that's it: the engines are now playing a game of chess all by themselves until either one is mated or resigns, a draw is claimed by one of them on the basis of the rules, or they agree on a draw.
When using the XBoard protocol, the engine is a chess playing entity.
Philosophy for UCI
Playing chess is not an easy task. There's lots of rules you need to know and many situations can arise. So, you have to keep the state of the game within the engine to know if you exceeded the fifty-move rule, have a three-fold repetition, if you are even on move, if the engine is waiting or thinking, and so on... let alone write routines for loading and searching an opening book, determine if you should resign or not, accept or offer a draw or not, and so on. All of this requires extra code and extra testing.
By the year 2000, computers had progressed enough so that running an advanced user interface that could run matches and tournaments between engines on the same computer (giving the CPU first to one engine, then the other) was completely possible. Also: why should each engine implement code to read an opening book (or endgame tablebases), or verify draws or determine if it should resign? Those could be options in the user interface, where they could be set by the user, instead of leaving them up to the engine authors. (Even though those authors could obviously make those options configurable; but that was not a requirement.)
So the idea was: let's just rip everything that is not move searching from the engine and have this handled by the user interface.
Dumb engine, smart user interface.
UCI was created in 2000.
Because UCI does not require the engine to keep its state, the user interface sends long commands to the engine that make it set up its board from scratch each time a move is made. For example, the GUI sends "position startpos moves e2e4 e7e5 g1f3 b8c6" to the engine playing white, after black played Nb8-c6. The white engine sets up its starting position and plays all the moves after "moves"... and then it waits. Next, the GUI sends a "go" command with the time controls for the next move. Thus, UCI re-sends the game state and time control on each move, over and over again, so the engine does not have to maintain the game state.
It also doesn't have to maintain engine state (idle, thinking, analyzing, etc...) because UCI requires that the user interface keeps track of this and does not send inappropriate commands for the state the engine is in.
As you can imagine, this setup requires a lot less code to write by the chess engine programmer. The disadvantage incurred by this ease of use is that the engine cannot resign anymore, nor can't it offer, accept, or claim draws. These tasks are now done by the user interface.
When using UCI, the engine is a "find the best move in this particular position" program.
Summary
After the UCI-protocol was first introduced there have been heated debates on which of the two is "better": XBoard, which makes the engine behave more like you would expect from a human, or UCI, which is cleaner, faster and somewhat easier to implement. For a few years it was a toss-up between the established XBoard protocol and the UCI newcomer.
Chessbase, the largest supplier of chess software in the world, supported UCI almost immediately with Fritz 7, released in 2001. Before that date Fritz supported alternative engines but they used a proprietary Chessbase protocol. To get a chess engine to run under the Chessbase / Fritz GUI, you had to submit it to Chessbase and hope they would adapt it and offer it for download. As of November 2021, this download page is still available through the WayBack Machine (late 2006 version). The downloads still work, so if you want the old native Chessbase engines, grab them as soon as possible. (I can recommend Turing, which can be a nice match for an average club player.)
Click this link to visit the old download page
This page was taken offline in 2007. In newer versions of Fritz, Chessbase completely switched over to the UCI protocol.
At this time, the most important competitor to Fritz was the ChessMaster GUI. It supported the XBoard protocol for the use of alternative engines, which was a great plus over Fritz's proprietary protocol. Many free XBoard/WinBoard chess engines could be found across the internet that could be used with Chessmaster, but not with Fritz. (Except if you would use an XBoard-to-UCI adapter program, but this is not a user-friendly solution for the user groups Fritz and Chessmaster were targeting.) In the years after 2001 though, this GUI waned in popularity and the last version running The King 3.50 by Johan de Koning, was released in 2007. At the time of writing, this was almost 15 years ago.
UCI gained traction as Fritz 7 and newer got into user's hands while ChessMaster's popularity diminished. Nowadays UCI is the de-facto standard, while most new engines do not implement the XBoard protocol anymore.
Rustic still implements both UCI and XBoard for several reasons:
- I'm a completionist. Most GUI's support XBoard. So why not?
- I like to have and provide options. Maybe some people would like to use Rustic with a GUI which only supports XBoard and don't want to use a UCI-XBoard adapter.
- I wanted to have a go at implementing two different protocols in Rustic without making the core of the engine aware of the fact that it is running in one or the other. (Apart from forwarding incoming commands to the correct handler.)
- I like the thought of the engine being a "chess playing entity" instead of a "find the best move" program.
The two protocols are on par in Rustic, except with regard to reporting some statistics. UCI provides a few more stats. Even though XBoard can also provide those statistics, the command to provide them is not supported by most user interfaces I've tested.
Now that we have a general idea of both protocols and their differences, we can get into designing a communication module for Rustic.
How it works
When you have reached this point in the writing of your engine, you're very close to making it play its first game under a user interface. Assuming you have extensively tested the engine while writing it, making sure there are no bugs in the board representation, the move generator and move handling, then the only thing left is to set up the communication between the engine and user interface.
We just discussed the two main protocols used in the chess engine world: UCI and XBoard. First, we design a way to get commands into the engine and receive replies from it; then we decide if we want to use the UCI or XBoard protocol, or both. That is the most tedious part of the entire engine, because it just comes down to: "If you receive command X, you do action Y and send reply Z."
I'm sure you're eager to get your monstrosity ... awesome creation
... playing against either yourself or other engines, so let's get going.
First we remember the diagram from the concept page:
Both the engine and the user interface are a separate piece of software, running completely independently, so we have to get commands from the user interface into the engine, and replies from the engine into the user interface. This is done through "standard input/output", often called stdin and stdout. You have been working with stdin and stdout for a long time already during the writing of the engine. When you type a command on the command line, you're putting text into stdin; if the command line prints anything, that text comes out through stdout. (Or "standard error", stderr; which could also be something else, such as a printer, but nowadays, stdout and stderr are normally just the computer's display.)
At the very beginning of this journey, the only code you had was probably something like this:
const NAME: &str = "Engine";
const VERSION: &str = "0.1";
fn main() {
println!("{} {}", NAME, VERSION);
}
The output from "println" is sent to stdout, which normally is the screen. If you also have functions to read text typed into the engine (for testing, for example), then this text comes in through stdin, which normally is the keyboard.
As the user interface starts the engine, it uses a construct called a pipe: that is the red two-way arrow in the diagram above. The GUI connects its own stdout on one end of the pipe and stdin from the engine is connected to the other end. Thus, anything the GUI prints to its own stdout goes through the pipe, right into the stdin of the engine, where the engine can then read it. Obviously, the engine's stdout is connected to the GUI's stdin through the same pipe, so any text the engine prints is received by the GUI.
If you have ever used a command line, especially on Linux, you may have seen a pipe and may not have known the name. For example, let's see what files are in a directory:
$ ls
A list of files is printed on screen. Now we want to know if a file called "stuff.txt" is among those files. Instead of reading the entire list, we can do this, by searching for it with grep:
$ ls | grep -i "stuff.txt"
Note the pipe symbol | in between the two commands. This means: connect the stdout from "ls" to the stdin from "grep". This makes grep receive the printed list of files, so it can search through it. (stdout from "grep" is not connected to the stdin from "ls"; it is still connected to the display, or you would never be able to see the result.)
Now that we know how the communication works, we can design a way to set this up. This will be the next chapter.
Design
We finally arrived at the point where we can start designing the last piece of the puzzle to complete the baseline chess engine. For some people, creating the communication may be the first thing they do, or just do somewhere in between, but I prefer to save it for last. First let's take a look again at the architecture diagram from the Design page in the Introduction chapter:
The important parts in the diagram are the Engine, IComm, UCI, and XBoard parts (and to some extend, the Chess Program, but obviously we're not going to change that.) If we split off the unneeded bits in the diagram, we're left only with the bits relevant for this chapter:
The green objects, in this case the Engine and the Chess Program, are distinct and separate programs. IComm is a so-called interface, which is implemented by both UCI and XBoard. UCI and XBoard are active: this means they have threads, separate from the Engine thread, though all three belong to the same program.
Engine is the main thread, which is started as soon as the engine executable starts. The engine has an object which can initialize any module that implements the IComm interface. Depending on its command-line parameters, the engine either initializes UCI or XBoard. In Rustic, both are written in such a way as to instantiate two threads: one for incoming commands from the GUI, one for outgoing replies.
The fun part is: the Engine thread does not even know this. It doesn't know that UCI and XBoard are active threads; it doesn't even know that a different protocol is used. The only two things the engine is aware of are the following:
- I can receive messages from the part instantiated by IComm
- If an incoming message is received, I should put it through the function that handles that type of message. (This is the so-called "handler.")
- The handler instructs calls on the engine to perform certain functions, and sends an outgoing reply if necessary.
That's it. The engine has a receiver for incoming commands and it obtains a sender for outgoing messages as soon as it instantiates IComm. In this way, the engine doesn't need to know what the protocol even is: it only knows how to handle its incoming commands, and what replies it should send. The only part of the entire engine that knows anything about the protocol used is the handler; and each handler only knows one protocol. Visualizing this for UCI would look like this:
This is a fairly complicated diagram which requires some explanation, but in the end it will turn out to be fairly straightforward. First, we have the IComm interface in the lower left. This interface is used to build the structure you see in the diagram. When the IComm object is initialized, it sets up the in/out threads you see in the UCI part, and connects them to the engine's main thread with so called channels.
A channel is a way of communication between two threads. Channels are used to send messages between threads. One thread is the sender, while the other thread is the receiver. If you want two threads to both send and receive information back and forth, you will need two channels. (There probably are ways to do this with only one channel, but I haven't looked into any for Rustic.) If you want some more information about channels right now, you can find it in the Rust Programming Language book.
Let's see what happens if the chess program sends the "uci" command.
In the diagram above, the GUI prints the "uci" command as text to its own stdout. This is connected to the engine's stdin, where the UCI communication module sits waiting for input. It catches the "uci" text and puts it through a parsing function, which turns the command into an enum variant, which also happens to be called "uci". (It could have been called anything else.) Now the "uci" enum is sent to the main thread.
However... the main thread doesn't know anything about incoming commands; uci, xboard, or otherwise. It receives an enum variant called "uci", so it just passes it along to the handler. The handler knows about the "uci" enum and it determines that "someone" wants the engine to identify itself. The handler doesn't know how that "someone" is: a GUI, or a user typing on the command-line; it doesn't care. It just returns the enum variant "identify" to the place where the incoming command came from.
In the UCI Out thread, the "identify" enum is of course known. It means that it should execute the functions "id()", "options()", and "uciok(), which makes the engine print the engine and author name, the UCI options, and the text "uciok" to standard out. The GUI, whose stdin is connected to that stdout, receives the reply and we're done.
All other commands and replies will travel the same path regardless of which protocol they belong to. If the engine had been instantiated using the XBoard protocol, it wouldn't understand UCI commands (because it's the XBoard module running), but if you put in an XBoard command, it will handle it in exactly the same way.
Note that there could have been an extra entry at the right of the above sequence diagram: Actions. If a command comes in, it could be that the engine needs to perform some action, such as setting up the board. The handler will initiate this action and wait for the result. If required it will then send a reply based on that result.
In the case of the "uci" command the engine doesn't need to perform any actions, so this part has been left out. If we had described the "position" command, the engine would have set up the incoming position using an action, but it would not have sent back a reply because "position" doesn't require a reply.
If you have been watching carefully, you may have noticed that the engine only receives input from the outside through an instance of IComm (in this case, UCI or XBoard), and it only sends replies through IComm. It never receives or prints things from anywhere else. This is essential of keeping the engine itself clean of any protocol handling. If you use a threaded infrastructure such as this one, there is no need to read commands and print replies throughout the engine. This will keep the main engine and the protocol handling separated, which will make it easy to add new protocols in the future, should this be required.
The next topic will be on how to implement this threaded setup in Rust.
Implementing IComm
Implementing commands
Engine
Introduction
Progress
Playing Strength
As mentioned in the preface of this book, writing a chess engine can be a rewarding endeavour. It takes a long time to write a very strong engine, especially if you are just starting out. It takes some knowledge about programming and chess, lots of time, and often, perseverance, to get the first basic version up and running. After this version works and plays legal and somewhat decent chess, it can be improved incrementally. If done correctly, each new feature will add playing strength.
This is no different with Rustic. The first version, Alpha 1, is the baseline version. It only has the minimal amount of features to play legal, but decent chess:
- The board representation (Bitboards)
- Move generator (Fancy Magic Bitboards)
- Make/Unmake move on the board
- Alpha-Beta search
- Quiescence search
- Check extension
- MVV-LVA move sorting
- Evaluation (Material counting and PST's)
- UCI communication protocol
All other versions build on top of the previous version. The table below provides an overview of the added features per version and the gain in playing strength they provide. The strength gain is measured by playing the new Rustic version against the previous version. Please note that the results obtained in my tests will be different compared to your own, because each engine is different. Also, adding features in a different order will give different results.
Also take into account that results by self-play are inflated. Because one engine has a feature the other doesn't have, with that feature being the only difference, the newer engine will (ab)use this feature constantly. In the end the real increase in playing strength can only be measured in large tournaments. Self-play is used to prove that the newer engine is stronger than the previous version, not to obtain a rating. As a rule of thumb, at least for Rustic, it seems that 60% of the rating improvement obtained in self-play will 'stick' when running in a tournament.
Side note: A notable difference seems to be the tapered and tuned evaluation, and the refactors and optimizations done thereafter. They provide an improvement of 300 Elo in self-play, and the entire gain seems to carry over into gauntlets and tournaments, instead of the expected 60%. Hopefully the improvement will also carry over when playing more different engines for the CCRL rating list.
In the table below, we start by writing the baseline version, which is then known as version Alpha 1. CCRL tested this version with a result of 1675 ELo in their Blitz list.
The feature "TT cuts only" was built on top of Alpha 1. The rating increase in self-play against Alpha 1 was +42 Elo. Then the "TT Move Ordering" feature was built on top of the "TT Cuts Only" version, and this gained another +100 Elo in self-play. This completes the transposition table. This version became Alpha 2, which was tested in the CCRL list at 1815 Elo. Then "Killer moves" were built on top of Alpha 2... and so on.
Progress per feature and version
Version | Feature | Improvement | CCRL |
---|---|---|---|
Writing baseline... | |||
Alpha 1 | Baseline | 1675 | |
TT cuts only | 42 | ||
TT Move sorting | 103 | ||
Alpha 2 | 1815 | ||
Killer Moves | 56 | ||
PVS | 55 | ||
Alpha 3.0.0 | 1865 | ||
Alpha 3.1.112 | |||
Tapered & tuned eval | 248 | ||
Refactor / Optimize / Fix | 52 | ||
Later... | |||
Aspiration Window | ? | ||
History Heuristic | ? | ||
Null move pruning | ? | ||
... ? | ? |
Determining progression in actual playing strength
It is impossible to define the strength of a chess engine, or a human player for that matter, by an exact number. This is because of how the Elo-rating system works. The rating system works with a pool of players, and it determines their relative strength, from one player to another. Not every player can play every opening or time control equally well. It also happens that a certain player A consistently performs better against B than expected, but also consistently plays worse than expected against player C.
Chess engine testing is also not transitive, especially if the tested engines are fairly close in strength. This means that if engine A almost always wins matches against engine B, while engine B consistently wins matches against engine C, it is NOT the case that engine A will for certain consistently win against engine C.
If you test engines in your own gauntlets, it is therefore impossible and unadvisable to compare the results of your gauntlet against something like the CCRL rating list. The time control is different (and even if it's exactly the same, your computer is different), the opening books are different, and you are likely choosing different opponents. Your pool of engines to test is also smaller. All these factors will cause your gauntlet to have different ratings for each engine, including your own.
The only thing you can use your gauntlet for is to determine if "new engine" is stronger than "old engine", if both "new" and "old" play the same opponents, under the same time controls, with the same opening book, on the same computer, and with the same settings. The one conjecture you can make is that if "new" is 100 Elo stronger than "old" in your own gauntlets, then "new" will probably (but not certainly) be roughly 100 Elo stronger than "old" in a CCRL test. But they might accidentally choose an opponent against which your engine fares exceptionally well or very badly, while you did not have that opponent in your pool. This will lead to a higher or lower rating than expected.
SPRT testing
One of the best ways to test one engine against another is by using SPRT, which stands for "Sequential Probability Ratio Test." A program which can run such matches is cutechess-cli, by using the SPRT parameter.
It is beyond the scope of this book to explain all the mathematics behind these formulas, even if I'd understand everything in detail. I'm a programmer, not a statistician.
It is sufficient to understand the basics.
You run a match between two engines. This match does not have a set length, but to make sure it doesn't run 'forever', we normally set a very large number of games, such as 20.000. To make SPRT work, we need to set two hypotheses: one we hope that comes true (H1), and one we hope that isn't true (H0, the NULL hypothesis). We allow for a chance of 5% that the SPRT-test gives us the wrong result. So, we are 95% confident that the result is correct. (You can lower the 5% margin, but the test will run for a very long time. I deem 5% to still be practical; it is also the value used by many other engine authors.)
Let's say we have a new engine, called version NEW. We also have an old version, we call version OLD. We want to know if NEW is stronger than OLD, and by how much Elo.
Now we state:
- H0: Engine NEW is at least 0 Elo stronger than OLD.
- H1: Engine NEW is stronger than OLD outside an error margin of 10 Elo.
- We set confidence level at 95%, so there's a 5% chance of getting the wrong result from the test.
When running cutechess_cli, we give it the SPRT parameter:
-sprt elo0=0 elo1=10 alpha=0.05 beta=0.05
A cutechess_cli command could look like this:
cutechess-cli \
-engine conf="Rustic Alpha 3.15.100" \
-engine conf="Rustic Alpha 3.1.112" \
-each \
tc=inf/10+0.1 \
book="/home/marcel/Chess/OpeningBooks/gm1950.bin" \
bookdepth=4 \
-games 2 -rounds 2500 -repeat 2 -maxmoves 200 \
-sprt elo0=0 elo1=10 alpha=0.05 beta=0.05 \
-concurrency 4 \
-ratinginterval 10 \
-pgnout "/home/marcel/Chess/sprt.pgn"
With this command, we're testing 3.15.100 (NEW) against 3.1.112 (OLD), where we run 2500 rounds with 2 games each, so each engine plays the same opening with white and black. Time control is 10 seconds + 0.1 increment.
Now we start the match between our NEW and OLD engine version, and cutechess_cli will start to play games.
As long as the difference between NEW and OLD is between 0 and 10 Elo, the SPRT-test keeps running, because it cannot be determined which hypothesis is true. If NEW is 7 Elo stronger, then NEW is at least 0 Elo stronger than OLD, but it isn't outside the 10 Elo error margin yet. The test could still go either way. If this stays the case up to and including the maximum number of games to play, then cutechess_cli will abort without accepting either hypothesis, and the test is inconclusive.
On the other hand, if NEW is +100 Elo stronger than OLD after only 400 games, that is clearly outside the 10 Elo margin required for H1 to be accepted. As soon as cutechess_cli is 95% certain that the result will stay outside the error margin, the test will be aborted, and H1 accepted.
Likewise, if NEW is -50 Elo, then this is clearly weaker than OLD, and the error margin is outside the stated 10 Elo. As soon as cutechess_cli is 95% certain that the test will stay outside the error margin, the test is aborted and H0 is accepted.
If you increase elo1, the error margin becomes larger, and thus cutechess will be faster to terminate the test. Your self-play Elo will become much less accurate tough. Let's say you set this:
-sprt elo0=0 elo1=50 alpha=0.05 beta=0.05
If NEW is 60 Elo stronger, and the error margin is 50 Elo, cutechess_cli may accept that the engine is stronger, because 60 is outside the 50 Elo margin. Even though NEW is stronger, the strength range is 10-110 Elo: NEW can be anywhere between 10 and 110 Elo stronger, which is a very inaccurate determination. If you had set elo1 to 10, the range would have been 50-70 Elo. If you set elo1 to 2 Elo, the range would be 58-62 Elo, but that will take a very long time to test. I've found 10 Elo to be a fair setting, at least on my (current) i7-6700K running 4 games concurrently at 10s+0.1s.
SPRT Results
Below are the SPRT results between the different Rustic versions, for people who are interested in the specific results. The list starts from the oldest version at the top, down to the latest version.
TT-cuts
Score of Rustic Alpha 1.5 vs Rustic Alpha 1.1: 869 - 646 - 357 [0.560] 1872
... Rustic Alpha 1.5 playing White: 431 - 321 - 184 [0.559] 936
... Rustic Alpha 1.5 playing Black: 438 - 325 - 173 [0.560] 936
... White vs Black: 756 - 759 - 357 [0.499] 1872
Elo difference: 41.6 +/- 14.2, LOS: 100.0 %, DrawRatio: 19.1 %
SPRT: llr 2.95 (100.3%), lbound -2.94, ubound 2.94 - H1 was accepted
TT-move ordering
Score of Rustic Alpha 2 vs Rustic Alpha 1.5: 409 - 197 - 133 [0.643] 739
... Rustic Alpha 2 playing White: 218 - 84 - 68 [0.681] 370
... Rustic Alpha 2 playing Black: 191 - 113 - 65 [0.606] 369
... White vs Black: 331 - 275 - 133 [0.538] 739
Elo difference: 102.5 +/- 23.5, LOS: 100.0 %, DrawRatio: 18.0 %
SPRT: llr 2.95 (100.1%), lbound -2.94, ubound 2.94 - H1 was accepted
Killer Move Heuristic
Score of Rustic Alpha 2.1.100 vs Rustic Alpha 2: 628 - 416 - 277 [0.580] 1321
... Rustic Alpha 2.1.100 playing White: 357 - 167 - 137 [0.644] 661
... Rustic Alpha 2.1.100 playing Black: 271 - 249 - 140 [0.517] 660
... White vs Black: 606 - 438 - 277 [0.564] 1321
Elo difference: 56.2 +/- 16.8, LOS: 100.0 %, DrawRatio: 21.0 %
SPRT: llr 2.94 (100.0%), lbound -2.94, ubound 2.94 - H1 was accepted
Principal Variation Search
Score of Rustic Alpha 2.2.100 vs Rustic Alpha 2.1.100: 591 - 388 - 318 [0.578] 1297
... Rustic Alpha 2.2.100 playing White: 334 - 171 - 143 [0.626] 648
... Rustic Alpha 2.2.100 playing Black: 257 - 217 - 175 [0.531] 649
... White vs Black: 551 - 428 - 318 [0.547] 1297
Elo difference: 54.8 +/- 16.6, LOS: 100.0 %, DrawRatio: 24.5 %
SPRT: llr 2.95 (100.3%), lbound -2.94, ubound 2.94 - H1 was accepted
Killer Move heuristic + PVS
Score of Rustic Alpha 2.2.100 vs Rustic Alpha 2: 423 - 218 - 174 [0.626] 815
... Rustic Alpha 2.2.100 playing White: 221 - 106 - 80 [0.641] 407
... Rustic Alpha 2.2.100 playing Black: 202 - 112 - 94 [0.610] 408
... White vs Black: 333 - 308 - 174 [0.515] 815
Elo difference: 89.3 +/- 21.7, LOS: 100.0 %, DrawRatio: 21.3 %
SPRT: llr 2.96 (100.4%), lbound -2.94, ubound 2.94 - H1 was accepted
Tapered Evaluation (3.1.112)
Score of Rustic Alpha 3.1.112 vs Rustic Alpha 3.0.0: 105 - 19 - 16 [0.807] 140
... Rustic Alpha 3.1.112 playing White: 51 - 9 - 11 [0.796] 71
... Rustic Alpha 3.1.112 playing Black: 54 - 10 - 5 [0.819] 69
... White vs Black: 61 - 63 - 16 [0.493] 140
Elo difference: 248.7 +/- 67.6, LOS: 100.0 %, DrawRatio: 11.4 %
SPRT: llr 2.97 (101.0%), lbound -2.94, ubound 2.94 - H1 was accepted
Finished match
Refactoring/optimization (3.16.100)
Score of Rustic Alpha 3.15.100 vs Rustic Alpha 3.1.112: 224 - 145 - 160 [0.575] 529
... Rustic Alpha 3.15.100 playing White: 125 - 60 - 80 [0.623] 265
... Rustic Alpha 3.15.100 playing Black: 99 - 85 - 80 [0.527] 264
... White vs Black: 210 - 159 - 160 [0.548] 529
Elo difference: 52.3 +/- 24.9, LOS: 100.0 %, DrawRatio: 30.2 %
SPRT: llr 2.98 (101.2%), lbound -2.94, ubound 2.94 - H1 was accepted
Finished match
Changelog
Series 4
Rustic 4.0.0 (TBA)
Currently in development.
Series Alpha 3
March 28, 2023 - Rustic Alpha 3.0.3
Maintenance upgrades. There is no functional difference to the previous versions. For normal playing and testing, the existing binaries can be used.
- Update About banner layout
- Upgrade 'rand_core' to 0.6.4
- Upgrade 'clap' to 4.1.14
- Upgrade 'crossbeam-channel' to 0.5.7
- Upgrade 'crossbeam-utils' to 0.8.15
June 11, 2022 - Rustic Alpha 3.0.2
Maintenance upgrades. There is no functional difference to the previous versions. For normal playing and testing, the existing binaries can be used.
- Upgrade to Rust Edition 2021
- Upgrade 'rand' to 0.8.5
- Upgrade 'rand_chacha' to 0.3.1
- Upgrade 'if_chain' to 1.0.2
- Upgrade 'clap' to 3.2.8
- Upgrade 'crossbeam-channel' to 0.5.5
- Upgrade 'crossbeam-utils' to 0.8.10 (security fix)
- Upgrade 'rand_core' to 0.6.3 (security fix)
November 6, 2021 - Rustic Alpha 3.0.1
Bugfix upgrade. There is no functional difference to the previous version. For normal playing and testing, the binary of version 3.0.0 can be used.
- Fixed a variable having the wrong type. This caused the "extra" module failing to compile.
June 18, 2021 - Rustic Alpha 3.0.0
- New features:
- Killer Moves
- Principal Variation Search (PVS)
- Changes:
- Switch versioning scheme to SemVer. Versions are going to be in the
form "a.b.c" from now on, with the following meaning:
- Increment a: A new strength-gaining feature was added.
- Increment b: A bug was fixed that gained strength.
- Increment c: A feature was added or a bug was fixed that did not gain stregnth. It is not necessary to test this version for a rating change.
- Switch versioning scheme to SemVer. Versions are going to be in the
form "a.b.c" from now on, with the following meaning:
- Misc:
- Updated crossbeam-channel to version 0.5.1
- A Makefile was added, so Rustic can be built using "GNU Make". When typing "make" (or "gmake" in MacOS), the Makefile will build all Rustic versions for the platform it's being compiled on.
- Re-add showing the size of the TT and number of threads in About.
- Fairly large update of the book on https://rustic-chess.org/.
Series Alpha 2
March 28, 2023 - Rustic Alpha 2.2
Maintenance upgrades. There is no functional difference to the previous versions. For normal playing and testing, the existing binaries can be used. (Unless you run the engine on the command-line and REALLY want to see the settings in the About banner. Then you will need to compile a new version.)
- Update About banner with Hash and Threads settings
- Upgrade 'rand_core' to 0.6.4
- Upgrade 'clap' to 4.1.14
- Upgrade 'crossbeam-channel' to 0.5.7
- Upgrade 'crossbeam-utils' to 0.8.15
June 11, 2022 - Rustic Alpha 2.1
Maintenance upgrade. There is no functional difference to the previous version. For normal playing and testing, the existing binaries can be used.
- Upgrade to Rust Edition 2021
- Upgrade 'rand' to 0.8.5
- Upgrade 'rand_chacha' to 0.3.1
- Upgrade 'if_chain' to 1.0.2
- Upgrade 'clap' to 3.2.8
- Upgrade 'crossbeam-channel' to 0.5.5
- Upgrade 'crossbeam-utils' to 0.8.10 (security fix)
- Upgrade 'rand_core' to 0.6.3 (security fix)
March 17, 2021 - Rustic Alpha 2
- New Features:
- Transposition table for search and perft.
- Ordering on transposition table move.
- Set TT size through --hash option or UCI parameter.
- Improvement:
- Move check extension higher up in the search routine, to prevent quiescence search while in check.
- Changes:
- seldepth: report max ply reached during the search, instead of selective depth at last completed iteration.
- Count all nodes visited, instead of only nodes which generated moves.
- Change random number generator from SmallRng to ChaChaRng for reproducible behavior between platforms/OS's/architectures/versions.
- Cleanup
- Change Root PV handling to remove redundant code.
- Miscellaneous small renames, refactors, and cleanups.
- Add rand_chacha and remove SmallRng number generators.
- Update Rand library to 0.8.3.
Series Alpha 1
March 28, 2023 - Rustic Alpha 1.3
Maintenance upgrades. There is no functional difference to the previous versions. For normal playing and testing, the existing binaries can be used.
- Upgrade 'rand_core' to 0.6.4
- Upgrade 'clap' to 4.1.14
- Upgrade 'crossbeam-channel' to 0.5.7
- Upgrade 'crossbeam-utils' to 0.8.15
June 11, 2021 - Rustic Alpha 1.2
Maintenance upgrades. There is no functional difference to the previous versions. For normal playing and testing, the existing binaries can be used.
- Upgrade to Rust Edition 2021
- Upgrade 'rand' to 0.8.5
- Upgrade 'rand_chacha' to 0.3.1
- Upgrade 'if_chain' to 1.0.2
- Upgrade 'clap' to 3.2.8
- Upgrade 'crossbeam-channel' to 0.5.5
- Upgrade 'crossbeam-utils' to 0.8.10 (security fix)
- Upgrade 'rand_core' to 0.6.3 (security fix)
March 15, 2021 - Rustic Alpha 1.1
This is a bugfix release. Alpha 1 lost all of its games on time forfeit when playing in MoveTime mode (for example, when playing seconds/move).
Bugfixes:
- Do not exceed alotted time in MoveTime mode.
January 24, 2021 - Rustic Alpha 1
This is the initial release. Below are the features included in this version.
- Engine:
- Bitboard board representation
- Magic bitboard move generator
- UCI-protocol
- Search
- Alpha/Beta search
- Quiescence search
- MVV-LVA move ordering
- Check extension
- Evaluation
- Material counting
- Piece-Square Tables
Appendix
The binary system
So what is a bitboard? Basically, a bitboard is the bit-wise representation of a number. To be able to understand this, we'll have to look into how the binary system works. Here's a quick recap.
First, consider the decimal system. It consists of the digits 0-9, so we can count 0, 1, 2, 3... up to and including 9. Most people will just keep counting without thinking about it: 11, 12, 13 ... 19, 20. This is what we are doing every day and we have become so used to it that most of us don't really think about what is really happening. Let's take a closer look.
First we count up to and including nine:
0, 1, 2, 3, 4, 5, 6, 7, 8, 9
So what comes after 9? Most of you will say "ten", which is written as "10"... but there is no digit "10" in the decimal system. So what has happened? In the decimal system, the digits above (0-9) are units. It's called the decimal system because we have 10 digits in it. ("deci" means "ten".) Should we need more than 9 units, we can create new numbers by adding another position to the left of the number:
0 units
1 units
2 units
...
9 units
10 (one set of "ten", zero units)
11 (one set of "ten", one units)
12 (one set of "ten", two units)
...
19 (one set of "ten", nine units)
20 (two sets of "ten", zero units)
98
99
100 (one set of "ten * ten, or hundred", 0 sets of "ten", zero units)
101 (one set of "ten * ten, or hundred", 0 sets of "ten", one unit)
So, if we run out of numbers after 9, what we do is to start on 1 again in the next position and everything after that becomes 0. The binary system, which means "system of two", works in exactly the same way, but now we only use the digits 0 and 1:
0: 0 (zero units)
1: 1 (one unit)
// We ran out of numbers...
2: 10 (one set of "two", zero units)
3: 11 (one set of "two", one unit)
We are out of numbers again...
4: 100 (one set of "two * two", zero sets of "two", zero units)
5: 101 (one set of "two * two", zero sets of "two", one units)
6: 110 (one set of "two * two", one set of "two", zero units)
7: 111 (one set of "two * two", one set of "two", one units)
8: 1000 (one set of "two * two * two" ...)
If you look closely, you can see the pattern of of 2, 2*2, 2*2*2... and you can image that, if we need to add another digit, the pattern just repeats. If we take a 4-bit number "1111", the values of each bit are therefore as follows:
value = 8 4 2 1
bits = 1 1 1 1
number = 8 + 4 + 2 + 1 = 15
Now we can count in the binary system:
decimal bit-wise calculation
8 4 2 1
--------------------------------------------
0 0 0 0 0 0
1 0 0 0 1 1
2 0 0 1 0 2
3 0 0 1 1 2 + 1 = 3
4 0 1 0 0 4
5 0 1 0 1 4 + 1 = 5
6 0 1 1 0 4 + 2 = 6
7 0 1 1 1 4 + 2 + 1 = 7
8 1 0 0 0 8
9 1 0 0 1 8 + 1 = 9
10 1 0 1 0 8 + 2 = 10
11 1 0 1 1 8 + 2 + 1 = 11
12 1 1 0 0 8 + 4 = 12
13 1 1 0 1 8 + 4 + 1 = 13
14 1 1 1 0 8 + 4 + 2 = 14
15 1 1 1 1 8 + 4 + 2 + 1 = 15
// and so on
I'm sure you can now figure out what the decimal number "16" should become in binary representation. If not, you should go back and read this page again; if necessary, find some more information on the internet and practice a bit with this subject matter. Understanding this perfectly is absolutely essential to be able to write a bitboard based chess engine. Counting in the binary system and combinding numbers (or more correctly, sets of bits) with one another, comes up over and over again.
Bitwise operations
Bitwise operations are the bread and butter of a bitboard-based chess engine. This chapter explains how these operations work. A chess engine uses 64-bit integers, but to demonstrate the workings we're going to use an 8-bit integer. This will save writing out long strings of zeroes. For the demonstration we'll use the same number thoughout:
decimal: 36
binary: 0010_0100
In a bit-wise representation as shown above the most significant bit is on the left, just as the most important digit is on the left in the decimal system. This means that the least important bit is on the right. These are normally called the Most Significant Bit (MSB) and Least Significant Bit (LSB).
Note: When talking about bit sets, the Least Significant Bit (LSB) on the right can be called "First bit" or "bit 1", referring to the fact that it is the first bit in the set. It can also be called "0th bit" or "bit 0", because it is at location 0 in the set.
This can be very confusing, so keep this in mind. I try to be consistent and talk about bit location instead of bit number, so in the case of this book, the LSB is called "the bit at location 0", or "bit 0". It has the advantage of not having to think about subtracting 1 from the bit number all the time, to know at which location a bit is.
When reading articles about bitwise operations and manipulations, just make sure you know if the author is talking about bit locations (LSB = bit 0) or bit numbers (LSB = 1st bit / bit 1).
Left and right shift
This operation is also often called LSHIFT and RSHIFT. What it does is shift the bits to the left or the right by a given number of positions. Note that this can cause bits to "drop out" of the integer. If you shift the bits so many positions that the integer is too small to hold them (as in, there are less bits in the integer than what would be required for the shift to work), then they will be lost. Note that you can't get them back again by shifting in the opposite direction.
Note: there is also a "wrapping shift", where a bit that would normally drop off of the integer on the left doesn't get lost, but appears again on the right. It also works the other way around of course. This is not discussed here, because it is only used rarely in chess engine. I mention it so you know it exists when you encounter it.
fn main() {
let mut number: u8 = 36;
println!("Starting situation.");
println!("decimal: {number}");
println!("binary: {number:08b}\n");
println!("Shift left by two bits.");
number = number << 2;
println!("decimal: {number}");
println!("binary: {number:08b}\n");
println!("Shift right by one bit.");
number = number >> 1;
println!("decimal: {number}");
println!("binary: {number:08b}\n");
// Note the alternative, shorter notation of the operator.
// If you want the result to end up in the same variable
// as the one you are shifting, you can combine the operator
// and the assignment. If you do not want the original number
// to change and catch the result in a different variable, you
// will have to use the longer notation shown above.
println!("Shift right by one bit. We're back at the starting number.");
number >>= 1;
println!("decimal: {number}");
println!("binary: {number:08b}\n");
println!("Shift left by 3 bits.\nThe left-most 1 will drop off the number.");
number <<= 3;
println!("decimal: {number}");
println!("binary: {number:08b}\n");
println!("Shift right by 5 bits.\nThe lost bit is not coming back!");
number >>= 5;
println!("decimal: {number}");
println!("binary: {number:08b}\n");
println!("Shift right by 1 bits.\nNow we lost all the bits.");
number >>= 1;
println!("decimal: {number}");
println!("binary: {number:08b}\n");
}
If you compile and run the above code, this is the result:
Starting situation.
decimal: 36
binary: 00100100
Shift left by two bits.
decimal: 144
binary: 10010000
Shift right by one bit.
decimal: 72
binary: 01001000
Shift right by one bit. We're back at the starting number.
decimal: 36
binary: 00100100
Shift left by 3 bits.
The left-most 1 will drop off the number.
decimal: 32
binary: 00100000
Shift right by 5 bits.
The lost bit is not coming back!
decimal: 1
binary: 00000001
Shift right by 1 bits.
Now we lost all the bits.
decimal: 0
binary: 00000000
Bitwise NOT (Invert bits)
This is one of the easier operators to understand. It inverts all the bits in set: a 1 becomes a 0 and vice versa. In Rust the bitwise NOT operator is !, while in many other programming languages it is ~. This is how its used:
fn main() {
let mut number: u8 = 36;
println!("Starting situation.");
println!("decimal: {number}");
println!("binary: {number:08b}\n");
println!("Invert all the bits.");
number = !number;
println!("decimal: {number}");
println!("binary: {number:08b}\n");
println!("Invert them back again.");
number = !number;
println!("decimal: {number}");
println!("binary: {number:08b}\n");
}
And the result of this code is:
Starting situation.
decimal: 36
binary: 00100100
Invert all the bits.
decimal: 219
binary: 11011011
Invert them back again.
decimal: 36
binary: 00100100
That's it for the bitwise NOT operator.
Bitwise OR (Set bits)
The bitwise OR operator checks if either the bit within the bit-set or the bit within the mask is enabled. This operator is represented in Rust by | (the single vertical pipe). It has the following truth table:
bit | OR | mask | result |
---|---|---|---|
1 | | | 1 | 1 |
0 | | | 1 | 1 |
1 | | | 0 | 1 |
0 | | | 0 | 0 |
As you can see, the operator returns 1 if one of the bits is set. It only returns 0 if both bits are not set. This can be used to turn bits in a set from 0 to one.
Note: Bit-sets in a chess engine are mostly 64-bits. If we are setting or checking for example the bit at location 59 in a set, we would use this notation:
number & (1u64 << 59);
What we are effectively doing is creating a huge number by getting a 64-bit representation of the number 1. We then shift this 1, which is the least significant bit (and the only one in the mask) by 59 locations to left. With this notation it is very clear that we just created a mask to target the bit at location 59 in the set. Instead of the shift, we could also have AND-ed with the number "576460752303423488", but in that way it is almost impossible to see what we are doing. This would make your code completely unreadable.
If we wanted to create a mask that targets positions 59 and 30 in a set, we would do this as follows:
number & ((1u64 << 59) | (1u64 << 30));
On the right side of the &, we first create the mask for the bit at location 59, and within this mask, we also set the bit for location 30 using the bitwise OR operator discussed above.
Here is a series of examples on the workings of this operator:
fn main() {
let mut number: u8 = 36;
let mut mask: u8;
println!("Starting situation.");
println!("decimal: {number}");
println!("binary: {number:08b}\n");
println!("Set the bit at location 0");
mask = 1u8 << 0;
number = number | mask;
println!("decimal: {number}");
println!("binary: {number:08b}\n");
// Note the alternative notation of
// the operator, which can be used if
// you want the result to end up in
// the same variable.
println!("Set the bit at location 1");
mask = 1u8 << 1;
number |= mask;
println!("decimal: {number}");
println!("binary: {number:08b}\n");
println!("Set number to 0 to clean up");
number = 0;
println!("decimal: {number}");
println!("binary: {number:08b}\n");
println!("Set bits at location 0 and 7 (first and last bit)");
mask = (1u8 << 0) | (1u8 << 7);
number |= mask;
println!("decimal: {number}");
println!("binary: {number:08b}\n");
}
The output of this code is going to be:
Starting situation.
decimal: 36
binary: 00100100
Set the bit at location 0
decimal: 37
binary: 00100101
Set the bit at location 1
decimal: 39
binary: 00100111
Set number to 0 to clean up.
decimal: 0
binary: 00000000
Set bits at location 0 and 7 (first and last bit)
decimal: 129
binary: 10000001
There is nothing more to it. If you have a bitset and then bitwise OR this with a mask where one or more bits are enabled, then those bits will be enabled in the bitset as well (or stay enabled if they where already 1).
Bitwise AND (Get bits)
It often happens that you want to know if a certain bit is set or not. This way you can determine if a piece is on a certain square or if a square is empty or not. This table shows the results of combining a bit with a mask using the AND-operator:
bit | AND | mask | result |
---|---|---|---|
1 | & | 1 | 1 |
0 | & | 1 | 0 |
1 | & | 0 | 0 |
0 | & | 0 | 0 |
A set bit will return 1 if AND-ed with a 1; if it is not set, or AND-ed with a 0, it will return 0.
Let's use our trusty number 36 again, and explore the workings of the bitwise AND-operator some more.
fn main() {
let number: u8 = 36;
let mut mask: u8;
let mut result: bool;
println!("Starting situation.");
println!("decimal: {number}");
println!("binary: {number:08b}\n");
println!("Let's see if the bit at location 0 is set.");
mask = 1u8 << 0;
result = (number & mask) != 0;
println!("Bit at location 0 is set: {result}\n");
println!("Now check the bit at location 2 (third bit from the right).");
mask = 1u8 << 2;
result = (number & mask) != 0;
println!("Bit at location 2 is set: {result}\n");
println!("What about BOTH bits at location 2 and 5?");
mask = (1u8 << 2) | (1u8 << 5);
result = (number & mask) == mask;
println!("Both bits at location 2 and 5 are set: {result}\n");
println!("What about BOTH bits at location 2 and 6?");
mask = (1u8 << 2) | (1u8 << 6);
result = (number & mask) == mask;
println!("Both bits at location 2 and 6 are set: {result}\n");
println!("Is AT LEAST one of the bits at location 2 or 6 set?");
mask = (1u8 << 2) | (1u8 << 6);
result = (number & mask) != 0;
println!("At least one of the bits at location 2 or 6 is set: {result}\n");
}
The output of this code will be:
Starting situation.
decimal: 36
binary: 00100100
Let's see if the bit at location 0 is set.
Bit at location 0 is set: false
Now check the bit at location 2 (third bit from the right).
Bit at location 2 is set: true
What about BOTH bits at location 2 and 5?
Both bits at location 2 and 5 are set: true
What about BOTH bits at location 2 and 6?
Both bits at location 2 and 6 are set: false
Is AT LEAST one of the bits at location 2 or 6 set?
At least one of the bits at location 2 or 6 is set: true
Bitwise XOR (Toggle bit)
The Exclusive OR operator, abbreviated as XOR, can be used to toggle bits back and forth between disabled and enabled. This operator is represented by ^ and it has the following truth-table:
bit | XOR | mask | result |
---|---|---|---|
1 | ^ | 1 | 0 |
0 | ^ | 1 | 1 |
1 | ^ | 0 | 1 |
0 | ^ | 0 | 0 |
As you can see this operator returns 1 if either the bitset or the mask have a 1. If both have a 1 or both have 0, the operator returns 0. Therefore a bit can be toggled from 0 to 1 and from 1 to 0 by the same mask, as demonstrated by the code below:
fn main() {
let mut number: u8 = 36;
let mut mask: u8;
println!("Starting situation.");
println!("decimal: {number}");
println!("binary: {number:08b}\n");
println!("Toggle the bit in location 2 from 1 to 0");
mask = 1u8 << 2;
number = number ^ mask;
println!("decimal: {number}");
println!("binary: {number:08b}\n");
println!("Toggle the bit in location 2 from 0 to 1");
mask = 1u8 << 2;
number = number ^ mask;
println!("decimal: {number}");
println!("binary: {number:08b}\n");
}
The output of this code is as follows:
Starting situation.
decimal: 36
binary: 00100100
Toggle the bit in location 2 from 1 to 0
decimal: 32
binary: 00100000
Toggle the bit in location 2 from 0 to 1
decimal: 36
binary: 00100100
Unset bit
Sometimes bits in a set need to be disabled. To accomplish this, two operators must be combined: the bitwise NOT (!) and bitwise AND (&) operator. This operation is somewhat more difficult to understand than the others, so it requires a bit of explanation. Let's say we have a set of four bits, with the LSB and MSB enabled:
bitset = 1001
Now we want to disable the most significant bit (the left-most one), so we create a mask that targets this bit:
mask = 1000
However, there is no "unset bit" operator. Let's run through some of the available operators. If necessary, refer to the truth tables discussed earlier in this chapter.
We can't use AND, because the left-most bit is enabled in both the bitset and the mask. In addition, the least significant bit will be disabled:
1001
1000
----- &
1000
Oops. We disabled the wrong bit.
We also can't use NOT, because this will just invert all the bits:
!(1001) = 0110
That is not what we want. Both the MSB and LSB are disabled, and the two bits in the middle are now enabled.
Let's see about the OR operator:
1001
1000
----- |
1001
Meh. Nothing happened, so this can't be right.
XOR then?
1001
1000
----- ^
0001
Hey, this worked! I can hear you thinking: "So why would we need to do weird stuff? XOR works fine. If a bit is set, it will disabled it." Yes, true enough... but now consider that we want to unset a bit regardless if it was set or not. Let's take our bitset 1001, but now we want to unset the bit in location 2, even if it is unset already. Our mask will be be 0100, which targets the bit in location 2.
1001
0100
---- ^
1101
Oops. The bit in location 2 was already disabled, and XOR flipped it to enabled. XOR can only be used to disable a bit if we know it is enabled. So how do we disable a bit, even it is disabled already? We combine bitwise AND and bitwise NOT. First, we review what happens if we are disabling a bit that is enabled at the start:
bitset = 1001
mask = 1000 (disables MSB)
1001
!1000
------ &
?
Flip the mask because of bitwise NOT:
1001
0111
----- &
?
And now we can solve the AND:
1001
0111
----- &
0001
As you can see, this works. The MSB is disabled, while the LSB was left untouched. Now let's try to disable an already disabled bit:
bitset = 1001
mask = 0100 (disables bit in location 2)
Note that the bit in location 2 is already disabled.
1001
!0100
------ &
?
Flip the mask because of bitwise NOT:
1001
1011
----- &
?
And now we can solve the AND:
1001
1011
----- &
1001
The bitset has not changed, which is correct: we created a mask to disable the bit in location 2. This bit was already disabled and after this operation it was still disabled. The other bits were not touched, so the bitset stayed unchanged. Thus the "unset bit" operation were a mask disables a bit in a bitset can be summarized as:
x = bitset & (!bitmask)
Where "x" will contain the new bitset.
or shorter, if you want to use the same variable for the result:
bitset &= !bitmask
This works because the bitwise NOT operator takes precedence
over bitwise AND, so it will be executed first.
Conclusion
There you have it: an explanation of the bitwise operations that are used within a bitboard-based chess engine. Make sure you understand this material perfectly, because without it you will struggle to understand what a bitboard-based chess engine is all about.
For the extremely curious and brave, you could look into the mathematical field this topic belongs to: Boolean algebra. You are warned before embarking on that journey though: There be monsters, and insanity is around every corner... (which is to say that this topic can become extremely complex.)
Ramblings
Welcome to the new Ramblings section. In here you will find some thoughts on subjects that have to do with the development of Rustic and chess programming in general, but which are not part of the main book. If there is something mildly interesting to share about Rustic's development which cannot be fitted in one of the main chapters then you'll find it here.
2023-03-14 - Picking up again
2023-03-15 - A new computer
Picking up again
You will probably have noticed that there hasn't been a new major version of Rustic since June 2021 and may be wondering if the engine and this book are still in active development. I can assure you that the engine is still being developed. What is more: I've been maintaining all current Rustic versions to compile without errors or warnings using the latest version of Rust, and the latest versions of the libraries it depends on. Older versions will not receive new functionality, but they will receive bugfixes or changes to the code if a library or language update requires this.
I think that this is important to maintain the older versions, because while there are lots of open source engines, many of them are either buggy, unstable, or unable to compile with newer compilers and libraries. This makes testing very hard for newcomers to the chess programming community. It isn't fun having to fix 3, 5, 10 or 15 year old code first before it can be compiled correctly. By maintaining older Rustic versions and preventing them from code-rotting, I hope that there will at least be some engines in the lower rating regions to test against that can be compiled without issue if necessary.
So why do work on maintaining old versions instead of releasing version 4?
The reason is simple: life got in the way. Working on, and testing a new version takes much more work and energy than maintaining the existing versions, and refactoring the upcoming version to improve the code.
At work we had a massive rewrite of a part of our software due to country-wide rule and procedure changes made by the Dutch government. This had to be completely implemented in the second part of 2021 to obtain the minimum viable product. Most of 2022 was used to debug and fix what was written, and implement all the needed additional functionality. That took lots of work and testing and some overwork as well. It left less time for recreational activities. I chose to spend my free time away from the computer after programming for work for 8-10+ hours a day.
In the second half of 2022 me and my girlfriend moved from our appartment to a house in a different part of the city. Obviously that is also not something you'll do between lunch and dinner... for a big chunk of this time I didn't have my computer available and only used my backup laptop.
In addition to maintaining the older versions I also slowly added the still missing functionality to finish support for the XBoard protocol. After this is done, only the tuner needs to be written to complete version 4.
So, is it the end of Rustic? Au contraire, mon Capitan. Heee's baaack!
A new computer
In 2023 my current computer will be 7 years old. It was built in the second half of 2016 and it has four cores and 32 GB of RAM. To be honest, for all of my day-to-day use this system would still be sufficient. I do not play many games: The Witcher 3 is still the most demanding game I own (if you don't count the next-gen update), and it runs fine on this system. If it hadn't been for chess programming, I wouldn't have considered an update for another 1-3 years.
However, I did get into chess programming, and the amount of required testing and tuning is slowly increasing. I have to test each version of Rustic in self-play, but also run the new version in a gauntlet against other engines. Because I wish to run at least 1000 games for each engine Rustic plays against, this gauntlet takes around 8 hours on this computer. It can run 4 games at once (one per core, as I prefer not to use hyperthreading with such a low core count), and it clearly isn't enough. I don't want to wait for 8 hours plus the self-play test to see the result of a change. Also, I don't (yet) want to be dependent on something like OpenBench.
Therefore I decided to build a new workstation. At first, the plan was to build this computer shortly after the release of Debian 12 (I switched to Debian Linux permanently in 2021), which would be somewhere in July or August 2023. The computer hardware world seems to have turned crazy, at least in the Netherlands. From one day to the next, prices on motherboards can fluctuate by as much as €150 up or down, even though the GPU's are a bit more stable since the last two years. Parts can go from available everywhere on Tuesday, to completely unavailable on Wednesday. I've decided to build the computer earlier and run Debian 12 Bookworm Testing on it right now and have it roll into Stable when it releases. So I've been buying parts the last few weeks, when they were available and the prices were good.
For the first time in 20 years I've decided to do an all-AMD build instead of the usual Intel / nVidia I go for. The reasons are:
- As of the time of writing, AMD graphics cards have better open source support than nVidia cards (driver is in the Linux kernel, and they support the Wayland protocol better).
- Intel only has CPU's with 8 Performance cores. The rest are Efficiency cores, which are not useful when testing chess engines. In contrast, AMD has CPU's that have 6-16 normal (Performance) cores. (They also have CPU's with more cores, as has Intel, but those are outside my budget.)
- AMD CPU's can be limited with regard to power draw. Reviews and tests indicate that limiting the power draw of an AMD CPU will cost less than 2% performance. Effectively it can reduce power consumption by 40-50% ro 2% performance loss. I'm willing to make that trade.
- AMD has a history of supporting their sockets longer than Intel. If they'd release an AM5 CPU in 5 years that is a real improvement over what I'm buying now, I might just upgrade the CPU only.
After reading many reviews, I decided on getting an AMD 7950X, because it will decrease both my power consumption and testing time compared to my current system. The stats come down to:
- Testing time is down from 8 hours to 2 hours
- Testing time saved is 75%
- 8h x 95W - 2h x 140W = 760 - 240 = 520W saved
- Power draw for a test run is down by 70%
The computer I'm building should easily last me 7-10 years again, maybe with a CPU-upgrade in between. If I'm still developing Rustic 10 years from now (which I doubt, because I'll probably have declared it 'finished' before that) I'll see what needs to be done.
On idiomatic Rust
I'd like to say a few words about writing idiomatic code. Every programming language has its own way of doing things. Some programming patterns occur over and over again and are considered the 'correct way.' This is called "idiomatic code", and most programmers try to follow the idioms of the language they are writing in.
So it is with Rustic. When I started this engine, I didn't know a lot about Rust, but I was well versed in C. Therefore if you look at early versions of Rustic, you can see many things of which more experienced Rust programmers would say: "This is not idiomatic Rust; this looks like C." They would be correct.
Over time, Rustic became more idiomatic: more and more C-like constructs where removed and replaced by Rust features. Consider for example the printing of a struct's values to the screen in a pretty way, or converting such a struct into a string, so it can be appended to the end of another string. To achieve this, I implemented ".print()" and ".to_string()" for some structs. Sometimes there were even completely seperate functions for printing, carried over from the very first versions of the engine.
The idiomatic way to do this is to implement the trait "Display". This has the advantage that the struct can be printed, but it also automatically gets you the ".to_string()" function for free.
For Rustic 4, a big push has been made to make the engine as idiomatic as possible without compromising its speed, readability or maintainability. I can hear you saying: "Make it as idiomatic as possible? Why not make everything idiomatic, and why would that harm the engine's speed, readability and maintainability?" Allow me to try and explain with an example.
Rust is statically typed and has great type saftey. This means that you cannot assign a value of type X to a variable that is of type Y, without casting from X to Y first. Rustic tries to take advantage of this as much as possible. However, in a chess program, both pieces and squares are often (but not always) represented by integers. This means that these can be swapped around when calling a function. They can even be swapped if you use a type alias, because a type alias is just that: another name for the same thing. The compiler does not make any difference between two aliases such as Square or Piece which are the same thing underneath.
This can be dangerous when refactoring, because you could potentially put a "Piece" variable in a spot where a "Square" variable should have been, which will break the engine. It would be best to create new types to counteract this problem.
The proper way to create new types in Rust is the "NewType" pattern, where types are created by wrapping simple types into a struct or enum. It is possible to do this with squares and pieces. However, it will later turn out that bitwise operations need to be done on these values, constantly, in many different places. Because we wrapped the integer into a struct or an enum, we would need to implement all the bitwise operators for these new types, so they can be used as if they were integeres. What we are basically doing is re-implementing one of the integer types from scratch.
This makes the code very verbose, hard to understand and harder to maintain, because it is an extra struct/enum layer on top of the integer type we need. Therefore I did not do any of these wrappings to keep all the integer types seperated and chose to just keep my eyes open and make sure I don't swap (for example) squares and pieces.
In the future, tests will be added to the engine so that, after a refactoring effort, the every function can be tested automatically. If there are problems they can then be pinpointed on a function-by-function basis.
Another place where you will encounter an "old-fashioned" way of doing things is when Rustic iterates over the move list. It uses a for-loop, instead of an iterator. The reason is that the this list is an array backed with a counter, so it can be stored on the stack. (In case of a Vec, it would be stored on the heap, which is slower.) When I implemeted Iterator for MoveList the engine's speed dropped by 10% because of the Some/None check that comes with an iterator. Therefore I just use a for-loop.
I'm always open for suggestions to improve the engine in such a way that it adheres more to idiomatic Rust, as long as it doesn't slow down the execution speed, doesn't hurt readability, or maintainability. Please take into account that, because of these requirements, Rustic sometimes doesn't follow idiomatic Rust by choice.
Building Rustic
As Rustic is open source, anyone can build their own executable or fork the engine at any point in the development tree, to use it as a basis for their own chess engine. To be able to create the executable, you need a proper build environment.
Even though Rustic was developed on Windows, the GNU target is the default instead of MSVC. As a shell, MSYS2's BASH version was used with all the default Unix command-line tools available. Therefore it should be possible to build Rustic in any Unix-like environment for which Rust is available, and under MSYS2 in Windows.
Rustic has a Makefile available since Alpha 3.0.0, which creates all the executable versions for the operating system you are building on. At the end of the page some hints on alternatives are included.
Installing the environment
Under most Unix-like operating systems such as Linux, the build environment is already set up by default, especially if a BASH-compatible shell is being used. To build software in Unix-style, Windows needs MSYS2 installed. MacOS needs an updated version of Make.
Windows
First, download the Rust installer for Windows:
Run through the installation. Make sure you select the GNU target as the default. This is the target used to develop and test Rustic.
Second, download and install MSYS2. Installing GCC and Clang is not strictly necessary, but if you are compiling Rustic yourself there's some chance you may want to build other chess engines as well. As most engines are written in C or C++, it's very useful to have GCC and Clang installed for compiling those engines.
Download the installer: https://www.msys2.org/#installation
Run the setup.
Exit any console that may pop up after install.
Start the MSYS2 - MSYS terminal (for maintenance)
Right-click the title bar and select Options...
Set up the terminal preferences how you want them.
Run "pacman -Syu <enter>"
Close the terminal when asked.
Keep repeating the previous two steps until pacman says:
"Nothing more to do".
Start MSYS2 - MinGW64 (for 64-bit compiles)
Run these commands to install GCC, Clang, and Make:
pacman -S mingw64/mingw-w64-x86_64-gcc
mingw64/mingw-w64-x86_64-clang
pacman -S mingw64/mingw-w64-x86_64-make
ln -s /mingw64/bin/mingw32-make.exe /mingw64/bin/make.exe
Browse to the folder where you installed MSYS2.
In my case: "C:\Programs\MSYS2"
Open "msys2_shell.cmd" in your favorite text editor.
Uncomment the line: "set MSYS2_PATH_TYPE=inherit" and save.
Close and restart any MSYS2 shell that may have been open.
If you want to build a 32-bit version fo Rustic for some reason, you can follow the above steps, but for the "mingw32" shell instead of "mingw64". You'll obviously need to install the "mingw32", and 32-bit versions of the above packages, in the mingw32 MSYS2 shell.
Linux (and probably most other Unix-like systems)
Linux does not need a lot of setup. It is recommended you install the latest Rust-version from their website, as the ones in the distribution's repository can sometimes be several versions behind. You can find the installation instructions here:
https://www.rust-lang.org/tools/install
In Linux the rest of the build requirements (Bash, make, strip) are already installed by default. You can test this using the following commands:
$ bash --version
$ make --version
$ strip --version
Install these if they aren't already. As there are so many Linux distributions, it is out of the scope of this page to describe the installation for these tools.
MacOS
Just like Linux, MacOS doesn't need much setup because it is a Unix-like operating system. First, Install Rust for MacOS. MacOS has a version of GNU Make installed by default, but this version is too old to be used for Rustic's Makefile. It is recommended to install Homebrew, and then use this to install the latest GNU Make version as "gmake."
- Install Homebrew
- Install GNU Make
brew install make
- After installing GNU Make, check the version:
gmake --version
You can build Rustic by typing "gmake" and pressing Enter. If this doesn't work for some reason, check the "If make doesn't work" section below.
Compiling and building
If the build environment has been installed correctly, it should be easy to compile and build the engine. There are two ways of getting the source code onto your system: git and download/unzip.
It is beyond the scope of this page to fully describe how Git has to be installed and configured. There are many tutorials around the internet providing this information. If you are using Git, you can clone the engine in a directory on your system:
https://github.com/mvanthoor/rustic.git
If you don't use Git, you can also download the code and unzip the archive. Go to Rustic's GitHub Page and switch to the branch or tag you want to compile. Then download the zip-file.
After you have the engine in a directory on your system, switch to that directory in a terminal. (Go to the root directory, not inside the /src directory.) Now you should be able to type "make (enter)" on Windows or Linux, or "gmake" on MacOS. If all goes well, the Makefile will create a ./bin/ directory, with the operating system as a subdirectory, and put all the binaries in there.
If "make" doesn't work
If "make" / "gmake" don't work for whatever reason, you can also compile and build the engine manually. This is the same for each operating system (except for a small difference with the "strip" command on MacOS.).
In the terminal (MSYS2 on Windows, Terminal on Linux and MacOS), first export the correct environment variable to make Rust compile the engine for your CPU. From fastest to slowest:
native: export RUSTFLAGS = -C target-cpu=native
bmi2: export RUSTFLAGS = -C target-cpu=haswell
popcnt: export RUSTFLAGS = -C target-cpu=nehalem
old: export RUSTFLAGS = -C target-cpu=core2
ancient: export RUSTFLAGS = -C target-cpu=athlon64
generic: don't export anything
Obviously you can test multiple versions to see which one runs the fastest on your system. There may be some that don't run at all. The BMI2 version will not run on a Core2Duo CPU, for example. The BMI2 version will run on AMD Zen2 CPU's, but the popcnt version will be faster.
Compile and build the engine:
cargo build --release
Strip symbols from the executable:
Windows:
strip -s ./target/release/rustic.exe
Linux:
strip -s ./target/release/rustic
MacOS:
strip ./target/release/rustic
Copy and paste the executable from the ./target/release directory to a different location, and give it an appropriate name, so you know what version you built. My recommendation is:
rustic-($version)-($os)-($bits)-($cpu_type)($extension)
An example would be:
rustic-alpha-3.0.0-windows-64-bit-bmi2.exe
(In Linux or MacOS, you can leave the .exe extension out.) This naming convention is also Rustic's default when it's built by the Makefile.
On 32-bit versions
Because a (standard) chess board has 64 squares, the engine uses 64-bit integers for most of its calculations, especially for move generation. Even though Rustic can be built and run as a 32-bit engine, it will lose about half of its speed, and will be 50-70 Elo weaker. It is not recommended to use such a version for tournaments and matches. The only operating system where the 32-bit version is the default is Raspberry Pi OS, because this operating system is 32-bit. The 64-bit version of Raspberry Pi OS is still experimental at the time of writing.
Alternatives
On Windows, you can use the Rust MSVC target, but if you do, you will also need to install the C++ Build Tools, because this target needs the Microsoft Linker that comes with these tools. (Warning: the C++ build tools have a massive installation size.)
It is probably also possible to build Rustic on WSL2 (Windows Subsystem for Linux 2) and on Cygwin, by installing the respective versions of Rust and GNU tools there before either running "make" or "cargo build --release".
If built on Cygwin, the Rustic executable will depend on Cygwin1.dll. As this is an additional layer between the engine and operating system, this will lose some speed and thus playing strength.
I've not yet tried these ways of building the engine, so I have no data with regard to the performance or stability of these binaries. Cygwin will probably never be tried because of the dependency on Cygwin1.dll
Frequently asked questions
Why Rust? What's wrong with C and C++?
Nothing... and a lot, both at the same time. To be honest, I like C and C++; especially C, because it is such a small language, very easy to learn, and it basically runs on every computer platform ever created. C gives me the same sort of thrill as I get from handling a live piece of dynamite. No, really. I love C to bits. (I should stop making word jokes, I know.) If Rust hadn't existed, my chess engine may have been written in C and in that case would probably have had a different name.
These languages are the 800 pound gorilla's in the room with regard to systems programming, embedded software engineering, and also, chess programming. The reason is that these languages compile down to native machine code which runs at lightning speed, so they can be used to obtain maximum performance. Also, these languages have also been around for a long time. They are well known, and a lot of open source code is floating around the internet, ready to be used. While all of this can be an advantage, there are also some drawbacks.
Both C and C++ are inherently unsafe, because the programmer is allowed to do anything he or she wants. These languages are fast, but also dangerous. Even though C is easy to learn, it has a lot of pitfalls one should be careful about. Managing memory and resources correctly is hard, and has to be exactly right. If they aren't, it's easy to create memory leaks, subtle and really hard to find bugs, race conditions and deadlocks when programming multi-threaded. You could also crash the program.
There are other problems I could list, but most of them come down to the fact that C, as of 2020, was designed 47 years ago. Think about that. In the world of computers, 47 years ago is akin to prehistory. Even though newer versions of C and C++ are adding more modern features, they can never escape their (unsafe) roots completely.
One of the main reasons to avoid C or C++ when writing a chess engine specifically, isn't technical at all. In my case, it's personal. Everybody has a chess engine written in C or C++; using a different language is more the exception than the rule. I want mine to be written in a different language.
Preferably, that would be a language as fast as C or C++, but avoiding most or even all the pitfalls of those languages. I've looked at different options such as C#, Go, Ada, and others. In the end I chose Rust. Even though Rust itself is not a perfect language and it can't solve all the problems (it's still possible to create subtle bugs or crash the program), it goes a long way in making memory management and threading a lot easier.
As fast as C, but safe? I wanna know!
In short, the Rust programming language can be summed up as such:
- Statically typed with type inference. If a variable has a certain type, the compiler will enforce that type. If you use this type in combination with functions and other types, the compiler will often know by inference what the type of other variables should be, and enforces that too.
- Safe. The compiler statically checks the safety of the code, so you can concentrate on writing the software, instead of having to concentrate on safe coding. This component is called the "borrow-checker."
- Because of point 1, the following happens:
- No uninitialized variables... ever.
- No memory leaks.
- No race conditions.
- Multi-threading becomes fairly easy!
- No dependency problems: if you know and like any package manager (apt-get, yum, npm, pip), you'll love cargo. (However, if you hate all package managers, you'll hate this one too.)
- Truly cross platform: one code-base compiles and runs everywhere. You only need conditional compiles if you use specific features of a specific language.
- Easy building of the software. If you have Rust installed, the only thing you have to do is "cargo build --release", and cargo will download all the dependencies, compile them, and create the executable. If your Rust version is not too old, it will work.
- Completely modularized: no header files.
- Zero-cost abstractions.
- Great tooling (works with C/C++ debuggers and profilers).
- Last but not least: Speed.
- And many other features that make it a nice language to use.
What about this site?
Rustic-chess.org is Rustic's home on the internet. This site will document the entire development of Rustic, starting from the very beginning, and its progress after each new feature is added. As such, it will be possible to write one's own chess engine by following Rustic's development steps.
It looks weird. What software does it use?
Yeah. This might come as a shock, but it's not Wordpress. This site is written using MD Book. This is a command line utility written in Rust. It creates websites as books out of Markdown files. It's open source, written by and for the Rust language community, and it's used to create the Rust language documentation. I am using it because it allows me to write documentation which is then converted into a static website, so I don't have to maintain or update a content management system. Rustic-chess.org is purely informative. It is going to contain a lot of information, so it's priority is to be easy to navigate, very readable, and easy to maintain. The less work needs to be done to maintain it, the more I can either develop Rustic, or write its documentation.
Hasn't this been done before?
Of course, this has been done before. There are some well known tutorial engines around. Most of them do have some issues, for example:
- The engines and tutorials are getting older. The source code may be old enough to not compile cleanly using a current-day compiler.
- These engines mainly use older techniques, often for simplicity's sake. Those techniques work well, but because of speed limitations, the engine will always be at a disadvantage to an engine using more modern techniques if everything else is equal.
- Even though the engines may be available with (some) documentation in the source code, in some cases the accompanying website has been gone for a long time.
- The code isn't always as readable as it could have been.
- Most if not all engines are written in C or C++, and many of these engines use particular features (or even loopholes!) in those languages to get certain things done. That is not good programming practice.
Even if this has been done before, I still get fun out of doing it myself, in exactly the way I want to be done and in a programming language of my choice. Even though I'll be standing on the shoulders of giants that have shared information before me, this project is my own. Having my own chess engine makes it worthwhile to me. I hope, in the future, other people starting out to write a chess engine will find the information contained in this site useful.
About me
Hi again,
(Testing commit verification on Linux)
Right before the end of all things, I'll drop in a short scribble about myself. I'm sure you're just dying to read a rambling piece of text that's all about me, myself and I.
Oh. Right. Just let's forge on anyway :)
My name is Marcel, as you may already know from the preface. I'll mention it because you probably skipped the preface like everyone else. (Guilty as charged... I thought about not even writing it.) I hold a bachelor degree of applied computer science from Zuyd University of Applied Sciences, with specialization in software engineering and writing embedded software. My main occupation at the moment is writing backends for business software.
Even though it would be easy to spend quite some time talking about my other hobbies, such as reading, martial arts (I hold a black belt in Hapkido), photography, music and others, I'm sure the main stuff you're interested in is about chess and computers. Therefore I'll continue with that before you quit reading altogether.
I've been playing chess off and on since I was about 10 years old and I also got to work with computers since that time. Ever since I got my first chess computer and later, chess program on a PC, I wanted to write my own chess-playing software. At around 1994/1995, when I was just a teenager who had been jacking around with computers for a few years, I tried to do so in Turbo/Borland Pascal, using the only two useful books from the library: Borland Pascal 7.0 by Jeff Duntemann, and a book about chess computers I don't really remember.
This chess program partially succeeded. The program just had a command-line interface, so you had to put in your moves on the keyboard, just like the chess computers from the 70's. You also had to have own board present to see the actual position. It played legal chess (most of the time), but it was easy to crash and easy to defeat. Not blundering any pieces or mate was enough to win.
After I finished high-school and went to study computer science, playing chess as a hobby basically dropped by the wayside for quite a number of years, but I still wanted to write a chess engine "some day." It took me nearly 20 years to get around to it, because there were always more important things to do: study, work, moving, a girlfriend, sports... well, you know. There's no human alive who has never postponed anything. Also, I didn't want to write a chess engine in C or C++, because everybody else already had one in either one (or both!) of those languages.
Then at some point I heard about Rust as an alternative to C and C++. I started following the language, and serious thoughts about finally writing that chess engine cropped up again when it reached version 1.0 in 2015. While I was intrigued with the premise of this language and tinkered around with it, I found it somewhat cumbersome to use. The authors promised that this would get better. Three years later, Edition 2018 was released, and it did get better, by a lot. So in the summer of 2019, after my last move and my girlfriend having moved in with me, I decided to start writing Rustic.
Here it is, and this book/tutorial you are reading is its documentation. I hope you enjoy using this engine, or reading the book and the comments in the code, trying to learn how to write your own.
Kind regards, Marcel
Further study
Parts of this book are based on explanations given in other resources. As said on the About this book page, it is impossible for me to list every resource or webpage I ever visited while creating Rustic, but I can list materials I've found to be particularly useful. Some of that material may even have been created after Rustic's first version was finished. Often it was still useful to me, because it approached some topics at a different angle than I did myself.
Video tutorials
- Programming a chess engine in C (Richard Allbert / Bluefever
Software)
- Bitboard chess engine in C (Maksim
Korzh)
- Chess engine in Java (Software Architecture &
Design)
- Java chess engine tutorial (Logic Crazy)
Websites
- Chess Programming Wiki
- Magic Bitboards (Rhys-Rustad
Eliott)
- ICS 180, Strategy and Board game programming
(Eppstein)
Credits
There is one credit that doesn't have to do a lot with chess programming, but it can't be forgotten. Now I finally understand why book writers always have dedications to their spouses/children/families in their books.
- My girlfriend. Even though she is not a programmer, nor particularly interested in computers, I could not have written this chess engine without her. Just the fact that she's around when times are not the best, would be enough in itself, but she takes care of so much stuff in and around the house that without her, I wouldn't have even had the time to consider a project like this. In addition, she also helped in the development of the engine by listening to me blabbering about it, and then saying things like: "Maybe I'm asking a stupid question, but...",which helped me to avoid some stupid mistake more than once.
Many people on the Talkchess.com forum have provided insights and assistance that greatly helped in the development of this chess engine. Using their experience and information that has collected there over the years (and was posted as replies to my questions), I was able to either avoid many bugs, wrong implementations, or to improve existing code to work better, faster or cleaner. Thanks to all of these people. Below is a list of people who I particularly remember for one or more contributions (in no particular order).
- Richard Allbert (better known as BlueFever Software, author of VICE): his "Programming a Chess Engine in C" (or the Javascript version) series are legendary in the chess programming community. Richard may have instructed an entirely new generation of chess programmers.
- Terje Kirstihagen (author of Weiss): for the friendly Perft speed competition between an early version of Rustic and Weiss, to optimize make(), unmake(), and the move generator speed. And for his encouragement to keep going, by posting in my "Progress on Rustic" topic.
- Fabian von der Warth (author of FabChess): for giving helpful information about the Rust programming language, which were very useful in speeding up the early versions of the engine.
- H.G. Müller (author of FairyMax, MicroMax, and others): especially for explaining some improvements with regard to TT usage, and general assistance by patiently answering questions he must have seen countless times over the years.
- Maksim Korzh (author of Wukong and BBC): for writing a great video series in the same vein as the one written by Richard Allbert. While the series covers known ground, the perspective can be just from a different enough angle to make something 'click'.
- Rasmus Althoff (author of CT800): for assisting in optimizing Rustic's position repetition detection, and getting rid of some unneeded stuff in the alpha-beta and QSearch functions (and for providing many tidbits of useful information).
- Martin Sedlák (author of Cheng), and Eric Madsen (MadChess): Thanks for the pointers that put me on the right track to find out why TT Move sorting wasn't performing as expected.
- Ronald de Man (author of the Syzygy tablebases, CFish and RustFish): for helping to untangle the mess within my head regarding Principal Variation Search.
- Taimo (author of Monchester): for pointing out a potential variable underflow problem within the time management, that made Rustic crash in debug-mode.
- Sven Schüle (author of Jumbo, KnockOut, Surprise) for pointing out some lines of redundant, and thus confusing code in Rustic's search and qsearch functions.
- Thomas (Lithander, author of MinimalChess): for the engaging discussions regarding (chess) programming, and providing another stable engine with compiles especially for me to test against.
- Ed Schröder (author of Rebel and Gideon) and Robert Hyatt (author of Cray Blitz and Crafty): for still hanging around chess forums, answering questions, even after writing chess engines for 40 or 50 years.
I always wanted to be "one of them": the programmers who could write chess engines. I have always wanted to have my own, but there was always "something" that got in the way of writing it. Now it's done: even though it is not very strong yet, I wrote my own chess engine from scratch. It was a long process and lots of work, but thanks to the help of the chess programming community, it went relatively smoothly. Thanks guys.