Crafting the Chessiverse Bots

The goal of the Chessiverse bots is simple: they should play as humanly as possible. In this article we'll look at some of the key ways we're trying to achieve this, and where the challenges are.
Some Background
A chess engine consists mainly of two parts: a search and an evaluation.
The search explores lines down to some depth. There are many different ways of making the search more intelligent, like iterative deepening, transposition tables, null moves and so on. In general the search is meant to visit as many relevant positions as possible, in as short a time as possible. It's all about being efficient, while not missing things. As such, there is little "personality" in the search; it's more about raw strength. Lowering the search depth will make the engine weaker, but with unpredictable results.
When the search finishes a line, that is, when it reaches a leaf node in the search tree, it evaluates the position as is. Historically this was a big evaluation function that looked at the position and evaluated different parts of it: things like material, king safety, space, mobility and so on. These parts were weighted and combined to give a final centipawn evaluation for the entire position.
The evaluation function contained most of a chess engine's "personality". For example valuing king safety lower, or mobility higher, would make the engine prefer different moves, and give it a certain playstyle.
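To make the weighting idea concrete, here is a minimal sketch of a hand-crafted evaluation. The feature names come from the list above, but the `Features` class, the weight values and the example numbers are all made up for illustration and are not taken from any real engine.

```python
from dataclasses import dataclass


@dataclass
class Features:
    """Hypothetical per-position feature scores in centipawns (White's view)."""
    material: int
    king_safety: int
    space: int
    mobility: int


def evaluate(features: Features, weights: dict[str, float]) -> int:
    """Combine the weighted feature scores into one centipawn evaluation."""
    total = sum(weights[name] * getattr(features, name) for name in weights)
    return round(total)


# Two illustrative "personalities": same features, different weights.
BALANCED = {"material": 1.0, "king_safety": 1.0, "space": 1.0, "mobility": 1.0}
RECKLESS = {"material": 1.0, "king_safety": 0.25, "space": 1.0, "mobility": 1.5}

# An exposed king (-80) in exchange for extra material and active pieces.
position = Features(material=100, king_safety=-80, space=10, mobility=40)

print(evaluate(position, BALANCED))  # 70
print(evaluate(position, RECKLESS))  # 150
```

The same position is scored more than twice as high by the weights that care less about king safety, which is exactly how such an engine ends up preferring riskier moves.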
In recent years, this big evaluation function has been replaced by a neural net. Basically a model trained on millions of positions, learning patterns of what makes a position good or bad so it can 'evaluate' them like a strong player.
Depending on what data goes into the model, it will, like the evaluation function, result in the engine developing a distinct playstyle. The difference is that it's very hard to know exactly what the result will be, and it's not possible to tweak it afterwards. The neural net is pretty much a black box that takes a position and spits out an evaluation.
Dumbing Down and Personalizing
So if you want to make a chess bot play in a certain way, tweaking the evaluation function seems to be the way to go. Adjusting the search depth will make the bot weaker, but in unpredictable ways, whereas adjusting evaluation parameters is safer and more predictable.
This is how chess bots were historically created. Lower the search depth to get a rough strength, and then play around with the evaluation function.
It produced decent but unreliable results. A common complaint was that the bots played like a grandmaster until they suddenly lost a queen for no reason. This could be the effect of lowering the search depth since it's hard to tell the engine what to miss.
The Chessiverse Approach
We have gone with the neural net approach. But rather than trying to make the nets play in a certain way, we let them live their own lives and measure them after the fact instead.
Basically, we throw some data at them and see what comes out.
This might sound like a lottery that is hard to control, and it certainly is. But we don't try to control it. Instead we embrace it.
We use the neural net as the core: it provides both the personality and the strength of the engine. Depending on the outcome, this might be a defensive beginner or a brilliant expert. To us it's all good, and this is the diversity we're looking for.
Taming the Neural Net
The neural nets we're working with are far from perfect. A good way to create a weak neural net is to give it less data. This creates a flawed understanding of the game, with sometimes unexpected or outright strange results.
As I mentioned before, the neural net is a black box. In go positions, out come evaluations. We have no way of adjusting it directly.
Instead, we treat it as a diamond in the rough. It has its personality and strength, but with rough edges. What we do is to wrap this in our own logic, smoothing out the kinks and pushing the bot in directions we want.
The Move Curator
We have different ways of affecting the neural net, without actually working with it directly. Let's look at one example.
We noticed that some of the neural nets we liked unfortunately had a tendency to move their king very early in the game. They completely ignored king safety, and had no problem wandering out with the king in the opening.
While this can be a fun thing to happen once in a while for a weak bot, it does not feel very human.
So, we want to discourage this behaviour. To do this we created a system we call the Move Curator. In short, it lets the neural net generate multiple moves for a position, then runs the moves through a quite elaborate, ever-growing filter that picks up suspicious moves. These moves are then graded and considered by a stronger, proven engine.
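As a rough illustration of that flow (candidates from the net, a filter that flags suspicious moves, a stronger engine that grades the flagged ones), here is a minimal sketch. Everything in it is an assumption made for the example: the `Candidate` fields, the `early_king_move` rule, the ten-move cutoff and the 50-centipawn tolerance are hypothetical, and the "stronger engine" is just a function you pass in.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Candidate:
    uci: str          # move in UCI notation, e.g. "e1e2"
    piece: str        # moving piece: "K", "Q", "R", "B", "N" or "P"
    move_number: int  # full-move number at which the move would be played


Filter = Callable[[Candidate], bool]  # True means "this move looks suspicious"
Grader = Callable[[Candidate], int]   # stand-in for the stronger engine (centipawns)


def early_king_move(c: Candidate) -> bool:
    """Hypothetical filter: flag non-castling king moves in the first ten moves."""
    castling = c.uci in {"e1g1", "e1c1", "e8g8", "e8c8"}
    return c.piece == "K" and c.move_number <= 10 and not castling


def curate(candidates: Sequence[Candidate], filters: Sequence[Filter],
           grade: Grader, max_loss: int = 50) -> Candidate:
    """Return the net's most preferred move that survives curation.

    `candidates` is ordered by the net's own preference. A flagged move is
    kept only if the grader rates it within `max_loss` centipawns of the
    best-graded candidate. If every move is rejected, fall back to the
    net's first choice.
    """
    for c in candidates:
        if not any(f(c) for f in filters):
            return c
        best = max(grade(x) for x in candidates)
        if best - grade(c) <= max_loss:
            return c
    return candidates[0]


# The net prefers walking the king to e2 on move 5; the grader disagrees.
moves = [Candidate("e1e2", "K", 5), Candidate("g1f3", "N", 5)]
engine_scores = {"e1e2": -120, "g1f3": 30}
chosen = curate(moves, [early_king_move], lambda c: engine_scores[c.uci])
print(chosen.uci)  # g1f3
```

Note that a flagged move is not banned outright in this sketch: if the grader thinks the king move is nearly as good as the alternatives, it still gets played, which keeps some of the net's quirks alive.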
In the end, this allows us to promote more human moves, and it gives us the flexibility to push engines certain ways.
The Gramps Pushwick Method
For our Chess Club Challenge, we took the Move Curator to its extreme and strongly discouraged any move that is not a pawn move. We find the result really fun: the final boss, Gramps Pushwick, becomes a maniacal pawn pusher, but you can still see the personality shine through. Not every position has a somewhat sensible pawn move, so you get an interesting blend of furious pawn pushing and more sensible moves.
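One simple way to picture "strongly discouraging" non-pawn moves is as a score penalty. The sketch below is only an illustration of that idea: the 300-centipawn penalty, the move tuples and the scores are invented, and the actual system may work quite differently.

```python
# Each move is (SAN, moving piece, score in centipawns). All values are made up.
Move = tuple[str, str, int]


def pick_pawn_biased(moves: list[Move], penalty: int = 300) -> str:
    """Pick the best move after subtracting `penalty` from every non-pawn move."""
    def biased(move: Move) -> int:
        _san, piece, score = move
        return score if piece == "P" else score - penalty

    return max(moves, key=biased)[0]


# A slightly worse pawn push beats a normal developing move...
print(pick_pawn_biased([("Nf3", "N", 40), ("h4", "P", -60)]))   # h4
# ...but a truly awful pawn move still loses out to a sensible one.
print(pick_pawn_biased([("Nf3", "N", 40), ("g4", "P", -500)]))  # Nf3
```

Because the penalty is finite, positions without a reasonable pawn move fall back to ordinary moves, which gives the blend described above.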

Just One of Our Heuristics
The Move Curator is one of many little things we're doing behind the scenes to shape our bots. As you can imagine, this is a lot of work and it's certainly work in progress, which is something we're proud to admit. The Chessiverse bots will forever evolve and get more human the more we work on them.
PersonaPlay and How we Decide What Style a Bot has
One common misconception is that we choose what type of bot we want, and then tweak some parameters to make it play like that. For example, our PersonaPlay categorization groups bots from Guardian to Savage, that is, from solid to aggressive, depending on their playstyle.
The misconception is that when creating a new bot, we set a parameter to "play solidly" to achieve that. As I've mentioned before, we can't tell the neural net to do this.
We could potentially use our Move Curator system to achieve this, and we might try it in the future, but for now, we're fully relying on the inherent playstyle of the neural net.
This means that when it comes to playstyle, we do very little to influence it. Instead, we measure the output: we look at how the bot plays by going through thousands of its games, and then analyze the results. This was inspired by the new "style report" in Chessbase 18. Basically, we take a set of games and run statistical analysis on them, determining factors like tenacity and temperament.
Another way of putting it is, we didn't tell the bot how to play, we just let it play and observed how it did.
One thing we can do though, is choose what openings a bot will play. And here we have some freedom to create interesting combinations. A Guardian with a super aggressive opening repertoire is a quite interesting combination. This brings us to our next chapter.
Openings and Mimicking Humans Where we Can
Now on to something a bit more tangible. Every single bot on Chessiverse has its own opening repertoire (with one exception that I'll get back to).
We decided to put in the extra effort this takes, to make the bots feel as human as possible. If you play the same human over and over, you'll learn how they play, you'll get a sense for their strengths and weaknesses, and you'll definitely start learning what openings they prefer, and in what style they choose them.
This is what we wanted for the Chessiverse bots. They have their own opening repertoire, crafted from human games at the rating they play at, so they definitely contain a blunder here and there. Some bots stick very true to their repertoire, while some deviate at first chance. Just like humans do.
We do have a specific improvement in mind for how the bots use their repertoires, and that's to be more likely to change up their lines on repeat games, especially if they lost. If you beat a human in a certain opening, they're likely to change it up in the next game. This is something that will be coming soon to the Chessiverse bots. Keep an eye out!
Statistical Openings
I mentioned that there's an exception to every bot having their own repertoire, and that's our newly added statistical opening bots. What we noticed was that our users felt that the openings of the bots were not as varied as they would've liked.
The obvious way of solving this is to simply add more bots, and long term we will be doing that. However, after some calculations we realized that to get good coverage of all openings, coverage that would feel human no matter how many games you played, we would need on the order of thirty to forty thousand new bots (for anyone interested, I can probably dig out those calculations).
For now that was not feasible, so instead we came up with another solution: statistical opening bots. New bots that exactly follow what humans at their rating have done in every position. For example, if humans at this rating have played the French 15% of the time after 1. e4, then the bot will play the French 15% of the time.
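The idea of playing each move as often as humans do amounts to weighted random sampling. Here is a minimal sketch using Python's standard `random.choices`. The 15% for the French (1... e6) comes from the example above; the other percentages, the `BOOK` structure and the position key are invented for illustration.

```python
import random

# Hypothetical reply frequencies (in percent) for one rating band.
# Only the 15 for e6 (the French) is taken from the example in the text.
BOOK = {
    "1. e4": {"e5": 45, "c5": 30, "e6": 15, "c6": 10},
}


def pick_reply(position_key: str, book: dict[str, dict[str, int]],
               rng: random.Random) -> str:
    """Sample a reply with probability proportional to how often humans played it."""
    replies = book[position_key]
    moves = list(replies)
    weights = list(replies.values())
    return rng.choices(moves, weights=weights, k=1)[0]


# Over many games the bot's share of French Defences approaches 15%.
rng = random.Random(0)
sample = [pick_reply("1. e4", BOOK, rng) for _ in range(10_000)]
french_share = sample.count("e6") / len(sample)
print(f"French played in {french_share:.1%} of games")
```

In a single game the choice is random, but across many games the distribution matches the human one, which is the property the statistical opening bots are after.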
In one way this is the most human a bot can get: on average, they will play the opening exactly how humans do. In another way, is it really human to play every single opening? We think there's room for both, and I think playing the statistical opening bots is really fun. Give them a try!
How We Can Have Hundreds of High-Quality Bots
One common question is why we're not focusing on just a couple of really high-quality bots. Why are we spending time on creating hundreds and hundreds of bots? Isn't it better to focus our time on just making one bot great?
The answer is that with our approach, it's actually not that hard to create many bots, and in a way, we need many bots to cover all the strengths, playstyles and openings we want.
You can look at it as creating a lot of bots, and filtering out the promising ones, that we can then tweak to meet our standards.
I mentioned we would need thirty or forty thousand bots to cover enough openings. These numbers are not unreasonable, and we fully expect to have that many bots in a few years' time.
Bots Are Held to Higher Standards
One thing that came as a complete surprise to me is how users react to mistakes bots make.
We've had users complain that the bot dropped a bishop, which felt inhuman, and when looking at the game, that same user had just dropped an entire queen in one move.
The bots are simply held to much higher standards when it comes to blunders or strange decisions. It doesn't feel human for a bot to drop a piece in one move, even if humans at that level do that in every game.
This is something I didn't expect, but it is something we have to account for, and are adjusting to.
Our bots have to blunder in smart ways. We need to make sure that when dropping a piece, the capturing piece was far away, or there was a fork, or a discovery, or at least some sort of mitigating circumstance.
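To show what such a check could look like, here is a minimal sketch of a "mitigating circumstance" test. It is an assumption-heavy illustration: the function names, the use of king-move distance as a measure of "far away" and the threshold of four squares are all hypothetical, and detecting forks or discoveries is left to the caller.

```python
def square_distance(a: str, b: str) -> int:
    """Chebyshev (king-move) distance between two squares such as 'a1' and 'h8'."""
    file_gap = abs(ord(a[0]) - ord(b[0]))
    rank_gap = abs(int(a[1]) - int(b[1]))
    return max(file_gap, rank_gap)


def is_plausible_blunder(capturer_from: str, hanging_square: str,
                         is_fork: bool = False, is_discovery: bool = False,
                         min_distance: int = 4) -> bool:
    """Decide whether dropping a piece would look like a human mistake.

    The blunder is accepted if it comes from a fork or a discovered attack,
    or if the capturing piece starts at least `min_distance` squares away.
    """
    if is_fork or is_discovery:
        return True
    return square_distance(capturer_from, hanging_square) >= min_distance


# A bishop swooping in from a8 to take on h1 is easy to overlook...
print(is_plausible_blunder("a8", "h1"))  # True
# ...while leaving a piece en prise to a pawn right next to it is not.
print(is_plausible_blunder("d4", "e5"))  # False
```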
This actually makes it quite a challenge to make our weakest bots even weaker. In the latest iteration, our weakest bot is Ken Knightly at 790 rating (read How Chessiverse Ratings Work to see what exactly that means). We are looking into making weaker bots than that, but given the high standards they have to meet, it's actually not trivial.
Trying to Give the Bots a Personality
We have played around with generating chat messages from the bots to give them a bit of added personality, with varied results.
The decision tree on how the chat messages are generated is incredibly complex, more so than I really would like to admit. It takes a whole load of input, like position evaluations, move times, historic move times, previous messages and on and on. And through that, it arrives at a set of possible answers with probability weights, and finally a message is selected.
This message is then decorated with an intricate tonality, which is unique for each bot. For example Ethan Snide's tonality contains
... Speaks in sarcastic riddles, repeats words for unnecessary emphasis ...
among a lot of other things.

Finally, all of this is sent to ChatGPT, and out comes a final message that is shown in the chat.
I'm the first to admit that even though we spent a lot of effort on this (probably too much), right now we're not happy with how generic the chat messages sound and how uninteresting they become. Because of that, most users simply ignore them.
We will make an effort to revamp them, which will most likely involve drastically reducing the number of messages. The goal will be that each chat message should be worth reading, and if it isn't it should be removed.
We'll see if we succeed with that.
The AI Art Elephant in the Room
Since this is an important subject for many, I wanted to end with addressing the AI art we use for our bots.
When I started Chessiverse two years ago, AI art was very new and exciting. OpenAI had not even made their image generation fully public yet, and we used Midjourney to get images that, at the time, felt absolutely remarkable given the effort.
As a small startup with no funding, getting an artist to create five hundred profile and background images was completely out of the question as it would cost thousands and thousands of dollars, even from the cheapest creators. So the options we had were:
- No art at all
- Some generic avatar creator
- AI art
We went with the AI art.
Will we continue with the AI art? For now, that seems likely. If we're aiming to create tens of thousands of bots, it's really the only choice we have. We might look at how it's done to reduce the AI feel. And hopefully we can involve some artists in the process.
In Summary
We've looked at some core things we're doing to create our bots. Making our bots as human as… humanly possible will always be our top priority, and it's work that will most likely never be finished. We'll make them better and better, but we're well aware that there's always something we can improve.
If a bot makes a move you feel is unexpected, we're eager to hear about it. You can share it in our Discord or send it to us directly. Like I've mentioned, this is ongoing work, and our top priority, and the more data we get, the easier it is for us to adjust the bots.

