DeepMind: The Hanabi Card Game Is the Next Frontier for AI Research

Dear Fellow Scholars, this is Two Minute Papers
with Károly Zsolnai-Fehér. Now get this, after defeating Chess, Go, and
making incredible progress in Starcraft 2, scientists at DeepMind just published a paper
where they claim that Hanabi is the next frontier in AI research. And we shall stop… right here. I hear you asking me, Károly, after defeating
all of these immensely difficult games, now, you are trying to tell me that somehow this
silly card game is the next step? Yes, that’s exactly what I am saying. Let me explain. Hanabi is a card game where two to five players
cooperate to build five card sequences and to do that, they are only allowed to exchange
very little information. This is also an imperfect information game,
which means the players don’t have all the knowledge needed to make a good
decision. They have to work with what they have and
try to infer the rest. For instance, Poker is also an imperfect information
game because we don’t see the cards of the other players and the game revolves around
our guesses as to what they might have. In Hanabi, interestingly, it is the other
way around, so we see the cards of the other players, but not our own. The players have to work around this limitation
by relying on each other, working out communication protocols, and inferring intent in order to win
the game. Like in many of the best games, these simple
rules conceal a vast array of strategies, all of which are extremely hard to teach to
current learning algorithms. In the paper, a free and open source system
is proposed to facilitate further research work and assess the performance of currently
existing techniques. The game can also
be made easier or harder at will from both inside and outside the game. And by inside I mean that we can set parameters
like the number of mistakes allowed before the game is considered lost. The outside part means that two main game
settings are proposed: one, self-play, this is the easier case where the AI plays with
copies of itself, therefore it knows quite a bit about its teammates, and two, ad-hoc
teams can also be constructed, which means that a set of agents that are not familiar
with each other need to cooperate. This is immensely difficult. When I looked at the paper, I expected that as
we have many powerful learning algorithms, they would rip through this challenge with
ease, but surprisingly, I found out that even the easier self-play variant severely
underperforms compared to the best human players and handcrafted bots. There is plenty of work to be done here, and
luckily, you can also run it yourself at home and train some of these agents on a consumer
graphics card. Note that it is possible to create a handcrafted
program that plays this game well, as we humans already know good strategies; however,
this project is about getting several instances of an AI to learn new ways to communicate
with each other effectively. Again, the goal is not to get a computer program
that plays Hanabi well, the goal is to get an AI to learn to communicate effectively
and work together towards a common goal. Much like Chess, Starcraft 2 and DOTA, Hanabi
is still a proxy to be used for measuring progress in AI research. Nobody wants to spend millions of dollars
to play card games at work, so the final goal of DeepMind is to reuse this algorithm for
other applications where even we humans falter. I have included some more materials on this
game in the video description, so make sure to have a look. Thanks for watching and for your generous
support, and I’ll see you next time!
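
The information structure described above (each player sees every hand except their own, and the game is lost after a set number of mistakes) can be illustrated with a small Python sketch. This is only a toy model written for illustration, not the open source system from the paper: the class name `ToyHanabi` and all of its methods are hypothetical, and the deck composition and hand sizes are taken from the standard Hanabi rules rather than from the video.

```python
import random
from dataclasses import dataclass, field

COLORS = ["red", "yellow", "green", "blue", "white"]
# Standard Hanabi deck (assumption, not stated in the video):
# per colour, three 1s, two each of 2-4, and one 5, for 50 cards in total.
RANK_COUNTS = {1: 3, 2: 2, 3: 2, 4: 2, 5: 1}


@dataclass
class ToyHanabi:
    """Hypothetical toy model of Hanabi's hidden-information structure."""
    num_players: int = 2
    max_mistakes: int = 3  # the "inside" difficulty knob from the transcript
    seed: int = 0
    hands: list = field(default_factory=list)
    stacks: dict = field(default_factory=dict)
    mistakes: int = 0

    def __post_init__(self):
        rng = random.Random(self.seed)
        self.deck = [(c, r) for c in COLORS
                     for r, n in RANK_COUNTS.items() for _ in range(n)]
        rng.shuffle(self.deck)
        # Standard rule: 5 cards each for 2-3 players, 4 cards for 4-5 players.
        hand_size = 5 if self.num_players <= 3 else 4
        self.hands = [[self.deck.pop() for _ in range(hand_size)]
                      for _ in range(self.num_players)]
        self.stacks = {c: 0 for c in COLORS}  # highest rank played per colour

    def observe(self, player):
        """Return what `player` sees: every hand except their own (None)."""
        return [None if i == player else list(hand)
                for i, hand in enumerate(self.hands)]

    def hint_color(self, target, color):
        """Positions in the target's hand that a colour hint would point at."""
        return [i for i, (c, _) in enumerate(self.hands[target]) if c == color]

    def play(self, player, index):
        """Play a card blind; a card that does not extend a stack is a mistake."""
        color, rank = self.hands[player].pop(index)
        success = self.stacks[color] == rank - 1
        if success:
            self.stacks[color] = rank
        else:
            self.mistakes += 1
        if self.deck:  # draw a replacement card while any remain
            self.hands[player].append(self.deck.pop())
        return success

    @property
    def lost(self):
        return self.mistakes >= self.max_mistakes

    @property
    def score(self):
        return sum(self.stacks.values())
```

In this sketch, calling `observe(0)` returns `None` in position 0 and the full hands of everyone else, which is exactly why the players have to rely on hints from their teammates. Lowering `max_mistakes` makes the toy game harder in the "inside" sense mentioned in the video.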

23 thoughts on “DeepMind: The Hanabi Card Game Is the Next Frontier for AI Research”

  1. I just saw bits and pieces of Google's Stadia presentation and I'm surprised to already see a realtime application of style transfer in a big product.

  2. Finally! I've been following AI research for a long time, and I'm also really into boardgames. I've always known that boardgame (type games) are the deepest type of learning as it's the only area which I myself can still grow and learn in. The complexities between working with other free agents (humans in our case) is just unmatchable! Glad they are in a place to finally start cracking these challenges! Also AI research is finally moving into an area I know extensively! This game in particular is perfect as a test bed, I highly recommend people try it out to understand the subtle complexities in it! If you find it too easy, you need to play against better players!!! We're really getting close to true AI now! 🙂

  3. The consensus is that they didn't do anything that impressive with Starcraft. It didn't have the same limitations players did, and it was only able to run a single strategy. It won by performing more actions per second than a human player would ever be able to match. It also performed actions in multiple places on the map at the same time, because it wasn't limited to the same window that human players are. At best it was a gimmick, and was in no way as profound or impressive as the victory at Go.

  4. Oh man your videos are great, but that "mute" audio at the beginning of each new sentence is difficult to bear. I hope your audio software settings get revised on your next videos. Thanks.

  5. You guys heard it first here, on YouTube. DeepMind is done with SC2 scrubs and now it wants blood in a much harder game, in Hanabi.
    The weirdest board game no one ever played but somehow got a board game of the year award.
    Next in line are Munchkin, Exploding Kittens, Pathfinder, Rust and Minecraft.

  6. This tells me when DeepMind was playing StarCraft, it had perfect information of the map whereas human players could only see a limited field of view through the screen. Cheater.

  7. Two Minute Papers viewing algorithm:

    1. Thumbs up
    2. View video
    3. Click on another three 2MP links and open in new tab
    4. Goto 1.

  8. Could an adversarial neural network work as well, where the adversary for player agents is the card shuffler who tries to make sure that as many hands as possible would lead the players into a loss? Obviously, there are some combinations that always lead to a loss, but what about those that don't?

  9. Game Theory scenarios can get hugely complex to compute. I'm a little scared of what we might interpret from the results of machine learning and game theory.

  10. I can't believe it! I have been working privately on the exact same problem for about half a year now. Unfortunately, my neural network does not perform better than 6 points on average, which is still quite dumb. My experience is that it is somehow difficult for the AI to find the sweet spot between playing risky and playing super safe. I often found that after training, the AI would discard cards like 90% of the time, even if it does not make any sense, just to end the game without making mistakes… The other extreme is playing cards way too early and losing 90% of the time…

  11. So now we’re teaching AI to communicate with other AI using imperfect information. I’m studying AI and am all for it, but this is the one that scares me as an existential threat.

  12. It's true, though – moving from perfect information games to imperfect information games is a HUGE step forward for AI. There's also an issue inherent to card games that makes them several orders of magnitude harder to cope with than a game like chess – randomness. Every game of chess starts off in exactly the same way. With card games, the possibilities are almost literally infinite.
    And collectible card games would be another entire order of magnitude above that…

  13. Possible to simply train 4 different models and make them train together? Would reduce the knowledge each model has of its opponents.

  14. "stupid card game"
    Nah bro I've played hanabi, it's brutal, and if an AI can ever consistently do well at it then I'll be damn impressed.

  15. I appreciate that you said "Great progress in Starcraft II" instead of having "Beaten" Starcraft II. AlphaStar was deficient by several metrics.

  16. I feel like these videos are always missing some key piece of information to make them truly great. For example, most people are probably looking up the rules to Hanabi after watching this, which means that all the video accomplished was saying "This work exists", which is not very useful.
