AI Playing Games: Crash Course AI #12

Jabril: D7
John-Green-bot: Miss. John-Green-bot: I-9
Jabril: NOOOOOOOOOO Hey, I’m Jabril and welcome to Crash Course
AI! John-Green-bot might struggle with some things
like natural language processing and moving through the real world, but using AI he’s
pretty good at board games. AI researchers spend a LOT of time trying
to teach AI how to beat humans at games, & this isn’t just because games are fun. Games provide constrained scenarios for testing
new AI algorithms and approaches. In a game, it’s easy to know whether the
AI is winning or losing, because there’s usually a score or some objective measure
of “winning.” This is great because AI learns from examples,
trying things out, and slowly improving. Games basically provide their own training
data, which is a big relief because AI systems need lots of training data to get really good. An AI can even play against itself to generate
training data and evolve better game strategies. Or an AI can be programmed to look at previous
games (even games played by expert humans) for strategies that lead to victory. Comparing AIs against expert human gamers
can also help us figure out how an AI is improving over time. This comparison also gives us a sense of the
difficulty of problems an AI can solve. In other words, if I can teach John-Green-Bot
to beat me at battleship, I can probably also teach him to beat me at any game that’s
simpler than battleship. Finally, games are cool (and that’s important
too!). Sometimes AI programming can feel a bit difficult
or dry because of all the math and troubleshooting, and games can provide a fun motivation to
learn all this stuff. This is the reason why some of my first AI
demos were games. For all these reasons, games and computing
have a rich history that’s worth diving into. For that, let’s go to the Thought Bubble. Humans have always been fascinated by the
idea that machines could beat us at our own strategy games. In 1770, the inventor Wolfgang von Kempelen
revealed an invention to the Empress Maria Theresa of Austria. The Mechanical Turk was a clock-like contraption
that appeared to play chess. Chess pieces were attached to rods that were
hooked into a bathtub-sized box. After Empress Maria made a move on the board,
she turned a crank that activated the machine, which would move chess pieces mechanically. To her surprise, the machine was able to beat
most challengers. However, it was an elaborate hoax and The
Mechanical Turk was actually controlled by a person hidden inside! Getting a machine to play chess is actually
really complicated. So when AI researchers first tried to tackle
the problem in the late 1940s and early 1950s, they focused on simpler chess situations. Like, games with only a few pieces remaining
on the board or full games played on a small 6×6 board without any bishops. At the same time, researchers worked on an
AI that could play checkers, because checkers looked easier… although it was really almost
as complicated. The first program to play a legitimate game
of checkers was built by IBM in 1956. And, in a classic cold war move, two programs
that could play a full chess game were developed in parallel by the US and Russia in 1957. But these programs didn’t get /good/ for
another 40 years. Checkers was first, with a program called
Chinook which started dominating masters in 1995. Chess followed when a computer called Deep
Blue beat the chessmaster Garry Kasparov in 1997. Thanks Thought Bubble. Since then, strategy games have been mastered
one-by-one, with the most recent victories over humans at Go in 2017, DOTA 2 in 2018,
and Starcraft II in 2019. Okay, so the best way to understand the difficulty
of teaching AI to play games is through an example. Oh John Green Bot. So let’s start with a really simple goal:
teaching John-Green-bot to play tic-tac-toe. One of the ways that we can think about playing
tic-tac-toe is as a tree with all the possible moves from any given state of what the game
board looks like. For example, if this is the current game state,
it’s John-Green-bot’s turn, and he’s using Xs… there are three places he can
go. We can draw a tree representing possible outcomes
for each of these options, and all of the options his opponent (me, or anyone else)
can take: Because computers think with numbers, each
outcome can be assigned a reward — a number like a 1 for a win, and -1 for a loss or tie. Basically, John-Green-bot will need to search
through the tree of possibilities to find his win. To decide which choice to make, John-Green-bot
will assume that in each tree, both he AND his opponent will make the best possible choices. In other words, his policy (or his framework
for making decisions) will alternate between choosing the branch that will maximize the
outcome of winning on his turn, and minimize the outcome of his opponent winning on their
turn. This is called the minimax algorithm. Then, each game state can be assigned a value
based on how likely it leads to John-Green-bot winning, and he can decide which move to make
based on his policy. Looking at this tree, John-Green-bot will
always pick option 1.0, and win the game! Of course, this was a pretty small tree because
we were looking at a game in progress. To draw the whole tic-tac-toe tree from beginning
to end, we would need to represent about 250,000 boards. That seems like a lot, but it would take like
a half a second for a powerful modern computer to compute this many options. By laying out all the possibilities and taking
the paths that led to a win, John-Green-bot can solve tic-tac-toe. This means that John-Green-bot will always
achieve the best possible outcome, either a win or a tie, no matter how his opponent
plays. Thanks John Green Bot. But we can’t solve all games this way. Checkers, for example, has about 10 to the
20th power board states… or 10 followed by 20 zeros. That’s more board states than there are
grains of sand on Earth! Chess has 10 to the 50th power board states. And Go has 10 to the 250th power board states. To put those huge numbers into perspective,
there are only 10 to the 80th atoms in the entire known universe! Computer scientists have theorized that it
would be impossible for conventional computers to calculate this many states due to the laws
of physics. Like, for example, if you combined all planets
and stars and everything in the whole universe into a single supercomputer, it still wouldn’t
be powerful enough to solve the game of Go. But some people have hope that quantum computers
may be able to get there one day… So, if figuring out all of the board states
could be mathematically impossible, how did computers beat the number one ranked human
masters in Chess and Go? Many modern systems, including Google’s
AlphaGo computer that beat a human master in Go in 2017, use an algorithm called Monte
Carlo Tree Search. Monte Carlo is a famous casino, so whenever
you see the term “monte carlo,” it’s a good bet that the algorithm will be using
randomness and chance to solve a problem. Combining Monte Carlo randomness and regular
tree search like minimax, modern game AIs decide which part of the huge tree to search
by guessing at odds. Basically, they want higher odds that the
part of the game tree they search will lead to a win. But these aren’t just random guesses like
we would make in many casino games, AI systems can simulate millions of “what-if” scenarios
and use math to estimate the likelihood of winning if they choose one path or another. In each “what-if” scenario, the AI considers
making one particular move and then simulates playing a large number of (but not all) possible
games, where the next moves are chosen at random. By averaging these possible outcomes, the
AI estimates how “good” that particular move is. It’s so much faster to estimate a handful
of choices than exhaustively calculate each branch of the game tree. And some computers can even do this estimation
in real time. One example of this is Google’s DeepMind
which defeated human professional players at Starcraft II in 2019 — where time is
very critical. Of course, Starcraft II, Go, and Tic-Tac-Toe
aren’t all the types of games that humans play. Other games require other strategies and have
other computational challenges: IBM’s Watson question-answering system was
able to beat human Jeopardy! champions in two televised matches in 2011. Watson listened for keywords in the clue and
tried to use a knowledge graph to figure out responses. And we’ll talk more about knowledge graphs
in a future episode. Watson wasn’t perfect and struggled a bit
with context. For example, it famously guessed “What is
Toronto?” on something in the category “US Cities.” But Watson was still able to do better than
human contestants overall. Evolutionary neural networks use the environment
as an input, like reinforcement learning. But this approach introduces /multiple/ agents
who try multiple neural network structures, and then build on successful ones for the
next generation. Sort of like animals, the ones that are better
at surviving get to pass on their genes. For example, the AI MarI/O can learn how to
play a Super Mario World level by telling MarI/O what buttons it can push and that getting
farther to the “right” in the level is good. This AI will start by basically mashing buttons
at random, but as some button mashes get it farther to the right, it remembers and learns
from those successful attempts. In the next lab, we’ll build our own AI
to use this approach to /crush/ a video game that we built where John-Green bot destroys
trash. So, are there any games that are safe to play,
where humans will always have an edge and AI won’t be able to beat us? Computers definitely seem to struggle with
parts of language like humor, irony, metaphor, and wordplay. Computers also aren’t great at understanding
and predicting real people, who don’t always act “optimally,” so social games could
be more difficult too. But AI systems are finding some success in
bluffing games like online poker, so it’s important not to underestimate them. John Green Bot: All – in. Computers might also struggle with creativity
or surprise, because there’s not a really clear way to assign values to states. It’s difficult to assign a number to “how
creative” something is, compared to saying “go as far right as you can in the Mario
level” or “achieve a winning state in a chess game.” So, considering all of that, maybe games like
charades would be pretty stacked for a human victory. Or… what about pictionary? Hide-and seek??? We’d love to hear in the comments what games
you think are safe from AI. But in the next episode, which is another
lab, we’ll program an AI system to learn how to play an arcade-like game… and I’ll
beg John Green bot for my poker money back. I’ll see ya then Crash Course is produced in association with
PBS Digital Studios. If you want to help keep Crash Course free
for everyone, forever, you can join our community on Patreon. And if you want to learn more about the history
of games, we have a whole series about that.

Posts created 5600

73 thoughts on “AI Playing Games: Crash Course AI #12

  1. I recently heard Magic the Gathering couldn't be cracked by a computer, something to do with the number of unique cards and the complexity of the rules.

  2. You should see the Oxford's paper on a CNN that can play geometry dash.There are students in university that work to make an AI that can't even beat stereo madness lol! And it's a great video as always.

  3. I would love to see an AI win at Mao. It would need serious NLP capabilities to fully translate penalty cards into a rules engine. I'm sure it could be done, but man would it be fun to see that happen.

  4. The Starcraft victory was proven to be a cheat, with the machine using superhuman reflexes to win. Even without that fixed, in a rematch against the same winning strategies, the human was victorious (because he quickly learned the AI's vulnerability.)

  5. That 250,000 boards number doesn't consider symmetry. It can be reduced enormously. There are 3 opening moves, not 9. There are 5, 5, or 2 replies by player two. After those, there are only 66 total third moves, and even some of those can be pruned as either symmetrical or identical when considering the fourth move.

  6. I really like this series. Sometimes crash course feels a little dry to me, but I really like this guy, and I actually feel like I'm genuinely learning

  7. I would always fake a miss and lie, then move my battle ships every time my opponent hit one until I didn't have any more space to put them on the board. I would leave my smallest ships to the last because they were such small targets and I could move them easily.

  8. Love the channel, everything is really well-researched and beautifully presented! Really inspired me when I wanted to start my own. I just reached 100 subs, and want to grow even more. I recently uploaded a vid about the enlightenment, feel free to check it out (it's under two minutes)

  9. An interesting video but just one point about the number of possible games of tick tack toe. You can reduce the number significantly by using symmetry. The first move for example can be reduced down from 9 moves to just 3. The centre, the middle of any row/column and any corner. All other moves can be created using rotation and/or reflection.

  10. Hey, have you heard of the legendary difficulty known as Realjoel's Dad? They say you will never with that difficulty.

Leave a Reply

Your email address will not be published. Required fields are marked *

Begin typing your search term above and press enter to search. Press ESC to cancel.

Back To Top