One scorching summer day two weeks ago, I was sitting in a cavernous room in the Mojave Desert with about 2,000 other people I'd never met. Locked in a secure spot somewhere on the compound was more than $8 million in cash, which we, and others like us in nearby rooms, had paid for the privilege of sitting there. That day we sat for 14 hours. From time to time, one of us quietly got up and left, never to return. The last survivor among us would instantly become a millionaire.
We were playing poker. And without my knowledge, two Intel processors on the other side of the country had recently been put through a similar test. As the World Series of Poker reached its crescendo in Las Vegas, two computer scientists announced that they had created an artificial intelligence poker player that is stronger than a full table of top human professionals at the most popular form of the game: no-limit Texas Hold'em.
Noam Brown, a researcher at Facebook AI Research, and Tuomas Sandholm, a computer scientist at Carnegie Mellon, describe their findings in a new paper, titled "Superhuman AI for Multiplayer Poker," published today in the journal Science.
Over the past few decades, artificial intelligence has surpassed the best humans at many of our species' most beloved games: checkers and its long-term planning, chess and its iconic strategy, Go and its complexity, backgammon and its element of chance, and now poker and its incomplete information. Ask the researchers who worked on these projects why they do it, and they will tell you one thing: games are a test bed. In games, techniques can be tested, results measured and machines compared with humans. And each game adds a layer that models the real world a little more accurately. The real world requires planning, requires strategy, is complex, is random and, perhaps most annoyingly, is full of hidden information.
"No other popular recreational game captures the challenges of hidden information as effectively and as elegantly as poker," write Brown and Sandholm.
Over the last nine months, I've been working on a book on the collision of games and AI.
Poker has been one of the last frontiers among these popular games, owing to its complexity and to the fact that players hide important information from one another, and that frontier is being settled quickly. The computer's conquest of poker has been incremental, and most of the work so far has focused on the relatively simple heads-up, or two-player, version of the game.
In 2007 and 2008, computers led by a program called Polaris showed promise in early man-versus-machine matches, fighting on an equal footing with human pros in heads-up limit Hold'em, in which two players are restricted to specific, fixed bet sizes.
In 2015, heads-up limit Hold'em was essentially "solved" thanks to an AI player named Cepheus. This meant that Cepheus's play could not be distinguished from perfection, even after a lifetime of observation.
In 2017, at a casino in Pittsburgh, a quartet of human professionals faced off against a program called Libratus in the incredibly complex arena of heads-up no-limit Hold'em. The human professionals were summarily destroyed. At about the same time, another program, DeepStack, also claimed superiority over human pros in heads-up no-limit.
In 2019, Wired reported that the game-theoretic technology behind Libratus was being put to use in the service of the U.S. military, in the form of a two-year contract worth up to $10 million with a Pentagon agency called the Defense Innovation Unit.
Brown and Sandholm's latest creation, called Pluribus, is superhuman at no-limit poker with more than two players (six, to be precise), which is identical to one of the most popular forms of the online game and very similar to the game I was playing in that room in the desert.
In an important early game theory paper from 1951, one of the fathers of the field, John Nash, examined an ultra-simplified version of poker, calling the game the "most obvious target" for applications of his theory. "The analysis of a more realistic poker game than our very simple model should be quite an interesting affair," he wrote. He predicted that the analysis would be complicated and that computational methods would be required. He was right.
Like other superhuman AI players, Pluribus learned to play poker by playing against itself, over eight days and 12,400 CPU core hours. It begins by playing randomly. It observes what works and what doesn't. And it adjusts its approach along the way, using an algorithm that steers it toward Nash's eponymous equilibrium. This process produced its plan of attack for the entire game, the so-called "blueprint strategy," which was computed offline before the competition. The authors estimate that at current cloud computing rates, this would cost only $144. During its competitive games, Pluribus searches in real time for improvements to its coarse blueprint.
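
The loop described above (start random, see what works, shift toward what you regret not having played) can be illustrated with a toy. The sketch below is not Pluribus's algorithm; the Science paper describes a form of counterfactual regret minimization applied to poker itself. It uses plain regret matching, a much simpler relative, on rock-paper-scissors, whose Nash equilibrium is to play each move a third of the time. The function names, the seed and the iteration count are all illustrative assumptions.

```python
import random

ACTIONS = ["rock", "paper", "scissors"]
# PAYOFF[a][b]: payoff to a player choosing action a against action b.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def current_strategy(regrets):
    """Regret matching: play each action in proportion to its positive regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total == 0:
        # No action has positive regret yet, so play uniformly at random.
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in positive]


def self_play(iterations, seed=0):
    """Train two regret-matching players against each other.

    Returns each player's average strategy over all iterations.
    """
    rng = random.Random(seed)
    n = len(ACTIONS)
    regrets = [[0.0] * n, [0.0] * n]
    strategy_sums = [[0.0] * n, [0.0] * n]

    for _ in range(iterations):
        strategies = [current_strategy(r) for r in regrets]
        moves = [rng.choices(range(n), weights=s)[0] for s in strategies]

        for player in (0, 1):
            mine, theirs = moves[player], moves[1 - player]
            earned = PAYOFF[mine][theirs]
            for action in range(n):
                # Regret: how much better this action would have done.
                regrets[player][action] += PAYOFF[action][theirs] - earned
                strategy_sums[player][action] += strategies[player][action]

    return [[s / iterations for s in sums] for sums in strategy_sums]


if __name__ == "__main__":
    for player, avg in enumerate(self_play(50_000)):
        print(player, {a: round(p, 3) for a, p in zip(ACTIONS, avg)})
```

The move-by-move strategies keep swinging as each player chases the other, but the average strategy over all rounds drifts toward one-third for each move. It is the average, not the final iterate, that approaches equilibrium in this kind of self-play.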
The finished program, which ran on just two Intel CPUs, competed against top human players, each of whom had won at least $1 million as a pro, in two experiments over thousands of hands: one with a single copy of Pluribus and five humans, and one with a single human and five copies of Pluribus. The humans were paid by the hand and were also given cash incentives from Facebook to play their best. Pluribus proved profitable, with statistical significance, in both experiments, a result worthy of publication in Science.
"I think this was the last milestone in poker," Brown said. "I think poker has served its purpose as a benchmark and AI challenge."
"I'm probably more experienced in battling top-notch poker AI systems than any other poker professional in the world," said Jason Les, one of Pluribus's opponents. "I know all the places to look for weaknesses, all the tricks to exploit a computer's shortcomings. In this competition, the AI played a sound, game-theory-optimal strategy that you really only see from top human professionals, and despite my best efforts, I did not succeed in finding a way to exploit it. I would not want to play in a poker game where this AI poker bot was sitting at the table."
Sandholm and Brown told me that they expect Pluribus's technology to have even broader applications than that of previous bots. They see Pluribus as the first multiplayer (as in more than two players) AI gaming milestone, and they think it could bear on a number of multiplayer "games" in the real world: auction bidding, multi-party negotiation, pricing for online retailers, advertising for presidential candidates, cybersecurity and even self-driving cars.
In that cavernous room in the desert at the World Series of Poker in Vegas, people weren't thinking about political ads or self-driving cars, but many of them were thinking about game theory. The game's best pros increasingly turn to the academic AI literature, to commercially available programs such as PokerSnowie and PioSOLVER, and even to doctoral students, whom they hire as consultants, to improve their games. As a result, the quality of human poker has never been higher, and Pluribus may raise it further still.
But I have spoken to both professionals and scientists who believe that poker AI may kill the very game it is trying to conquer. In fact, it may already have killed heads-up limit. On the one hand, these skeptics argue, modern elite poker can feel sterile, with young professionals making optimal plays from behind sunglasses and beneath headphones. The game lacks the engaging human characters it needs to put on a good show and attract a new generation. On the other hand, poker is like a pyramid scheme: it requires a wide base of skill levels to support the pros playing for money at the top. If people learn quickly from the bots, they all become good, the skill levels flatten, the pyramid collapses, and the game dies.
"Unfortunately, there may be some merit to that," Sandholm said. "That would be very sad. I've come to love this game."