ESSAYS Mousse 53
Move 37: Artificial Intelligence, Randomness, and Creativity
by John Menick
The world’s top Go player Lee Sedol playing against Google’s artificial intelligence program AlphaGo during the Google DeepMind Challenge Match in Seoul, South Korea, March 2016.
In a recent match between the computer program AlphaGo and the Korean professional Go champion Lee Sedol, the software carried out an unexpected move that took its human opponent off guard. Was it mere error? John Menick ponders the evolution of artificial intelligence, and the possibility that move thirty-seven was the result of a skillful, beautiful creativity.
Unlike most areas of scientific study, artificial intelligence (AI) research has led a bipolar existence, alternating between periods of manic ambition and depressive self-loathing. The parabolic history began on a peak in 1956, during a summer conference at Dartmouth, where the founding fathers of AI named their field and outlined its goals. The conference gathered the best names in nascent computer science, including Claude Shannon, Marvin Minsky, and John McCarthy. They promised, given a “two-month, ten-man study,” that they would be able to make “a significant advance” on fundamental AI problems. Those problems included: how computers can use language, how “hypothetical” neurons could be used to form concepts, and the role of self-improvement in computer learning. No significant technical progress was made that summer, and little more was achieved on those fundamental issues during the following decades. To date, AI research has accomplished few of its deeper ambitions, and there is doubt as to whether its modest successes illuminate the workings of even the simplest animal intelligence.
However, when it comes to trivial pursuits—in particular, games—AI has accomplished more, albeit decades later than planned. Computer science can measure advances in gaming in two ways: first, by whether a game has been “solved”—mathematical proofs for predicting optimal outcomes of perfect play—and, second, by playing champions against computer programs. The former technique, though computer-aided, is fundamentally mathematical, while the latter is contingent upon the quality of available players. One does not necessitate the other, and while solving games mathematically might be more intellectually rigorous, beating a world champion in a high-profile match gets better publicity. World-class competitive checkers programs, for example, emerged in the 1990s, but the game was only “weakly” solved in 2007. Chess remains partially solved, maybe permanently partially solved, though the best human player, Garry Kasparov, famously lost to IBM’s Deep Blue in 1997.
Garry Kasparov holding his head at the start of the sixth and final match against IBM’s Deep Blue computer, New York, 1997. Photo: © Stan Honda / AFP / Getty Images
This past year, DeepMind, a British artificial intelligence company owned by Google, announced that it had created a computer program, AlphaGo, capable of beating a professional Go player. Go is a 2500-year-old game, and unlike chess, Go was considered too complex for artificial intelligence to master. It also will remain, most likely, unsolved. Go has few rules—around ten in total—and is played on a grid of 19 by 19 lines. However, although Go’s rules are simple, it is a much more difficult game to calculate than other board games, both because of the large number of possible moves and because of the difficulty of determining a piece’s value at any one time. Because of this, Go is often considered a strategic, but intuitive, game. A Go master is able to name a good move and reason about why it is a good move, but the formal, mathematical quantification of that move’s superiority is far more difficult. Chess, with more rules but fewer possible moves, can be more easily calculated and searched by a machine. IBM’s Deep Blue, for example, could search through available moves at the rate of 200 million moves a second. Even given contemporary computing speeds, a Go-playing program with the same search algorithm would be unable to thoroughly evaluate the number of possible moves at any one time. A Go-playing machine would have to be able to play intuitively, something no computer can do; and until the creation of AlphaGo, AI researchers believed a Go-playing program capable of beating a master was impossible.
That was until this last October when AlphaGo beat the European champion, Fan Hui, ranked 663rd in the world. Observers qualified their praise, however: DeepMind’s program was certainly an accomplishment, but a sizeable population of Go champions could, in theory, beat it. A second series of matches was scheduled for the following March in Seoul, South Korea, against the world’s top player, Lee Sedol. While tens of millions watched on television, AlphaGo went on to win the tournament, winning four of the five games. The win represented two firsts in AI history: a major goal was reached far before anyone thought it possible, and human players no longer dominated one of the oldest and most highly respected board games.
Go, board and stones. Photo: © Bork / Shutterstock
If a fictional computer scientist were to build a machine capable of writing a novel, she might first break apart the novel into its atomic parts, dividing and subdividing narratives into sentences, sentences into grammar, grammar into phonemes. Our fictional computer scientist—she is, say, a cognitive scientist—would first need to know how natural language works, its rules and structure, its generative capabilities. Simultaneously, she and her team would diagram narrative structure, understanding the simple rules that give rise to larger and more complex forms of storytelling. From there, she would perhaps construct a general anthropology of readership and culture, to understand exactly when a story appeals to an audience. The computer scientist would need teams of various linguists, programmers, anthropologists, historians, literary theorists… a cadre of researchers committed to understanding scientifically what no one has come to understand fully: how literature works.
A second computer scientist would take a different approach. Instead of trying to understand what is impossible to know, the second scientist would build a computer that could learn on its own. The computer would begin as a tabula rasa; it would not understand what a phoneme or a sentence or a novel or a marriage plot is. The only thing it would be programmed to do is to recognize patterns and to classify these patterns. Perhaps it would be guided by human input, or perhaps it wouldn’t need any human assistance. After looking at millions of data points—grammar books, billions of Internet pages, best sellers, magazines—it would form an idea of what an English sentence looks like, and what shape, say, a detective plot might take. Perhaps the machine’s learning would be supplemented by a reward system in which good narratives, award-winning narratives, would be given higher ratings than potboilers. Unlike our first computer scientist, our second scientist has only one task: build a computer capable of learning, of responding to rewards, a silicon version of radical behaviorism in which our artificial novelist learns to write only through repetition and reinforcement.
Deep Blue at IBM headquarters in Armonk, New York. Photo: © Yvonne Hemsey / Getty Images
The methods employed by our two fictional scientists have been, depending on the decade, the most popular techniques in AI research. Symbolic AI, often called “Good Old-Fashioned Artificial Intelligence” (GOFAI), once dominated the field; though, as its longer moniker implies, GOFAI no longer holds the status it once did. In the beginning, the reasoning behind GOFAI was somewhat sound, if oblivious to its own ambition. Since a computer is a universal manipulator of symbols, the reasoning went, and since mental life is involved in the manipulation of symbols, a computer could be constructed that had a mental life. This line of thought led to the creation of byzantine “knowledge bases” and ontological taxonomies, composed of hierarchies of objects and linguistic symbols. Computers were taught that space had direction and balls were kinds of objects and that objects like balls could be moved in space from left to right. Robots were constructed that moved balls, and when some new task was required of the robot, new ontologies and taxonomies were dreamt up. What didn’t cross anyone’s mind, apparently, was that the amount of symbolic data needing classification was unknown, maybe infinite, and the very classification of this data was socially constructed, i.e. it changed over time and space. Even more devastatingly, although the brain may work somewhat like a silicon-based computer, it is in many instances unlike one, too—especially in its very construction, which is massively parallel and, unlike a computer, is not based on Boolean algebra.
This led scientists to reverse the analogy: instead of believing that brains worked like computers, why not build computers that worked like brains, with silicon “neurons” that worked in parallel and reinforced their own connections over time? This “connectionist” movement rose and fell over the course of the 20th century, but today it is, generally, the most successful of all fields of AI research. Our second scientist, the behaviorist, belongs to the connectionist movement, and her tool of choice is an artificial neural network. Neural networks are trained on sample data or repeated interactions with a given environment, with either training or reinforcement guiding their learning. Networks can be adapted to particular data without engineers having to know much about the data ahead of time. This makes neural networks extremely good at tasks involving pattern recognition. Neural networks can be trained to recognize cats in photographs, for example, by feeding millions of images of cats and non-cats into the program. Unlike symbolic AI, no engineer has to figure out what makes a cat look like a cat. In fact, an engineer can learn from a machine what makes a cat likeness cat-like. And whereas rule-based (heuristic) searches of possibility trees work well for mastering chess, neural networks work much better for other types of games, including Go.
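The kind of learning described above can be sketched in miniature. The following toy example—an illustration of the principle, not DeepMind’s code—trains a single artificial “neuron” to classify points purely from labeled examples; both the data and the hidden rule separating them are invented for the demonstration, and the network is never told the rule.

```python
import math
import random

random.seed(0)

# Toy data: a point is labeled 1 if x + y > 1.0, else 0 -- a stand-in
# for "cat vs. non-cat". The neuron only ever sees labeled samples.
samples = [(random.random(), random.random()) for _ in range(400)]
data = [((x, y), 1.0 if x + y > 1.0 else 0.0) for x, y in samples]

w1 = w2 = b = 0.0          # the neuron's connection strengths, initially blank
lr = 0.5                   # learning rate

def predict(x, y):
    # Sigmoid activation: squashes the weighted sum into a 0-1 "confidence"
    return 1.0 / (1.0 + math.exp(-(w1 * x + w2 * y + b)))

# Repeated exposure reinforces the connections that reduce error
for _ in range(2000):
    (x, y), label = random.choice(data)
    err = predict(x, y) - label    # how wrong the neuron was this time
    w1 -= lr * err * x
    w2 -= lr * err * y
    b -= lr * err

correct = sum((predict(x, y) > 0.5) == (label > 0.5) for (x, y), label in data)
accuracy = correct / len(data)
print(round(accuracy, 2))
```

After a few thousand exposures the neuron classifies the points well, having inferred the boundary from examples alone—the connectionist wager, at its smallest possible scale.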
Atari 2600 video game console “Darth Vader”. Photo: User: Evan-Amos / Wikimedia Commons / CC-0
It’s obvious, then, why neural networks, and machine learning in general, are DeepMind’s specialty. Several years before AlphaGo, DeepMind created a program capable of mastering any Atari 2600 video game. Released in 1977, the Atari 2600 was an extremely popular home video game console, for which hundreds of games, including Pac-Man, Breakout, and Frogger, were released. Before playing, DeepMind’s program knew nothing about any of these games. During play, it was only given screen pixel information, the score, and joystick control. Its only task was to increase the score, and, depending on the game, not be killed. The program did not understand how the controls worked, or what it was controlling. It began playing through trial and error, a gaming marathon lasting thousands of games, with each generation remembering ever better patterns for conquering the program’s new 2D world.
A video shows the DeepMind program playing Breakout, the classic arcade and home video game built by Steve Wozniak and Steve Jobs. The goal of the game is to use a ball to tunnel through a wall occupying the top portion of the screen. The ball drops from the wall to the bottom of the screen, and the player must use a paddle to bounce the ball back up to the wall. If the ball reaches the bottom of the screen, the player’s turn is over. In the first ten minutes, the program’s play is awful, nearly random. The paddle shivers left and right uncontrollably; whenever it connects with the ball, it seems to be accidental. After 200 plays, one hour of training, the computer is playing at a poor level, but it clearly understands the game. The paddle occasionally finds the ball, though not always. At 300 plays, two hours, the program is playing better than most humans. No balls get past the paddle, even balls arriving at angles that seem impossible to reach.
DeepMind researchers could have stopped playing at this point and moved on to a different game, but instead they left the program running. After two more hours of expert play, the program did something none of the researchers predicted: it discovered a novel style of play in which it used the ball to tunnel through one side of the wall. When the ball shot through the newly created tunnel, it ricocheted behind the wall, clearing out the wall from behind. It was an elegant and efficient style of play, one that showed an extreme economy of means. Nor was it anticipated by DeepMind’s engineers; it was an emergent result of the program’s learning.
Google DeepMind’s Deep Q learning playing Atari’s Breakout.
Was this style of play inevitable? If we were to duplicate DeepMind’s Atari player—making two, three, or more—and if we were to then put the programs to work in parallel, would each discover the same tunneling strategy? Is tunneling a creative strategy or is it the result of a search for optimal play? Would two or three different programs evolve different styles of play, or would their play always be honed down to the most efficient play? In a simple game like Breakout, it is easy to imagine that efficiency, and not creative style, would define expert play. The program is simply trying to find a global maximum, i.e. the best score with the least amount of effort.
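The search for a global maximum through trial, error, and reward can likewise be sketched in miniature. The toy below uses tabular Q-learning—a far simpler relative of the deep Q-network DeepMind actually used—on an invented five-cell “game” in which the agent, told nothing about the rules, discovers that moving right earns the only reward.

```python
import random

random.seed(1)

# A five-cell corridor: the agent starts at cell 0 and "scores" by reaching
# cell 4. Like the Atari player, it knows nothing at the start -- it only
# observes states, tries actions, and is rewarded for a higher score.
N_STATES = 5
ACTIONS = (-1, +1)                      # move left or move right
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, eps = 0.5, 0.9, 0.2       # learning rate, discount, exploration

for episode in range(200):              # thousands of games, in miniature
    s = 0
    while s != N_STATES - 1:
        if random.random() < eps:       # occasionally explore at random
            a = random.choice(ACTIONS)
        else:                           # otherwise exploit what is known
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s2 = min(max(s + a, 0), N_STATES - 1)
        r = 1.0 if s2 == N_STATES - 1 else 0.0
        # Standard Q-learning update: nudge the estimate toward reward
        # plus the discounted value of the best known follow-up move
        best_next = max(Q[(s2, act)] for act in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s2

# After training, greedy play always moves right: the global maximum
policy = [max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(N_STATES - 1)]
print(policy)
```

Early episodes are long and aimless—the shivering paddle—until lucky accidents are recorded in the value table and reinforced; the trained policy then heads straight for the reward. Nothing here implies style, only efficiency, which is precisely the question the Breakout tunneling raises.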
In an extremely complex and aesthetic game like Go, however, expert creativity and style are essential to grandmaster play. To achieve this, DeepMind used a combination of neural networks and tree search algorithms, the latter being the more traditional technique in gaming AI. The training began with the processing of 30 million moves from human games and concluded with AlphaGo playing thousands of games against itself. This latter period meant that AlphaGo could develop its own style of play and would not rely simply on regurgitating moves from a database.
This was most evident during the second match between Sedol and AlphaGo, when, on move 37, the computer made a very unusual move. There had been several odd moves in the tournament until that point, but move 37 caused one of the English-language live commentators, Michael Redmond, a top Go player, to squint in disbelief at the match’s live video feed.
“That’s a very surprising move”
he said, his head pivoting back and forth between the board and the monitor, making sure he did not get it wrong. At that moment, a clearly rattled Sedol left the room, returning a few minutes later, and taking 15 minutes for his next move. The commentator was clearly impressed by AlphaGo’s move, and did not see it as a mistake. Speaking to his co-commentator, he said:
“I would be a bit thrown off by some unusual moves that AlphaGo has played…. It’s playing moves that are definitely not usual moves. They’re not moves that would have a high percentage of moves in its database. So it’s coming up with the moves on its own… It’s a creative move.”
AlphaGo is not self-aware. Its intelligence is not general purpose. It cannot pick up a Go stone. It cannot explain its decisions. Despite these limitations, AlphaGo’s moves are not only competent, but are also intimidatingly original. Paradoxically, however, AlphaGo is at its most intimidating when it makes a mistake. It is unknown whether move 37 was a poor move, a software error, a rare misstep in an otherwise perfect game. Alternatively, it could have been an example of superhuman play, a brilliant insight made by a machine unrestricted by social convention. Or maybe the move was intentionally random. Perhaps AlphaGo, a program with no psychology, picked up a secondhand understanding of psychological warfare. AlphaGo could never know that its move would cause Sedol to leave the room, but the machine might have understood what effect the move would have on its opponent’s play.
There is a precedent for this kind of unsettling move. At the end of the first 1997 match between Deep Blue and Garry Kasparov, the computer’s 44th move placed its rook in an unexpected position. Deep Blue went on to lose the game, but the move was so counterintuitive—bad, perhaps—that Kasparov spent the next game thinking Deep Blue understood something he did not. Distracted, Kasparov lost the second game. Later interviews by statistician Nate Silver revealed that the move was the result of a software bug. Unable to choose a move, Deep Blue defaulted to one that was purely random. Kasparov worried over the move’s apparent superiority, when, in reality, it was the result of nothing but chance.
In his book The Signal and the Noise, Silver argued that Deep Blue’s random move should not be considered creative. He contrasted the move with Bobby Fischer’s knight and queen sacrifices made during his 1956 game against Donald Byrne (a.k.a. “The Game of the Century”). Fischer, thirteen years old at the time, cunningly tricked Byrne into taking his two important pieces, luring the older master into a trap unprecedented in the history of chess. No computer, argued Silver, would have made such sacrifices. Silver is right: the heuristics of computer chess are biased toward conservative moves. A computer would have no need to be as daring and flashy as the young, brilliant Fischer. But AlphaGo is not using simple heuristics. Its neural networks were built not to repeat past plays, but to create new strategies. This doesn’t rule out the possibility of a bad move, but it reduces the chance of pure randomness, as was the case in the Kasparov match. In other words, AlphaGo’s use of a neural network increases the likelihood that the 37th move was novel and creative, albeit perhaps not a work of brilliance.
It’s tempting to think that Sedol’s anxiety, like Kasparov’s, was caused by his inability to distinguish between brilliance and error. The commentators took the move very seriously, even though one initially admitted,
“I thought it was a mistake.”
The world’s top Go player Lee Sedol (on the right) playing against Google’s artificial intelligence program AlphaGo during the Google DeepMind Challenge Match in Seoul, South Korea, 10 March 2016. Photo: © Handout / Getty Images
The fact that AlphaGo built a winning game around the position was used as proof that the move was creative and skillful, but this could be a post-hoc rationalization. An alternative explanation is that AlphaGo made a poor move but recovered subsequently, smoothing over the mistake with superior play that made the move look, in hindsight, better than it was.
No matter what the reality, move 37 condenses the complex undecidability of machine creativity into a single moment. But what if the problem is more decidable than it appears? What if we were to say that an author, or a living being, is not needed for creativity, and that the real problem is our insistence on the primacy of an author? The position is not as absurd as it might sound, especially when one considers that Darwinian natural selection is purposeless, directionless, and creatorless, yet it is the most creative force known to science. Natural selection is what got us here; it has even created the brains of the computer scientists that designed AlphaGo. So if intentionality is not required for formal creativity, and given that natural selection is the source of human intelligence, then artificial creativity can be seen not as an artificial copy of true creativity, but as a continuation of natural selection’s creativity by other means. Put another way, “mistakes” in nature—mutations—are what drive evolution, and just as adaptations come about through happenstance, it might not matter whether a winning move by AlphaGo is accidental or not. All that matters is that the move worked. And just as machine learning preserves past successes—accidental or not—in memory, natural selection registers past success through genetic inheritance, building complex organisms out of millions of lucky mistakes. Thus evolution, contrary to many popular assumptions, is not a purely random process, but one that uses information storage as a bulwark against circumstance, keeping the good accidents and discarding the bad. In this way, random software errors or examples of surprisingly bad play that nonetheless work can be seen as operating like genetic mutation and drift: as long as an error works to a program’s advantage within a particular context, it does not matter whether it was intentional.
But perhaps the best way to understand machine creativity is through its strangeness; like natural selection, artificial creativity is perhaps at its strongest when its works are the least familiar. As Go master Fan Hui said about move 37:
“It’s not a human move. I’ve never seen a human play this move. So beautiful.”
John Menick is an artist, writer, and computer programmer. He lives in New York.
Originally published on Mousse 53 (April–May 2016)