December 6, 2018
DeepMind’s artificial intelligence programme AlphaZero is now showing signs of human-like intuition and creativity, in what developers have hailed as a ‘turning point’ in history.
The computer system amazed the world last year when it mastered the game of chess from scratch within just four hours, despite never being taught how to win.
But now, after a year of testing and analysis by chess grandmasters, the machine has developed a new style of play unlike anything ever seen before, suggesting the programme is now improvising like a human.
Unlike the world’s best chess machine – Stockfish – which calculates millions of possible outcomes as it plays, AlphaZero learns from its past successes and failures, making its moves based on a ‘nebulous sense that it is all going to work out in the long run,’ according to experts at DeepMind.
When AlphaZero was pitted against Stockfish in 1,000 games, it lost just six, won convincingly 155 times and drew the remaining 839.
Yet it was the way it played that amazed developers. While chess computers predominantly like to hold on to their pieces, AlphaZero readily sacrificed its soldiers for a better position in the skirmish.
Speaking to The Telegraph, Prof David Silver, who leads the reinforcement learning research group at DeepMind, said: “It’s got a very subtle sense of intuition which helps it balance out all the different factors.
“It’s got a neural network with millions of different tunable parameters, each learning its own rules of what is good in chess, and when you put them all together you have something that expresses, in quite a brain-like way, our human ability to glance at a position and say ‘ah ha this is the right thing to do’.
“My personal belief is that we’ve seen something of a turning point where we’re starting to understand that many abilities, like intuition and creativity, that we previously thought were in the domain only of the human mind, are actually accessible to machine intelligence as well. And I think that’s a really exciting moment in history.”
AlphaZero started as a ‘tabula rasa’, or blank slate, system programmed with only the basic rules of chess. It learned to win by playing millions of games against itself in a process of trial and error known as reinforcement learning.
It is the same way the human brain learns, adjusting tactics based on a previous win or loss, which allows it to search just 60,000 positions per second, compared with roughly 60 million for Stockfish.
Within just a few hours the programme had independently discovered and played common human openings and strategies before moving on to develop its own ideas, such as quickly swarming around the opponent’s king and placing far less value on individual pieces.
The new style of play has been analysed by chess grandmaster Matthew Sadler and Women’s International Master Natasha Regan, who say it is unlike that of any traditional chess engine.
“It’s like discovering the secret notebooks of some great player from the past,” said Sadler.
Regan added: “It was fascinating to see how AlphaZero’s analysis differed from that of top chess engines and even top Grandmaster play. AlphaZero could be a powerful teaching tool for the whole community.”
Garry Kasparov, former World Chess Champion, who famously lost to chess machine Deep Blue in 1997, said: “Instead of processing human instructions and knowledge at tremendous speed, as all previous chess machines, AlphaZero generates its own knowledge.
“It plays with a very dynamic style, much like my own. The implications go far beyond my beloved chessboard.”
The new analysis was published yesterday in the journal Science, and the DeepMind team are now hoping to use their system to help solve real-world problems, such as why proteins become misfolded in diseases like Parkinson’s and Alzheimer’s.
The new results suggest that it could come up with new solutions that humans might miss or take far longer to discover.
DeepMind CEO and co-founder Demis Hassabis said: “The reason that tabula rasa was important is because we want this to be as general as possible. The more general it is across the games the more likely it will be able to transfer to real world problems. Like protein folding.
“Protein folding has always been our number one target. I’ve had that in mind for a long time, because it’s a huge problem in biology and it will unlock a lot of other things like drug discovery.
“In chess AlphaZero works not because it’s looking further ahead but because it understands the position better. It’s generalising from past experience. It’s almost like intuition in the same way a human grandmaster would think about it; its evaluation of the current situation is better. And if your evaluation is better then you don’t have to do as much calculation.”
Prof Silver added: “Historically there has been this amazing mismatch between the things that humans can do and the things that computers can do.
“With the advent of powerful machine learning techniques we’ve seen that the scales have started to tip and now we have computer algorithms that are able to do these very human-like activities really well.”