In the race to solve Rubik’s Cube, the time-to-finish keeps shrinking. This year Philipp Weyer from Germany won the 10th World Cube Association (WCA) Championship held in Melbourne, Australia, with a 6.74-second performance. Not bad for a human. He was named champion on July 14. Literally one day later a research team from UC Irvine published in Nature Machine Intelligence a description of their latest AI-based Rubik solver, DeepCubeA, and its ability to solve Rubik’s Cube in about a second.
We humans have to step up our game.
DeepCubeA, you may know, is the successor to DeepCube sans suffix and developed by the same UCI team. Way back in 2018 their then-new algorithm, more prosaically named Autodidatic Iteration (DeepCube), was “able to solve 100% of randomly scrambled cubes while achieving a median solve length of 30 moves — less than or equal to solvers that employ human domain knowledge.” HPCwire duly chronicled DeepCube’s arrival (New Deep Learning Algorithm Solves Rubik’s Cube).
Rubik’s Cube is a member of a class of problems whose solution has proven difficult for deep reinforcement learning because there are a large number of states and only one reward state. In this instance, Rubik’s Cube has a large state space, with approximately 4.3 × 1019 different possible configurations. The lack of many ‘reward states’ makes it difficult to develop a solving strategy. DeepCube (version one) represented an advance in DRI and was used in conjunction with Monte Carlo Tree Search (MCTS).
The new ‘champ’ ditches MCTS for something called weighted A* search. There’s a nice account of the recent work in Forbes. One important element of both solvers is their ability find the solution without domain knowledge or human coaching. Before going further if you want to test yourself by solving Rubik’s Cube, check out this online exercise posted by the UCI researchers: http://deepcube.igb.uci.edu
Here’s a description from the latest paper by Forest Agostinelli, Stephen McAleer, Alexander Shmakov, and Pierre Baldi, all of UCI:
“DeepCubeA builds on DeepCube, a deep reinforcement learning algorithm that solves the Rubik’s cube using a policy and value function combined with Monte Carlo tree search (MCTS). MCTS, combined with a policy and value function, is also used by AlphaZero, which learns to beat the best existing programs in chess, Go and shogi. In practice, we find that, for combinatorial puzzles, MCTS has relatively long runtimes and often produces solutions many moves longer than the length of a shortest path. In contrast, DeepCubeA finds a shortest path to the goal for puzzles for which a shortest path is computationally verifiable: 60.3% of the time for the Rubik’s cube and over 90% of the time for the 15 puzzle, 24 puzzle and Lights Out…
“…We use a variant of A* search called weighted A* search. Weighted A* search trades potentially longer solutions for potentially less memory usage by using, instead, the function f(x)= λg(x) + h(x), where λ is a weighting factor between zero and one. Furthermore, using a computationally expensive model for the heuristic function h(x), such as a DNN, could result in an intractably slow solver. However, h(x) can be computed for many nodes in parallel by expanding the N lowest cost nodes at each iteration. We call the combination of A* search with a path-cost coefficient λ and batch sizeN ‘batch weighted A* search’ (BWAS).”
Dig into the paper for more details, but I rather like characterization presented by the Forbes article, written by Jennifer Kite-Powell, and Baldi’s response:
“According to Baldi, the solution to the Rubik’s Cube involves more symbolic, mathematical and abstract thinking and so a deep learning machine which can crack a puzzle like the Rubik’s Cube is getting closer to becoming a system that can think, reason, plan and make decisions.
“These characteristics are shared by many other problems in robotics and other domains that require some kind of planning,” added Baldi. “Imagine a robot tasked with cleaning up your kitchen: there is an astronomical number of sequences of moves, but only very few lead to a clean kitchen. And randomly moving dirty dishes around is not going to do it.”
The researchers write, “The generality of the core algorithm suggests that it may have applications beyond combinatorial puzzles, as problems with large state spaces and few goal states are not rare in planning, robotics and the natural sciences.”
For the humans among us the WCA championship is held every two years – there’s plenty of time to sharpen their cube-solving skills. The categories can be interesting. Besides the standard variations of the grid size (3x3x3, 2x2x2, 5x5x5 and more) there’s blindfolded, one-handed, with your feet, and more.
Link to Nature Machine Intelligence paper: https://www.nature.com/articles/s42256-019-0070-z
Link to Forbesarticle by Jennifer Kite-Powell: https://www.forbes.com/sites/jenniferhicks/2019/07/15/this-ai-can-solve-a-rubiks-cube-super-fast/#7c8ff28e6f5c
Link to CMU posted online exercise: http://deepcube.igb.uci.edu