How many queens can you put on a regulation chess board without any of them being in a position to attack another? This question is the foundation of the N-Queens problem, where you ask that question of larger and larger square boards. At a certain size, it becomes a logic problem impossible for an unaided human mind to solve, making it a perfect test case for parallel computing, the hallmark of HPC today.
Computing rookies Ruan Pethiyagoda, Cameron Boehmer, John S. Dvorak, and Tim Sze, all of whom were trained at San Francisco’s Hack Reactor, an institute designed for intense, fast paced learning of programming, put together a program based on the N-Queens algorithm designed by the University of Cambridge’s Martin Richards, and modified it to run in parallel across multiple machines.
“We were able to scale it across every device in the building, including everyone’s laptop, iPhone, Android phone. Even my BlackBerry ran it, which surprised me,” Pethiyagoda said of their project, which they called Smidge.
They then got in touch with the Pivotal Initiative, a big data startup run by EMC and VMware, and managed to run the N-Queens algorithm on a 1000-node Hadoop cluster. Last week, they solved the 27-by-27 version of the problem, setting the world record.
While the N-Queens isn’t exactly one of the pressing scientific issues to be answered through cloud or cluster computing, it still represents an intriguing computational challenge, where the permutations increase exponentially by simply increasing the grid size by one.
Here, the chess board they solved was 27 squares wide and deep. That comes out to 729 squares total, each of which either can or cannot have a queen or not. This means the total possible amount of board configurations comes out to 2 to the power of 729.
Of course, by placing just one queen, one precludes 78 squares from being occupied (26 each from the horizontals, verticals, and diagonals). That both significantly reduces the permutations and adds an intricate and complex layer to the total computation. It is from that point the parallelism that could be replicated in a cloud-based system takes over.
Aside from producing a potentially useful method to the cloud computing paradigm, this news could have an arguably bigger impact on cloud HPC that reaches beyond the pure technical realm. Big data today is experiencing a problem in training and recruiting talent. That comes down to big data being a relatively new phenomenon with the top minds and institutions in the industry still not yet fully understanding the optimal way to survive and utilize the data deluge.
HPC is not new, but trying to set up and run HPC applications in a virtualized setting is from a relative perspective. That a group relatively new to the subject were able to import a parallel problem into a virtualized system is impressive.
Related Articles
IBM’s Guide to Cloud Based HPC
OpenStack and the SDSC Research Cloud
Examining Questions of Virtualization and Security in the Cloud