HPCwire

The Leading Source for Global News and Information Covering the Ecosystem of High Productivity Computing


Wolfram Alpha: A Web-Based Application That Embraced Supercomputers


It was less than two months ago that Wolfram Alpha launched and introduced the idea of a Web site for universal computation. Wolfram Alpha is based on Wolfram Research's Mathematica, but uses it to drive a general-purpose computational engine that can be applied across more than a thousand knowledge domains.

The launch created quite a bit of fanfare in the media since Wolfram Alpha was seen (incorrectly) as a rival to search engines like Google and Yahoo. The new application also encapsulated the notion of the Semantic Web, which many envision as "the next big thing." Add to that the fact that the Web site was built on top of supercomputers and you have all the ingredients for a juicy high-tech story for the masses.

The supercomputer infrastructure was one of the least talked about aspects of the project, but to our publication, one of the most interesting. Schoeller Porter, who now does business development for Wolfram Alpha, wrote about the pre-launch of the Web site in a recent blog post, and how the project outgrew the initial infrastructure plan even before it booted up.

According to Porter, whom I spoke with shortly after he wrote the blog entry, the project's initial plan, devised in February, was to roll out Wolfram Alpha on a much smaller scale. The idea was that they would make a discreet announcement in the Mathematica community, and users would trickle in. They were anticipating early traffic would be around 200 queries per second. For that kind of computing load, they would be able to get by with a few datacenters populated by modest-sized Web-style clusters -- "normal servers you can buy off the shelf from anywhere," said Porter.

Then in early March, Stephen Wolfram wrote a blog post announcing Wolfram Alpha, and they started getting a lot more inquiries about it. "It clearly hit a nerve in the Semantic Web community," explained Porter. From that point on, they noticed that every time Wolfram gave a speech on the subject, it got more and more press coverage. They soon realized their backroom project was going to get a great deal more attention than they had originally thought. Now they were anticipating that the initial launch would attract something in the neighborhood of 2,000 queries per second -- ten times the original estimate.

As a result, they were forced to scale out the Wolfram Alpha infrastructure. (And thanks to the deep pockets of Wolfram Research, they could do so.) But the time scale was compressed. It was already March and they were looking to launch the site in May. They determined the only way to ramp up the capacity so quickly was to deploy large ready-made clusters, i.e., HPC machines. That's basically why the 576-node cluster from R Systems (R Smarr) and a slightly smaller Dell HPC cluster were added. The other three datacenters consist of much smaller cluster systems using vanilla servers.

According to Porter, strictly speaking they don't depend upon supercomputers for the Wolfram Alpha application. The queries are being handled in parallel, but a tightly-coupled system is not required for that. There's no MPI programming involved. Since Mathematica is the computational engine, the calculations themselves are single threaded, even presumably for operations like matrix multiplication. Aggregating all the queries is where the parallelism comes in, just like any typical Web application.
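The pattern Porter describes, many independent queries spread across workers with no communication between them, can be sketched in a few lines of Python. This is only an illustration, not Wolfram Alpha's code: the `evaluate` and `serve` functions and the digit-counting workload are hypothetical stand-ins, and local processes stand in for cluster nodes.

```python
import concurrent.futures as cf
import math


def evaluate(query: int) -> int:
    """One hypothetical 'query': a self-contained, single-threaded calculation.

    Here it counts the decimal digits of query! as a stand-in workload.
    """
    return len(str(math.factorial(query)))


def serve(queries, executor):
    """Hand each query to the executor; return results in input order.

    Queries never exchange data, so no message passing (such as MPI) is needed.
    """
    return list(executor.map(evaluate, queries))


if __name__ == "__main__":
    # Assumption: four local processes play the role of separate compute nodes.
    with cf.ProcessPoolExecutor(max_workers=4) as pool:
        print(serve([10, 100, 1000], pool))
```

Because no query depends on another's result, adding workers raises throughput without any change to `evaluate`, which is why a loosely coupled collection of servers is sufficient for this kind of load.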

However, since Wolfram Alpha is all about computation, the extra CPU horsepower and memory performance of HPC servers do not go to waste. Traditional search applications are pretty easy on the CPU, since basically they're just scanning through an index of Web pages. Wolfram Alpha, on the other hand, is doing heavy-duty math, so there is a much greater use of floating point and high-precision fixed-point arithmetic. And all the calculations are being done in real time. "Every time you go to the Web site and provide an input, the result you get back is generated on the fly," explained Porter.

As you might suspect, computational capacity per query is not unlimited. The software automatically times out if a calculation is hogging the CPU. Thus, for example, the Haferman carpet fractal can be run with an iteration of six, but it quits if the iteration is seven or greater. Similarly, if you try to compute the factorial of 250,000 or greater -- no dice.
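One simple way to enforce a per-query limit of this kind is to have the calculation check a deadline as it runs. The Python sketch below shows that idea under stated assumptions: the article does not say how Wolfram Alpha implements its timeout, and the `QueryTimeout` exception, the `factorial_with_budget` function, and the budget values are all hypothetical.

```python
import time


class QueryTimeout(Exception):
    """Raised when a calculation exceeds its time budget (hypothetical name)."""


def factorial_with_budget(n: int, budget_s: float) -> int:
    """Compute n! iteratively, giving up once budget_s seconds have elapsed."""
    deadline = time.monotonic() + budget_s
    result = 1
    for k in range(2, n + 1):
        # Look at the clock only every 256 steps to keep the overhead small.
        if k % 256 == 0 and time.monotonic() >= deadline:
            raise QueryTimeout(f"{n}! abandoned at step {k} (budget {budget_s} s)")
        result *= k
    return result


if __name__ == "__main__":
    # A small input never reaches a clock check and simply returns.
    print(factorial_with_budget(20, budget_s=1.0))
    try:
        # Assumption: a naive loop cannot finish an input this large in 50 ms.
        factorial_with_budget(250_000, budget_s=0.05)
    except QueryTimeout as exc:
        print("timed out:", exc)
```

A cooperative check like this only works when the calculation is written to poll the clock. A production service would more likely enforce the limit from outside the computation, for example by terminating a worker process that overruns.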

But the Web site's biggest stress test is probably ahead of it. The May launch of Wolfram Alpha came just as many universities and high schools were shutting down for the year. Since Wolfram Alpha is ideally suited for students and teachers, especially for math and science course work, it wouldn't be surprising to see a significant uptick in Web site traffic when schools come back into session at the end of August and beginning of September.

According to Porter, they expect to expand the infrastructure within the year beyond the 10,000 or so CPU cores they now have deployed. "I expect as we grow, we'll grow at this supercomputer-sized scale," he said. The project team is also reevaluating the infrastructure design to determine if they can improve the system as it scales out. In particular, they're looking at increasing the number of connections from the databases to the compute nodes to maximize throughput.

Since Mathematica currently runs only on commodity processors, for the time being Wolfram Alpha infrastructure will be based on x86 servers. However, Wolfram Research is investigating GPUs and other types of computational accelerators, and as support for those technologies is integrated into Mathematica, they will migrate into Wolfram Alpha as well. "But the fundamental limitation isn't the technology itself," explained Porter. "It's how do we enable ordinary folks to be able to take advantage of that technology. I think in some ways Wolfram Alpha is the model to accomplish that."

