Since 1986 - Covering the Fastest Computers in the World and the People Who Run Them

August 18, 2008

How the Other Half Does Online Bookselling

Derrick Harris

As shocking as it might seem, there is more than one place to go for your online book-buying needs (kind of; see below). What’s even more shocking, though (from an IT perspective, at least), is that online retailers don’t always need to build mega-infrastructures stocked with homegrown tools in order to stay in business.

Such is the case with AbeBooks.com, an online marketplace for new and used books, including rare and collector’s editions, which uses Oracle’s Coherence data grid to improve the online shopping experience and to ease backend operations. Online since 1996, AbeBooks has been a Coherence user since mid-2006, initially putting its in-memory capabilities to use on the site’s shopping cart module. According to Leith Painter, manager of development at AbeBooks, the goal was to persist critical information for customers without having to read and write from the company’s main database, thus reducing the load on that database and “improving the buyer experience on our Web site.” AbeBooks has eight sites serving the United States, Canada, the United Kingdom, France, Italy, Germany, Australia and New Zealand, and the Spanish-speaking world, respectively.
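To make the shopping cart idea concrete, here is a minimal sketch in Python of a cart held in memory, with the main database touched only at checkout. It is not AbeBooks’ code and does not use the Coherence API: the names `FakeDatabase` and `CartCache` are hypothetical, and a plain dictionary stands in for the data grid.

```python
class FakeDatabase:
    """Hypothetical stand-in for the main database; counts calls to show load."""

    def __init__(self):
        self.orders = {}
        self.calls = 0

    def save_order(self, customer_id, items):
        self.calls += 1
        self.orders[customer_id] = dict(items)


class CartCache:
    """Hypothetical in-memory cart store (a dict standing in for a data grid)."""

    def __init__(self, database):
        self._carts = {}
        self._db = database

    def add_item(self, customer_id, isbn, quantity=1):
        # Cart changes stay in memory; the database is not contacted here.
        cart = self._carts.setdefault(customer_id, {})
        cart[isbn] = cart.get(isbn, 0) + quantity

    def get_cart(self, customer_id):
        # Reads are served from memory as well.
        return dict(self._carts.get(customer_id, {}))

    def checkout(self, customer_id):
        # The only point at which the database sees the cart.
        items = self._carts.pop(customer_id, {})
        if items:
            self._db.save_order(customer_id, items)
        return items


db = FakeDatabase()
carts = CartCache(db)
carts.add_item("buyer-1", "isbn-a")
carts.add_item("buyer-1", "isbn-a")
carts.add_item("buyer-1", "isbn-b")
carts.checkout("buyer-1")
```

In this sketch, three cart updates produce a single database write. A real data grid would also spread the cart data across several machines and keep backup copies, which a single dictionary does not attempt.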

However, stopping constant calls to the database isn’t the only benefit AbeBooks has realized from its data grid implementation. Painter says the company has experienced noticeable performance gains, and he loves Coherence’s “stateless behavior.” “Our online site is [stateless,] so, basically, it doesn’t persist information, it gets load balanced through hardware,” he explained. And because Coherence is not totally reliant on the database, it doesn’t have to come down when AbeBooks rolls out new software every three weeks, which means the site stays up and running.
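The stateless arrangement Painter describes can be illustrated with a small Python sketch, under stated assumptions: `WebNode` and `shared_grid` are hypothetical names, a dictionary stands in for the data grid, and round-robin selection stands in for the hardware load balancer. Because no node keeps session data of its own, any node can serve any request, and nodes can be replaced without losing a buyer’s cart.

```python
import itertools


class WebNode:
    """Hypothetical stateless web server: all session data lives in the grid."""

    def __init__(self, name, grid):
        self.name = name
        self._grid = grid  # shared store; the node itself holds no session state

    def handle_add(self, session_id, isbn):
        self._grid.setdefault(session_id, []).append(isbn)
        return self.name


shared_grid = {}  # stand-in for the data grid
nodes = [WebNode("web-1", shared_grid), WebNode("web-2", shared_grid)]
balancer = itertools.cycle(nodes)  # round-robin stand-in for a load balancer

# Requests from one session land on different nodes.
served_by = [
    next(balancer).handle_add("session-1", isbn)
    for isbn in ("isbn-a", "isbn-b", "isbn-c")
]

# Simulated software rollout: the web nodes are replaced, the grid is not.
nodes = [WebNode("web-1-v2", shared_grid), WebNode("web-2-v2", shared_grid)]
balancer = itertools.cycle(nodes)
served_by.append(next(balancer).handle_add("session-1", "isbn-d"))
```

After the simulated rollout, the cart for `session-1` still holds all four items, even though none of the original nodes exists any longer.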

Painter says Coherence is a marked improvement over the legacy methods — cookies, big cache tables in Java memory, and more frequent database calls — and although the gains would be easily quantifiable, AbeBooks was simply concerned with the business benefits and didn’t feel the need to measure things like performance or ROI.

Cameron Purdy, Oracle’s vice president of Fusion Middleware, says AbeBooks’ story is not uncommon in the e-commerce world. “If you look at the types of features that are added over time to e-commerce applications, they generally start with very simple and obvious things (add something to a shopping cart and check out, that was easy), but then you add something to a shopping cart and it says, ‘Oh, by the way, based on some statistical analysis I’ve done, if you bought that, chances are I can also get you to buy something else,’” he explains. “Each one of these features represents more and more information that has to be accessed in real time.” Real-time access is critical in the e-commerce world, says Purdy, because “if you slow down the page going back to the consumer … you lose.”

Another trait common to customers processing this volume of transactions, which many call “extreme transaction processing,” or XTP, is that they often select Coherence because it lets them address unknown scalability concerns up front. Purdy says this was the case with AbeBooks as well, which used Coherence as “a huge insurance policy” in case it needed to scale due to either consumer demand or adoption of new functionality.

Both reasons probably are true in the case of AbeBooks, but the latter definitely is. AbeBooks’ Painter says the bookseller has “a lot of big strategies revolving around Coherence,” including sharing information between backend services like APIs and batch applications, and sharing that information with the Web site without a tight coupling to the database. “We’d like to reduce our tightly [coupled] dependency on our centralized database, so with a service-oriented architecture strategy, we can start developing backend services that can manage cache information independently without being tightly coupled to a database, and hopefully matures along the whole service-oriented architecture strategy,” says Painter.

Currently, the Coherence tier is coupled with the Web site, but AbeBooks also would like to separate the Coherence implementation into its own physical tier that will allow the company’s Coherence clusters at its two datacenters — located in Victoria, British Columbia, and Calgary — to communicate with one another. AbeBooks sometimes has to do hard outages for two to three minutes, which can result in having to bring down a cluster, as well. “We’d like [our Coherence clusters] to communicate with one another, so when we do roll in software changes, we don’t have to bring our Coherence cluster down at all and impact our users or our buyers,” says Painter.

AbeBooks’ inventory includes more than 110 million books from about 13,500 sellers, and the company processes up to 30,000 orders per day. It also handles millions of inventory updates every day (including price changes, quantities and other related information) that must be delivered to the customer front-end systems in near real time. Aside from Coherence (which AbeBooks originally adopted as a Tangosol product before that company was acquired by Oracle), AbeBooks manages this data using a collection of other Oracle products. A four-node Oracle RAC cluster spans the Victoria and Calgary datacenters, real-time replication is handled by Oracle Data Guard, and the company just went through a PCI compliance project using Oracle Streams to replicate a standalone, secure database implementation, Painter says.

About that Other Bookseller …

Earlier this month, plans for an Amazon acquisition of AbeBooks were announced. Still subject to closing conditions, the sale should be finalized around the end of the year, says Richard Davis, PR manager at AbeBooks, who also noted that part of the agreement is that AbeBooks remain a standalone company, keeping its same Web site, staff, etc. As a result, it doesn’t presently look like there will be a technological effect, but, says Davis, “it’s just too early to say how we’re going to work together.”

Although Oracle’s Purdy can’t yet comment on how the Amazon acquisition might affect AbeBooks technologically (Editor’s note: Purdy was asked about the possibility of AbeBooks running on Amazon’s internal infrastructure), he did note that Coherence is a good fit with Amazon’s EC2 service, the publicly available incarnation of its internal computing infrastructure. He doesn’t know whether customers are using the combination in production, but they have used EC2 to test applications leveraging Coherence “because that architecture is very conducive for being able to grab some hardware, roll something out for testing real quickly, and then scrap it when you’re done,” says Purdy. Oracle even has published information on how to make the two work together. (Read more about Purdy’s take on the Coherence-EC2 relationship here.)

E-Commerce and XTP

Purdy believes that data grid architectures, especially those of the in-memory variety, are ideal in situations like e-commerce where there are “large amounts of information and a large volume of access of that information.” Coherence, he says, is “extremely appropriate” for sites trying to do customized searches, real-time notification of changes and acceptance of incoming data streams, or create buffers across multiple datacenters or servers.

Speaking about AbeBooks, Purdy says, “From the outside, it doesn’t necessarily look like information is changing that fast, but on the inside, when they get a chunk of information about what a particular bookstore, for example, has available, they have to [go] back through all the other information they have and reconcile that information. So, it actually can be quite a data-intensive operation, even if, from the consumer point of view, it’s a relatively stable marketplace.”
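The reconciliation step Purdy mentions can be pictured with a short Python sketch. This is an illustration only, not a description of how AbeBooks does it: the function `reconcile`, the listing format (a price and quantity per book identifier) and the sample data are all assumptions made for the example.

```python
def reconcile(current, snapshot):
    """Compare a seller's stored listings with a fresh snapshot of its stock.

    Both arguments map a book identifier to a (price, quantity) tuple.
    Returns the listings to add, the listings to update, and the ids to remove.
    """
    added = {k: v for k, v in snapshot.items() if k not in current}
    changed = {
        k: v for k, v in snapshot.items() if k in current and current[k] != v
    }
    removed = [k for k in current if k not in snapshot]
    return added, changed, removed


# Hypothetical data: what the marketplace holds for one seller, and a new feed.
current = {"isbn-a": (12.50, 1), "isbn-b": (40.00, 2)}
snapshot = {"isbn-a": (10.00, 1), "isbn-c": (8.00, 3)}

added, changed, removed = reconcile(current, snapshot)
```

Here one listing is new, one has a changed price and one has disappeared. Repeated across thousands of sellers and millions of daily updates, comparisons of this kind show why the workload is data intensive even when the storefront looks stable.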