New York, N.Y. — Tiernan Ray reports that when the great book of technology is closed on the chapter about the World Wide Web, among the blather about “new paradigms” there will be one transformation that is truly profound. The Web is a massive database, of sorts, and it has changed the way we all use information.
The Web is all about making connections, some mundane, others fascinating. If you surfed to this page in your Web browser from the “front” of SmartMoney.com, you followed a connection that’s perhaps no more interesting than flipping the pages of a magazine. Last week we glimpsed a more intriguing set of connections. On the Web site of the Human Genome Project, the not-for-profit, internationally sponsored version of the massive DNA-sequencing effort, you can actually search through a catalog of human genes and see the different patterns in the bits that make up life.
The Web-as-database has been very good for the world of commercial database products. Dataquest estimates that sales of database software jumped 18% last year to $8 billion. The Web has been especially good for Oracle, now the No. 1 vendor of databases for the Internet. Oracle has come to dominate databases that run on large Unix-server computers – just the kind of computers used for running Web sites, in other words. The company owns 30% of the market for what are known as relational databases, having reduced rivals Sybase and Informix to mere bit players in that business.
But I think things will soon change dramatically for the industry in general and for Oracle in particular, if the company isn’t careful. A new crop of databases on the Internet threatens to reduce Oracle’s core database program to commodity status while integrating more sophisticated and valuable data analysis than Oracle or any other pure database program offers right now. I’m talking about a new breed of database products that leverage the connective tissue of the Internet to combine traditional database functionality – the storage and retrieval of discrete types of information like bank account numbers and names – with powerful data-analysis and data-mining techniques.
Data mining and data analysis, pioneered by such companies as Red Brick, Arbor and MicroStrategy, have been around for some time. Essentially, they are a series of computer science techniques that allow you to search for hidden and potentially revealing patterns in data that could become the basis for profitable business decisions. But the Internet puts an interesting spin on data mining. Instead of searching for patterns within a narrow company-specific database, the new generation of data engines is using the Web itself to search for relevant and meaningful connections. (Red Brick is now part of Informix and Arbor is part of Hyperion Solutions.)
My favorite practitioner of the art right now is a privately held search engine company called Google. Google sells its search engine capabilities for a fee to major Web sites, most recently scoring a major coup by replacing rival Inktomi as the search engine powering Yahoo!. Google has come up with very good results on many Web searches that disappointed in the past. The main trick it uses is peer review: if you search on the word “Intel” at Google’s site, you get a list of links that has the home page of Intel at the top – exactly what most sane users of the Web would expect. That’s because Google decides to show you the Web page that most other pages link to for the keyword “intel,” namely http://www.intel.com .
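The “peer review” idea can be illustrated with a deliberately simplified sketch: count how many distinct pages link to each target with the keyword in the link text, then list the most-linked target first. This is only an illustration of the principle described above, not Google’s actual ranking method. The function name, the data layout and every URL other than http://www.intel.com are hypothetical.

```python
from collections import Counter

# Hypothetical link data: (linking page, link text, page linked to).
LINKS = [
    ("http://a.example/chips", "Intel", "http://www.intel.com"),
    ("http://b.example/news", "Intel home page", "http://www.intel.com"),
    ("http://c.example/blog", "intel rumors", "http://rumors.example/intel"),
    ("http://a.example/chips", "Intel", "http://www.intel.com"),  # repeated link
    ("http://d.example/pc", "buy a PC", "http://shop.example"),
]


def rank_by_inbound_links(links, keyword):
    """Order target pages by how many distinct pages link to them
    using the keyword in the link text (most-linked first)."""
    keyword = keyword.lower()
    votes = Counter()
    seen = set()
    for source, link_text, target in links:
        if keyword not in link_text.lower():
            continue  # this link says nothing about the keyword
        if (source, target) in seen:
            continue  # each linking page gets one vote per target
        seen.add((source, target))
        votes[target] += 1
    # Sort by descending vote count, then by URL for a stable order.
    return sorted(votes, key=lambda url: (-votes[url], url))


if __name__ == "__main__":
    for url in rank_by_inbound_links(LINKS, "intel"):
        print(url)
```

With this sample data, the Intel home page comes first because two different pages point to it for the keyword, against one for the other candidate.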
Google performs other kinds of analysis. It collects extensive logs about how users interact with the Google.com site. The company crunches this data to deduce patterns or connections, good and bad, in the result sets in order to perfect the program. For example, if a search for Mickey Mantle turns up only results on eBay for baseball cards, Google will try to figure out why the findings are lopsided. The result is that Google improves over time the kinds of connections it can draw: search on Intel and you will not only get the home page, but also some relevant press clippings, a connection that Google figures out by looking over the text contained in press releases.
Other examples are emerging. Quiq of San Mateo, Calif., is building databases for Web sites as a fee-based service. An early example can be seen at Ask Jeeves.com, the search engine. If Ask Jeeves can’t find your answer, you can post the question on another page, called Answer Point, and the Quiq technology will slice and dice all questions, as well as answers posted by other surfers, matching up different strains of thought that share keywords, or that have very high page counts and therefore seem popular. Quiq is a bit like Internet newsgroups, where people go for advice, with the difference that a database makes it possible to establish connections and patterns between separate conversations that the interlocutors might not even have been aware of.
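As a rough illustration of matching posts that share keywords, the sketch below pairs up questions and answers whose keyword sets overlap. It is a toy under stated assumptions, not Quiq’s technology: the stopword list, the overlap threshold, the function names and the sample posts are all hypothetical, and the popularity signal mentioned above is not modelled.

```python
import re
from itertools import combinations

# Hypothetical list of words too common to count as keywords.
STOPWORDS = {"how", "for", "the", "which", "are", "and", "before"}

# Hypothetical posts: identifier -> text.
POSTS = {
    "q1": "How do I tune an Oracle database for a busy Web site?",
    "q2": "Which baseball cards of Mickey Mantle are worth collecting?",
    "a1": "For a busy Web site, cache hot queries before touching the Oracle database.",
}


def keywords(text):
    """Lowercase words longer than two characters that are not stopwords."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w for w in words if len(w) > 2 and w not in STOPWORDS}


def related_pairs(posts, min_shared=2):
    """Return (id_a, id_b, shared_keywords) for every pair of posts that
    share at least min_shared keywords, largest overlap first."""
    kw = {post_id: keywords(text) for post_id, text in posts.items()}
    pairs = []
    for a, b in combinations(sorted(kw), 2):
        shared = kw[a] & kw[b]
        if len(shared) >= min_shared:
            pairs.append((a, b, sorted(shared)))
    pairs.sort(key=lambda p: (-len(p[2]), p[0], p[1]))
    return pairs


if __name__ == "__main__":
    for a, b, shared in related_pairs(POSTS):
        print(a, b, shared)
```

Here the answer about caching is linked to the Oracle tuning question through their shared keywords, while the baseball-card question stays unconnected.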
Quiq and Google have some important things in common. For one, they are developing database technology that goes beyond what Oracle and others sell. Google actually built its own database from scratch, and it is a wholly different type of software, called a “flat file” database, according to Craig Silverstein, Google’s director of technology. Raghu Ramakrishnan, chairman and chief technology officer of Quiq, and a professor of database technology at the University of Wisconsin-Madison, says that while Quiq’s program uses an Oracle database, the company has applied for about four or five different patents on specific technology enhancements it has made.
I think Quiq and Google are both models for the future of the database market. The basic task of capturing and storing discrete pieces of information is a done deal. Oracle has largely won; now it’s time for more sophisticated operations. I expect that the new database business will be built along the lines of a services business, like the kind Quiq and Google are running. Both companies enjoy leverage that would be impossible to achieve in the kinds of data analysis that Red Brick and Arbor advertised. There is leverage across talent and technology. Google’s founders, Larry Page and Sergey Brin, two graduate students in database technology from Stanford University, head a team of human editors that review both search results and the kinds of questions that get answered. Their work can be broadly applied because, unlike the kind of precise analysis that Red Brick and Arbor encouraged, Quiq and Google are looking for fuzzy answers and general improvements in their respective approaches to their clients’ needs. It’s still early for both, however: Quiq has announced only AskJeeves, but expects to make more client announcements soon. Google is selling its search services to only a handful of customers besides Yahoo!.
Certainly, traditional relational databases like the kind Oracle sells will continue to be a strong market as long as transaction volume grows on the Web. But as with the corporate market a few years back, Oracle’s Web sales will become a mature market at some point, and I believe growth will then come from the new services like Quiq that are helping to make new connections, rather than simply storing data.
Oracle has always tried to keep ahead of slowing sales by moving into other lines of business. Last year Oracle’s Chief Executive Larry Ellison bought the data mining software of Thinking Machines, a supercomputing pioneer. And Oracle rolled out technology for data analysis. Lately, you’re seeing Oracle integrate technology into its database that has nothing to do with traditional relational-database functions, such as the ability to cache, or store a copy of data for faster access. The just-announced Internet Application Server, or iAS, allows companies to build code to run a Web site and then run that code out of the database. In the orthodox world of transaction-oriented relational databases, that kind of mixing of applications and data has been considered a cardinal sin. Larry is breaking the rules, in other words.
And maybe with good reason. As Internet databases, not corporate databases, become the hot market, more and more of the underlying technology, the database engine, becomes a commodity. When companies like Quiq are building so much new code, it almost doesn’t matter whose database it runs on. It may even be time to take another look at Informix and Sybase. If the brand of software underlying each Web site matters less, there may be a chance for these also-rans to compete in earnest, if they adjust their business models to reflect the commodity aspect of the business. With Informix trading at around $4.50 these days, and Sybase at about $22 a share, the multiples for these stocks are quite attractive: 5.5 and 23, respectively. It’s not often these days you see a stock with any kind of franchise trading at a price-to-earnings growth rate multiple at or below 1. They’re both companies in need of mending, but at least as acquisition candidates, they’re worth a second look.
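The valuation argument above can be worked through with a little arithmetic. The sketch below assumes the quoted multiples (5.5 and 23) are price-to-earnings ratios and that the growth-adjusted multiple is the P/E divided by the annual earnings growth rate in percent. The article does not give growth rates, so the code only derives what the quoted figures imply: the earnings per share behind each price, and the growth rate at which the growth-adjusted multiple would equal 1. Function and variable names are hypothetical.

```python
# Share price and multiple as quoted in the article.
# Assumption: each multiple is a price-to-earnings (P/E) ratio.
QUOTES = {
    "Informix": (4.50, 5.5),
    "Sybase": (22.00, 23.0),
}


def implied_eps(price, pe):
    """Earnings per share implied by a share price and a P/E ratio."""
    return price / pe


def peg(pe, growth_pct):
    """P/E divided by the annual earnings growth rate, in percent."""
    return pe / growth_pct


def growth_for_target_peg(pe, target_peg=1.0):
    """Growth rate (percent) at which the PEG equals target_peg."""
    return pe / target_peg


if __name__ == "__main__":
    for name, (price, pe) in QUOTES.items():
        eps = implied_eps(price, pe)
        growth = growth_for_target_peg(pe)
        print(f"{name}: implied EPS ${eps:.2f}, "
              f"PEG of 1 needs {growth:.1f}% earnings growth")
```

Under these assumptions, a multiple at or below 1 means earnings would have to grow at least as fast, in percent per year, as the P/E ratio itself.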
The views in this article are those of its author and not necessarily those of the publisher or staff of HPCwire.