SCIENCE AND ENGINEERING NEWS
Berkeley, CA — For those who worry that California’s rolling blackouts will wipe out their bank statements and other computerized records, a computer scientist at the University of California, Berkeley, is designing a solution: A data storage system so vast and powerful it will encompass the entire Earth.
OceanStore is a data storage system tough enough to withstand a fire, a hacker attack or even a botched electricity deregulation attempt. By chopping data into encrypted pieces and storing them on computers scattered throughout the Internet, OceanStore expands storage capacity and makes data disaster-proof and available any time, anywhere.
“The goal is to make data storage not only secure and available, but downright impervious to disaster,” said OceanStore’s inventor, John Kubiatowicz. The professor is part of UC Berkeley’s proposed Center for Information Technology Research in the Interest of Society (CITRIS), a joint program with UC Davis and UC Santa Cruz that will create innovations to improve people’s lives.
Why store one chunk of data in Chicago and another in Taipei? Stowing encrypted data throughout the Internet protects it from regional events like power shortages, credit card number theft and crippling “denial of service” attacks where Internet servers are overwhelmed by bogus requests.
Robustness isn’t all OceanStore has to offer. With more people creating Web pages and snapping digital photographs, an Earth-sized storage facility will be needed. A recent study by UC Berkeley researchers found that the world’s total yearly production of print, film, optical and magnetic content requires roughly 1.5 billion gigabytes of storage, the equivalent of the textual content of billions of books.
OceanStore will be easy to access from almost anywhere, a crucial feature as computing becomes increasingly mobile. Instead of carrying around a heavy hard drive inside a laptop computer, one can simply download data via wireless modem.
OceanStore also will make possible so-called ubiquitous computing, where every object has a computer in it, from athletic shoes to toasters. Pervasive information flow is a goal of UC Berkeley’s Endeavor project, an initiative to enhance the interaction between humans and technology. OceanStore will reduce the memory requirements of computerized devices and provide a much-needed backup if the device fails. “If you store your data in a ballpoint pen and then you lose the pen, that is a disaster,” said Kubiatowicz.
OceanStore is a key step in UC Berkeley’s initiative to create Smart Dust, miniature sensors the size of dust motes that can be used to monitor bridges for seismic stability, or to sense and respond to the heating needs of a building’s occupants. These tiny sensors will generate huge volumes of data that only a widespread system like OceanStore could handle. Both OceanStore and Smart Dust sensors are components of CITRIS, UC Berkeley’s proposed center.
Perhaps the most important and distinctive feature of OceanStore is data security. To ensure a document cannot be read by anyone but its owner, OceanStore makes several copies of the data, encodes them using a special coding mechanism, and then chops each one into fragments of varying sizes.
To keep track of these billions of bits of information, OceanStore generates for each document a permanent, globally unique identification (G.U.I.D.) tag. The document is then split into fragments, which are sent out via the Internet to OceanStore servers. OceanStore also creates a map showing the possible paths between the interconnected Web servers.
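The tagging and splitting steps can be pictured with a short Python sketch. It is an illustration only, not OceanStore's code: the article does not say how a G.U.I.D. is computed, so the SHA-256 hash of an owner key and document name used here is an assumption, as are the function names, the fixed fragment size and the round-robin placement. Encryption is left out entirely.

```python
import hashlib


def make_guid(owner_key, name):
    """Derive a fixed-length identifier from an owner key and a document name.

    Assumption: a SHA-256 hash stands in for whatever OceanStore really uses.
    """
    return hashlib.sha256(owner_key + b"/" + name.encode("utf-8")).hexdigest()


def split_into_fragments(data, fragment_size):
    """Cut the document into consecutive pieces of at most fragment_size bytes."""
    if fragment_size <= 0:
        raise ValueError("fragment_size must be positive")
    return [data[i:i + fragment_size] for i in range(0, len(data), fragment_size)]


def place_fragments(guid, fragments, servers):
    """Hand fragments to servers in round-robin order.

    Returns a dict mapping each server name to a list of
    (guid, fragment_index, fragment_bytes) tuples.
    """
    if not servers:
        raise ValueError("at least one server is required")
    placement = {server: [] for server in servers}
    for index, fragment in enumerate(fragments):
        server = servers[index % len(servers)]
        placement[server].append((guid, index, fragment))
    return placement
```

Because every fragment carries the document's tag and its position, a reader holding all the pieces can sort them by index and join them back into the original.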
To retrieve a chopped-up 1989 tax return, for example, OceanStore sends a team of messengers onto the Internet looking for its G.U.I.D. As the messengers search, they leave behind trails of digital breadcrumbs so that, the next time, they can find the data more quickly.
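The breadcrumb idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not OceanStore's routing scheme: the Network class, its breadth-first search and the server names are all hypothetical. The first search fans out across neighbouring servers; once the data is found, each server on the way back remembers which neighbour leads toward it.

```python
from collections import deque


class Network:
    """Toy model of servers, their links, and breadcrumbs left by searches."""

    def __init__(self, links, holdings):
        self.links = links          # server -> list of neighbouring servers
        self.holdings = holdings    # server -> set of GUIDs stored there
        self.breadcrumbs = {}       # (server, guid) -> next server toward the data

    def _holds(self, server, guid):
        return guid in self.holdings.get(server, set())

    def lookup(self, start, guid):
        """Return (server holding the GUID or None, number of servers examined)."""
        # First try to follow a trail left by an earlier search.
        node, seen = start, {start}
        while not self._holds(node, guid) and (node, guid) in self.breadcrumbs:
            node = self.breadcrumbs[(node, guid)]
            if node in seen:        # guard against a trail that loops
                break
            seen.add(node)
        if self._holds(node, guid):
            return node, len(seen)

        # No usable trail: search outward from the start, nearest servers first.
        parent = {start: None}
        queue = deque([start])
        examined = 0
        while queue:
            node = queue.popleft()
            examined += 1
            if self._holds(node, guid):
                holder = node
                # Walk back to the start, leaving a pointer at each hop.
                while parent[node] is not None:
                    self.breadcrumbs[(parent[node], guid)] = node
                    node = parent[node]
                return holder, examined
            for neighbour in self.links.get(node, []):
                if neighbour not in parent:
                    parent[neighbour] = node
                    queue.append(neighbour)
        return None, examined
```

In this model a repeat request from the same starting server touches only the servers along the remembered trail instead of every server in between.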
But downloading data from, say, China, could take a lot longer than getting it from a desktop hard drive. OceanStore gets around this by storing often-used documents on nearby servers. OceanStore also can analyze patterns of usage so that the system can fetch the data more quickly. Its software acts like a handyman, constantly checking data segments and making minor adjustments and repairs.
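Keeping often-used documents close at hand is a form of caching, and a minimal Python sketch shows the principle. The article does not describe how OceanStore decides what to keep nearby, so the least-recently-used rule below is a swapped-in stand-in, and the NearbyCache class and fetch_remote callback are hypothetical names.

```python
from collections import OrderedDict


class NearbyCache:
    """Keeps the most recently requested documents on a (pretend) nearby server."""

    def __init__(self, capacity, fetch_remote):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.fetch_remote = fetch_remote   # slow call to a distant server
        self.entries = OrderedDict()       # guid -> data, oldest use first
        self.remote_fetches = 0

    def get(self, guid):
        if guid in self.entries:
            # Nearby copy exists: mark it as freshly used and return it.
            self.entries.move_to_end(guid)
            return self.entries[guid]
        # Otherwise make the long trip, then keep a copy close by.
        data = self.fetch_remote(guid)
        self.remote_fetches += 1
        self.entries[guid] = data
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)   # drop the least recently used
        return data
```

A document requested again and again stays in the nearby store, while one that has not been touched for a while gives up its place.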
In the OceanStore world, no server can be trusted. Instead, the data is distributed among multiple clusters of servers in a redundant way so that, if one goes down, the data can be reassembled using only one-fourth of the original fragments. “This storage mechanism is very much like a hologram, where you only need a certain subset of the data to recreate the entire image,” said Kubiatowicz.
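The hologram-like property, where any sufficiently large subset of fragments rebuilds the whole, is what an erasure code provides. The Python sketch below shows one textbook way to get it, a Reed-Solomon-style polynomial code over the integers modulo 257. The article does not name the code OceanStore uses, so this technique, the function names and the parameters are assumptions. Choosing k = 4 and n = 16 matches the one-fourth figure quoted above: any 4 of the 16 fragments suffice. Requires Python 3.8 or later.

```python
P = 257  # smallest prime above 255, so every byte value is a field element


def _interpolate(points, x):
    """Evaluate at x the unique polynomial through the given points, modulo P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data, k, n):
    """Turn data into n fragments, any k of which can rebuild it.

    Returns a dict: fragment number -> list of symbols (ints from 0 to 256),
    one symbol per k-byte block of the zero-padded data.
    """
    if not 0 < k <= n <= P:
        raise ValueError("need 0 < k <= n <= 257")
    padded = data + bytes(-len(data) % k)
    fragments = {x: [] for x in range(n)}
    for start in range(0, len(padded), k):
        # The k bytes of a block are the polynomial's values at x = 0..k-1.
        block = list(enumerate(padded[start:start + k]))
        for x in range(n):
            fragments[x].append(_interpolate(block, x))
    return fragments


def decode(surviving, k, length):
    """Rebuild the first `length` bytes from any k surviving fragments."""
    if len(surviving) < k:
        raise ValueError("not enough fragments to rebuild the data")
    xs = sorted(surviving)[:k]
    out = bytearray()
    for b in range(len(surviving[xs[0]])):
        points = [(x, surviving[x][b]) for x in xs]
        out.extend(_interpolate(points, i) for i in range(k))
    return bytes(out[:length])
```

Calling encode(document, 4, 16) and then passing any four of the resulting fragments to decode returns the original bytes; with only three, decode refuses rather than guessing.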
To manage all these bits of data floating around the Internet, Kubiatowicz envisions a system not unlike our current telephone company model. Each user would pay a monthly data storage fee to a data service provider, such as an Internet service provider, or I.S.P. The I.S.P. then arranges to store the user’s data on another I.S.P.’s Internet server for a small fee. The I.S.P.s can then swap data amongst themselves, trading fees for using each other’s infrastructure and file servers.
With demand for storage space growing and the reliability of local electricity in question, technology companies are rushing to back OceanStore. They include industry giants like I.B.M. Corp., Nortel Networks Corp. and the data storage company EMC Corp. as well as federal agencies like the National Science Foundation and the Defense Advanced Research Projects Agency.