Reading List: Fault Tolerance Techniques for HPC

August 6, 2015

Among the chief challenges of deploying useful exascale machines, resilience looms large. Today's error rates combined with tomorrow's node counts cannot susta Read more…

By Tiffany Trader

Toward a Fault-Tolerant Cloud

June 23, 2011

With the proliferation of public cloud infrastructures, our dependence on them has increased. Many vital services in research, industry, and even everyday life have moved wholesale onto the cloud. So what happens when the cloud services we depend on go down? Dr. Jose Luis Vazquez-Poletti shares some key ideas on how the scientific community can help answer this question. Read more…

By Jose Luis Vazquez-Poletti

Looking to Fault-Tolerant Software

November 9, 2010

Achieving workable software-based fault tolerance will require a fresh approach for developers. Read more…

By Tiffany Trader

The Other Exascale Challenge

June 10, 2010

Supercomputing apps may have to ditch the checkpoint-restart model. Read more…

By Michael Feldman

Embrace Failure!

April 22, 2009

Can smart checkpoints and fault-resilient applications avert a Malthusian Catastrophe? Read more…

By Elizabeth Leake, TeraGrid, and Anne Heavey, iSGTW





