Since 1986 - Covering the Fastest Computers in the World and the People Who Run Them

December 9, 2013

Reality Check on Liquid Cooling in the Data Center

Evaluating liquid cooling for a data center requires an understanding of both the technical approach and the ability of the solution to address the data center's business and operational concerns.

To be practical, a solution must not only be technically sound but must also reduce operational costs (OpEx) and/or capital expenditures (CapEx), and it must satisfy other requirements such as serviceability, monitoring, redundancy, failure isolation and even server warranty coverage.

Approaches to Liquid Cooling

Nearly every data center is liquid cooled today. Most data centers bring liquid into Computer Room Air Handler (CRAH) or Computer Room Air Conditioning (CRAC) units to cool the air in the data center. CRAH units bring in chilled water to cool the air; in CRAC units, the refrigerant arrives as a liquid. It is not a question of "should we liquid cool?" It's a question of how to liquid cool most efficiently: how to get server heat into liquid to reduce energy costs, mitigate expansion costs and enable increased density.

There are three significant drawbacks with CRAC and CRAH units. The first is that, to produce air cold enough to cool servers, the liquid coolant coming to the data center (facilities liquid) must be refrigerated to a temperature colder than ambient (outdoor) air. Chilling is expensive. The second is that CRAC and CRAH units produce cold air at the periphery of the data center, and considerable effort is needed to move it to the racks. The third is that considerable effort is then needed to move that air through the servers via server fans.

Rear door, in-row and over-row liquid coolers focus on reducing the cost of moving air about the data center by placing the air-cooling unit closer to the servers. For example, Rear Door Coolers replace the rear doors on the rack with a liquid cooled heat exchanger that transfers server heat into liquid as hot air leaves the servers. The servers are still air-cooled, and facilities liquid must be brought into the computer room at the same temperature as is needed for CRAH units, <65°F, and that liquid exits at <80°F. While air handling is simplified, expensive chillers are still required and server fans still consume the same amount of energy.

“Direct Touch” cooling replaces air heat sinks with ‘Heat Risers’, which transfer heat to the skin of the server chassis, where cold plates between servers transfer the heat to refrigerant so it can be removed from the building. This eliminates fans in the server and the need to move air around the data center for server cooling. However, facilities are still needed to cool the refrigerant to <61°F, and the cold plates between the servers reduce the capacity of a 42U rack to ~35 RUs.
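For readers who work in metric units, the coolant temperature limits quoted for rear door coolers and Direct Touch cooling can be converted with a few lines of Python. This is only a convenience sketch; the dictionary labels are hypothetical names chosen for illustration, and the values are the Fahrenheit figures quoted above.

```python
def fahrenheit_to_celsius(temp_f: float) -> float:
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


# Coolant temperature limits quoted in the text, in degrees Fahrenheit.
# The labels are illustrative names, not vendor terminology.
QUOTED_LIMITS_F = {
    "rear door cooler, facilities liquid in": 65.0,
    "rear door cooler, facilities liquid out": 80.0,
    "direct touch, refrigerant": 61.0,
}

for label, temp_f in QUOTED_LIMITS_F.items():
    temp_c = fahrenheit_to_celsius(temp_f)
    print(f"{label}: <{temp_f:.0f}°F is <{temp_c:.1f}°C")
```

The three limits work out to roughly 18.3°C, 26.7°C and 16.1°C, all at or below typical indoor temperatures, which is why chilling remains necessary for these approaches.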

Immersion cooling uses an all-liquid path to remove server heat by placing servers in tanks of dielectric fluid or filling custom servers with dielectric fluid. Key concerns with this technology are the maintenance of servers, large quantities of oil-based coolant in the data center, modification of servers with non-standard parts and poor space utilization, as in effect the “racks” are lying on their backs.

The most significant advancement in practical liquid cooling can be seen in Asetek’s RackCDU™ Direct-to-Chip (D2C™) hot water cooling system.

D2C hot water liquid cooling brings cooling liquid directly to the components that generate the most heat within a server and cools the remaining components with air. This solution removes 60% to 80% of the heat generated by servers with an all-liquid path. Pump energy replaces fan energy in the data center and server, and hot water eliminates the need for chilling the coolant. The air-cooled side of the solution is also more efficient, as lower volumes of warmer air are sufficient to cool the remaining components. D2C liquid cooling dramatically reduces chiller use, CRAH fan energy and server fan energy, and delivers energy savings of up to 80% and server rack density increases of 2.5x to 5x compared to air-cooled data centers.
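To make the 60% to 80% figure concrete, the short sketch below splits a rack's heat load between the liquid path and the air path. The 30 kW rack power is an assumed value used only for illustration, and `split_heat_load` is a hypothetical helper, not part of any Asetek tool.

```python
from typing import Tuple


def split_heat_load(rack_kw: float, liquid_fraction: float) -> Tuple[float, float]:
    """Return (liquid_kw, air_kw) for a rack, given the share of heat captured by liquid."""
    if not 0.0 <= liquid_fraction <= 1.0:
        raise ValueError("liquid_fraction must be between 0 and 1")
    liquid_kw = rack_kw * liquid_fraction
    air_kw = rack_kw - liquid_kw
    return liquid_kw, air_kw


RACK_KW = 30.0  # assumed rack power, for illustration only

# The 60% and 80% bounds are the range quoted in the text.
for fraction in (0.60, 0.80):
    liquid_kw, air_kw = split_heat_load(RACK_KW, fraction)
    print(f"{fraction:.0%} to liquid: {liquid_kw:.1f} kW liquid, {air_kw:.1f} kW air")
```

Under that assumption, 18 to 24 kW of the rack's heat leaves through the liquid path, and only 6 to 12 kW is left for the room's air handling.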

Addressing the Business and Operational Concerns of Data Centers

Cost Containment, whether CapEx or OpEx, is a necessity for data centers. The 30+ kW racks enabled by Asetek’s RackCDU D2C allow consolidation and mitigate the need for build-outs. Cost is further reduced by the use of hot water for cooling. CPUs run quite hot (153°F to 185°F), and memory and GPUs run hotter still. The cooling efficiency of water (4000x that of air) allows it to cool the components with a much smaller temperature difference than air requires, so even hot water is cool enough to do the job. This also reduces the power required for server fans.
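The text does not say how the 4000x figure is derived. One common basis for comparisons of this order is volumetric heat capacity: how much heat a cubic metre of coolant absorbs per degree of temperature rise. The sketch below uses approximate room-temperature textbook property values (an assumption, not figures from Asetek) and also estimates the volume flow needed to carry a given load; the 1 kW load and 10 K coolant temperature rise are likewise illustrative.

```python
# Approximate room-temperature properties (assumed textbook values).
WATER_DENSITY = 997.0         # kg/m^3
WATER_SPECIFIC_HEAT = 4180.0  # J/(kg*K)
AIR_DENSITY = 1.2             # kg/m^3, near sea level
AIR_SPECIFIC_HEAT = 1005.0    # J/(kg*K)


def volumetric_heat_capacity(density: float, specific_heat: float) -> float:
    """Heat absorbed per cubic metre per kelvin of temperature rise, in J/(m^3*K)."""
    return density * specific_heat


def volume_flow_for_load(load_watts: float, delta_t_kelvin: float,
                         density: float, specific_heat: float) -> float:
    """Volume flow in m^3/s needed to carry load_watts at the given coolant temperature rise."""
    if delta_t_kelvin <= 0:
        raise ValueError("delta_t_kelvin must be positive")
    return load_watts / (volumetric_heat_capacity(density, specific_heat) * delta_t_kelvin)


water_capacity = volumetric_heat_capacity(WATER_DENSITY, WATER_SPECIFIC_HEAT)
air_capacity = volumetric_heat_capacity(AIR_DENSITY, AIR_SPECIFIC_HEAT)
print(f"water/air volumetric heat capacity ratio: {water_capacity / air_capacity:.0f}")

LOAD_WATTS = 1000.0  # illustrative heat load
DELTA_T = 10.0       # illustrative coolant temperature rise, in kelvin
water_flow = volume_flow_for_load(LOAD_WATTS, DELTA_T, WATER_DENSITY, WATER_SPECIFIC_HEAT)
air_flow = volume_flow_for_load(LOAD_WATTS, DELTA_T, AIR_DENSITY, AIR_SPECIFIC_HEAT)
print(f"water: {water_flow * 1000 * 60:.2f} L/min, air: {air_flow:.4f} m^3/s")
```

With these inputs the ratio comes out at a few thousand, the same order of magnitude as the figure quoted above. The exact value depends on the temperatures and pressures assumed.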

The data center does not need all the CRACs or CRAHs normally required, and rather than needing an extensive chiller plant outside the data center, it can use inexpensive dry coolers. This has a major impact on both CapEx and OpEx.

Monitoring and Alarming is essential for any technology in the contemporary data center. Asetek’s RackCDU system includes a software suite that provides monitoring and alerts covering temperatures, flow, pressures and leak detection and, importantly, can report into data center management software suites.

Failure Isolation is a key metric for data centers. Servers using Asetek’s RackCDU operate at very low pressure and are isolated in closed loops that exchange heat in the CDU with the facilities water loop. This is an important difference from cooling systems that use a centralized pumping system. Centralized systems require high pressures and hence carry the risk of high-pressure leaks and a wide “blast radius.”

Redundancy is one of the sacred cows of data centers. Asetek’s RackCDU D2C CPU and GPU pump/cold plates are drop-in replacements for air heat sinks. One pump is sufficient to drive the required cooling for the server. Hence a dual-CPU, dual-GPU or CPU + GPU server, which has two pump/cold plates, contains its own redundant pumping.

Serviceability is a key requirement for any data center hardware system.  Because the Asetek RackCDU is an extension to a standard rack and has independent quick connects for each server, data center facilities teams can remove or replace servers for repair or upgrade as they do today.

Warranty is an issue with installing aftermarket liquid cooling solutions, in that doing so can void the server manufacturer’s warranty. Asetek has also addressed this issue by teaming up with Signature Technology Group (STG), a warranty service and support firm that will maintain coverage for systems that have been upgraded with Asetek’s liquid-cooling technology.

The reality check on data center liquid cooling is that Asetek has moved liquid cooling from an exotic technology to a practical option for data center operators.

asetek.com
