Session:
Paper Number: 173066
SWOT Discussion (Strengths, Weaknesses, Opportunities, and Threats) of Data Center Cooling
Artificial Intelligence (AI) is transforming and accelerating development and innovation in almost every aspect of our lives. AI encompasses several categories of techniques that are used in different fields: machine learning (e.g. fraud detection), predictive analytics (e.g. climatic-event forecasting), natural language processing (e.g. automated document processing), computer vision (e.g. autonomous driving), robotics (e.g. industrial automation), expert systems (e.g. product design), and generative AI (e.g. software development). AI computations (both training and inference) take place in data centers (DCs). Although AI and DCs have existed for years, the release of publicly available generative AI (Gen AI) in 2022 accelerated the adoption of AI in organizations (e.g. from 55 % in 2023 to 72 % in 2024) [1]. The boom in high-performance computing (HPC) for AI applications has in turn accelerated demand and development in the DC cooling space. In other words, DC cooling is a strategic enabler of HPC speed and capacity, and hence of development and innovation in different domains (e.g. cybersecurity, grid resilience, and drug development).
In 2024, DCs accounted for about 1.5 % of global electricity consumption (circa 415 TWh, i.e. terawatt-hours). DC cooling accounts for about 7 % to 30 % of the total electricity consumption in data centers, which exceeds the consumption of data storage and networking, while the servers themselves consume about 60 % [2].
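To put the shares quoted above into absolute terms, the following is a minimal arithmetic sketch in Python. It uses only the figures stated in this abstract (415 TWh, 1.5 %, 7 % to 30 %, 60 %); the function and constant names are illustrative, and applying the per-facility cooling share range to the global DC total is an assumption made here for a rough estimate, not a result from the cited report.

```python
# Figures quoted in the abstract for 2024.
DC_ELECTRICITY_TWH = 415.0          # total data center electricity use
DC_SHARE_OF_GLOBAL = 0.015          # about 1.5 % of global consumption
COOLING_SHARE_RANGE = (0.07, 0.30)  # cooling share of DC electricity
SERVER_SHARE = 0.60                 # server share of DC electricity


def share_to_twh(share, total_twh=DC_ELECTRICITY_TWH):
    """Convert a fractional share of DC electricity into TWh."""
    return share * total_twh


def implied_global_twh(dc_twh=DC_ELECTRICITY_TWH, dc_share=DC_SHARE_OF_GLOBAL):
    """Global electricity consumption implied by the DC total and its share."""
    return dc_twh / dc_share


if __name__ == "__main__":
    low, high = (share_to_twh(s) for s in COOLING_SHARE_RANGE)
    print(f"Cooling: {low:.1f} to {high:.1f} TWh")
    print(f"Servers: {share_to_twh(SERVER_SHARE):.1f} TWh")
    print(f"Implied global consumption: {implied_global_twh():.1f} TWh")
```

Under these assumptions, cooling corresponds to roughly 29 to 125 TWh per year, servers to about 249 TWh, and the implied global electricity consumption is on the order of 27,700 TWh.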
This paper reviews DC cooling strategies at the system and component levels within a SWOT (Strengths, Weaknesses, Opportunities, and Threats) framework. The strengths section summarizes the relationship between the effectiveness of DC cooling and the capacity and speed of HPC, and highlights the trends in chip miniaturization. It then reviews current chip cooling techniques (air, liquid, and immersion cooling), as well as overall DC cooling system designs (cooling towers, dry cooling, coolant distribution units (CDUs), chillers, direct expansion (DX), hybrid/flexible systems, and others).
The weaknesses section highlights the cooling and packaging challenges arising from rising chip heat flux, non-homogeneous heat generation and hot spots, chip stacking, and increasingly stringent cooling requirements for server components. It also summarizes the current electricity, water, and other challenges associated with DC cooling (e.g. corrosion as liquid flows increase). The section further covers the tradeoff between component size and cost on one side and system efficiency on the other (e.g. approach temperature differences), as well as control challenges (e.g. real-time capacity-demand management, and managing different cooling temperature and flow requirements among components). Finally, it addresses the challenge of designing a cooling system for future expansion and growth in cooling requirements (i.e. designing for unknown AI and chip cooling power demands).
The opportunities section discusses other parts of the cooling equation that can enhance DC cooling (e.g. optimizing chip design for cooling, and computation demand management). This section also highlights advancements in chip cooling (e.g. flow boiling) and heat recovery from DC cooling (e.g. combined heat and power, or use as a heat source for heat pumps). Moreover, the role of digital twins and the use of DC cooling innovations as a catalyst for innovation in other fields are presented. The section also examines the alignment between the road maps for AI applications (software), chip and server design (hardware), and cooling architecture, and how all of these funnel into flexible and future-expandable solutions. Shifting power demand from cooling to computation is also discussed.
The threats section deals with what-if scenarios for some of these challenges, specifically the impacts of DC cooling on power plant capacity, peak power demand, and water availability, along with other environmental and societal impacts.
The presentation concludes by shedding light on some of the areas for future research in DC cooling, to harness the power of AI without compromising efficiency, performance or resources.
References
1. Strategic imperatives: Prioritizing your AI transformation for government, Vertiv, 2025
2. Energy and AI, International Energy Agency (IEA), https://www.iea.org/reports/energy-and-ai/energy-demand-from-ai#abstract, accessed 05/05/2025
Presenting Author: Mina Mikhaeel Oak Ridge National Lab
Presenting Author Biography: Mina is an R&D staff member at ORNL, working on projects related to thermo-fluids and their applications in energy systems. Before joining ORNL, Mina led a small team working on efficiency, performance, and thermal comfort in EVs at Lucid Motors in California. Mina obtained his PhD from the University of Illinois Urbana-Champaign, working on regime transitions in falling film flows. He obtained his MSc from the Royal Institute of Technology (KTH) in Sweden, working on variable-speed compressors in ground-source heat pumps.
Authors:
Mina Mikhaeel Oak Ridge National Lab
Brian Fricke Oak Ridge National Lab
Kashif Nawaz Oak Ridge National Lab
Paper Type
Technical Presentation