A modern silicon chip running at full load dissipates roughly 100 watts per square centimeter, on the order of ten times the heat flux of an electric stove burner. A rack of GPUs running an AI training workload generates more heat per cubic meter than the engine compartment of a passenger car. The data center industry's entire cooling stack (chillers, cooling towers, fans, ducts, water pipes, and now liquid loops directly touching the silicon) exists for one reason: to move that heat from the chip to the atmosphere fast enough that the chip never throttles.
Why Cooling Dominates the Energy Story
In the PUE family covered in article 5, the cooling load factor (CLF) is almost always the largest component of overhead. Typical CLF values sit between 0.20 and 0.50 — meaning cooling consumes 20 to 50 watts for every 100 watts the IT equipment draws. The next-largest factor, electrical distribution losses (PLF), sits at 0.08 to 0.15.
This means that any data center talking seriously about energy efficiency is, structurally, talking about cooling. Every major PUE-reduction story (Nordic free cooling, evaporative cooling in dry climates, direct-to-chip liquid cooling for AI workloads) is fundamentally a cooling story.
A facility moving from PUE 1.5 to PUE 1.2 is, in 80% of cases, a facility that found a better way to move heat.
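To make the arithmetic behind these factors concrete, here is a minimal sketch. It assumes the common decomposition PUE = 1 + CLF + PLF, which ignores lighting and other small loads; the function names are illustrative and not taken from the series.

```python
def pue(clf: float, plf: float) -> float:
    """Return PUE as 1 (the IT load itself) plus cooling and power overheads."""
    return 1.0 + clf + plf


def cooling_share_of_overhead(clf: float, plf: float) -> float:
    """Fraction of the non-IT overhead that is attributable to cooling."""
    return clf / (clf + plf)


# Ranges quoted in the text: CLF 0.20-0.50, PLF 0.08-0.15.
best_case = pue(clf=0.20, plf=0.08)
worst_case = pue(clf=0.50, plf=0.15)

print(f"PUE at the low end of both ranges:  {best_case:.2f}")
print(f"PUE at the high end of both ranges: {worst_case:.2f}")
print(
    f"Cooling share of overhead: {cooling_share_of_overhead(0.20, 0.08):.0%} "
    f"to {cooling_share_of_overhead(0.50, 0.15):.0%}"
)
```

Under this simplified model, the quoted ranges put cooling at roughly 70 to 80 percent of total overhead, which is why a large PUE drop almost always implies a cooling change.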
Part 1: Cabinet Density Decides the Cooling Technology
The most important number for choosing a cooling architecture is not climate or building size — it is the power density per cabinet. As density rises, the physics of air-based cooling stops working, and the industry’s cooling stack changes underneath it.
Liquid cooling (direct-to-chip or immersion): AI training, HPC
The physical reason air cooling stops working
The heat capacity of air is low. Even with aggressive ducting and high-velocity fans, the practical ceiling for air-cooled racks sits around 20–25 kW per cabinet. Above that, the fan power required to move enough air becomes its own significant share of the facility’s energy budget, and the air cannot remove heat from chip-surface hotspots fast enough.
Water carries roughly 3,000 to 3,500 times more heat per unit volume than air for the same temperature rise. Once cabinet density crosses the 20 kW threshold, liquid cooling stops being an option and becomes a requirement.
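The gap between air and water can be checked with the sensible-heat relation Q = ρ · c_p · V̇ · ΔT. The sketch below uses approximate textbook fluid properties near room temperature and a hypothetical 12 K coolant temperature rise. Neither figure comes from the article.

```python
# Approximate properties near room temperature (assumed textbook values).
AIR = {"density": 1.2, "cp": 1005.0}      # kg/m^3, J/(kg*K)
WATER = {"density": 998.0, "cp": 4186.0}  # kg/m^3, J/(kg*K)


def volumetric_flow_m3_per_s(heat_w: float, fluid: dict, delta_t_k: float) -> float:
    """Flow needed to carry heat_w when the coolant warms by delta_t_k."""
    return heat_w / (fluid["density"] * fluid["cp"] * delta_t_k)


rack_heat_w = 20_000.0  # the 20 kW cabinet discussed above
delta_t_k = 12.0        # hypothetical inlet-to-outlet temperature rise

air_flow = volumetric_flow_m3_per_s(rack_heat_w, AIR, delta_t_k)
water_flow = volumetric_flow_m3_per_s(rack_heat_w, WATER, delta_t_k)

print(f"Air:   {air_flow:.2f} m^3/s")
print(f"Water: {water_flow * 1000:.2f} L/s")
print(f"Volume ratio air/water: {air_flow / water_flow:.0f}")
```

With these assumptions a 20 kW cabinet needs about 1.4 cubic meters of air per second but only about 0.4 liters of water per second, which is the physical basis for the threshold described above.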
The 30-year history of data center cooling is, simply, the story of moving the cold source closer and closer to the heat source. Room → row → cabinet → chip. Every step shortens the path; every step lowers PUE.
The DX (direct expansion) architecture is closest to a residential air conditioner. A refrigerant circulates in a closed loop: it absorbs heat from the indoor space (evaporator), is compressed, and releases heat outdoors (condenser).
Chilled-water (CW) systems are the dominant architecture for facilities above 1 MW. Two variants share the same indoor equipment but differ in how the chiller rejects heat to the outdoors.
Air-cooled CW: Chillers reject heat directly to outside air. Simpler, lower CAPEX, slightly lower efficiency at full load. Used where water is scarce or freezing is a serious risk.
Water-cooled CW: A second water loop runs through a cooling tower, where evaporation provides the heat rejection. Higher full-load efficiency (centrifugal chillers can reach COP 6–7), supports plate heat exchangers for winter “free cooling,” but more components and more single points of failure.
Water-cooled CW is the most thermally efficient classical architecture, but the most operationally complex. Multiple loops, water treatment requirements, and freeze protection mean it requires substantially deeper engineering and operations capability than DX or air-cooled CW.
AHU and EHU: Indirect Evaporative Cooling
Air-Handling Units (AHUs) and the newer Environmental Handling Units (EHUs) use outside air as the heat sink, but only through a heat exchanger, so outdoor air never enters the data hall.
Low outdoor temperature: direct sensible heat exchange (no water, no compressor). Cheapest mode.
Moderate outdoor temperature: add evaporative cooling on the outdoor side (water spray on the heat exchanger).
High outdoor temperature: add mechanical compression for peak summer days.
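The three operating modes amount to a staged control decision. The sketch below illustrates only that staging. The temperature thresholds are hypothetical placeholders, and a real controller would typically also account for humidity (wet-bulb temperature) and the supply-air setpoint.

```python
from enum import Enum


class Mode(Enum):
    DRY = "sensible heat exchange only"
    EVAPORATIVE = "heat exchange plus water spray"
    MECHANICAL = "heat exchange plus water spray plus compressor"


def select_mode(
    outdoor_c: float,
    dry_max_c: float = 15.0,   # hypothetical upper limit for dry operation
    evap_max_c: float = 28.0,  # hypothetical upper limit for evaporative assist
) -> Mode:
    """Pick the cheapest mode that the outdoor temperature allows."""
    if outdoor_c <= dry_max_c:
        return Mode.DRY
    if outdoor_c <= evap_max_c:
        return Mode.EVAPORATIVE
    return Mode.MECHANICAL


for temp in (5.0, 22.0, 35.0):
    print(f"{temp:>5.1f} °C -> {select_mode(temp).value}")
```

Ordering the checks from cheapest to most expensive mode mirrors the list above: the compressor is only reached when neither of the cheaper stages is sufficient.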
AHU/EHU architecture works best in cool, dry climates and is the dominant architecture in Northern Europe and the cooler parts of China. The EHU variant adds integrated AC/DC power conversion (so cooling does not depend on the main UPS), AI-driven dual damper control, and continuous-cooling guarantees during partial failures.
Part 3: Free Cooling, the Largest Single PUE Lever
Free cooling means using outside conditions — air, water, or both — to cool the data center without running compressors. When the outside temperature is below the temperature needed at the chilled-water loop or air handler, free cooling can replace mechanical refrigeration entirely, eliminating the largest single component of the cooling energy bill.
Direct Free Cooling: Outside air is filtered and supplied directly into the data hall. Used in Nordic-style facilities and parts of the Google fleet. Requires extremely clean air and tolerance for short humidity excursions.
Indirect Free Cooling: Outside air cools the data hall through a heat exchanger. The technology behind AHU/EHU. Slightly less efficient than direct, but immune to outdoor pollution.
Refrigerant Pump Free Cooling: A dual-cycle design that runs the refrigerant loop without engaging the compressor when outdoor temperature is below about 10°C. The pump moves the refrigerant, ambient air cools it, and the compressor stays off. A recent Huawei FusionModule deployment in Beijing reported an annual average PUE of 1.111 using this technology, a world-class number in a climate that is not naturally cold.
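How much any of these modes helps depends on how many hours per year the site spends below the switchover temperature. A minimal sketch of that calculation follows, using the roughly 10°C refrigerant-pump threshold from the text. The sample temperatures are made-up illustration data, not measurements from any site.

```python
from typing import Iterable


def free_cooling_fraction(hourly_temps_c: Iterable[float], threshold_c: float = 10.0) -> float:
    """Share of hours whose outdoor temperature is strictly below threshold_c."""
    temps = list(hourly_temps_c)
    if not temps:
        raise ValueError("need at least one temperature reading")
    return sum(1 for t in temps if t < threshold_c) / len(temps)


# Hypothetical readings for illustration only; a real study would use
# 8,760 hourly values from a weather file for the candidate site.
sample_temps_c = [-3.0, 2.5, 8.0, 9.9, 10.0, 14.0, 21.0, 27.5]

print(f"Free-cooling share of hours: {free_cooling_fraction(sample_temps_c):.0%}")
```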
This table explains, in a single view, why Google operates major facilities in Hamina, Finland; why Meta is building out Luleå, Sweden; why Inner Mongolia and Guizhou have become Chinese data center hubs; and why Singapore introduced a moratorium on new data centers in 2019.
Part 4: Evaporative Cooling and the PUE-vs-WUE Trade-off
Evaporative cooling uses the physics of water phase change: every kilogram of water that evaporates absorbs about 2,260 kJ of heat. That is roughly 10 to 15 times the heat a typical HFC refrigerant absorbs per kilogram in a vapor-compression cycle. For dry-climate sites, evaporative cooling is among the most energy-efficient cooling modes available.
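The latent-heat figure also gives a rough lower bound on water consumption. The sketch below converts 2,260 kJ/kg into kilograms of water evaporated per kWh of heat rejected. It assumes all heat leaves by evaporation and ignores blowdown and drift losses, so treat it as a back-of-envelope check rather than a WUE model.

```python
LATENT_HEAT_KJ_PER_KG = 2260.0  # figure quoted in the text
KJ_PER_KWH = 3600.0             # 1 kWh = 3,600 kJ


def water_evaporated_kg_per_kwh(latent_heat_kj_per_kg: float = LATENT_HEAT_KJ_PER_KG) -> float:
    """Mass of water that must evaporate to absorb one kWh of heat."""
    return KJ_PER_KWH / latent_heat_kj_per_kg


kg_per_kwh = water_evaporated_kg_per_kwh()
print(f"Water evaporated per kWh of heat rejected: {kg_per_kwh:.2f} kg")
```

The result, about 1.6 kg (roughly 1.6 L) per kWh of heat, is the same order of magnitude as the liters-per-kWh water usage typically reported for evaporatively cooled sites.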
The trade-off introduced in article 5 between Power Usage Effectiveness (PUE) and Water Usage Effectiveness (WUE) has its sharpest expression in evaporative cooling:
Sites using evaporative cooling typically have WUE between 1.5 and 2.5 L/kWh.
Sites that avoid evaporative cooling can reach WUE below 0.5 L/kWh, but at the cost of higher PUE.
For a 1,000-cabinet facility using evaporative cooling, this translates to roughly 63,000 tons of water per year, the annual usage of about 300 households. In water-stressed regions like the United Arab Emirates, central and southern Taiwan, parts of Spain, and Singapore, this number now drives regulatory action and increasingly shapes site selection.
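The annual water figure can be reproduced from WUE, which is liters of water per kWh of IT energy. The article does not state the underlying assumptions, so the 4 kW average cabinet load and the WUE of 1.8 L/kWh below are hypothetical inputs chosen because they land close to the quoted number.

```python
HOURS_PER_YEAR = 8760


def annual_water_tonnes(cabinets: int, kw_per_cabinet: float, wue_l_per_kwh: float) -> float:
    """Annual water use in tonnes, taking 1 L of water as 1 kg."""
    it_energy_kwh = cabinets * kw_per_cabinet * HOURS_PER_YEAR
    litres = it_energy_kwh * wue_l_per_kwh
    return litres / 1000.0


# Hypothetical inputs: 1,000 cabinets at 4 kW average, WUE 1.8 L/kWh.
tonnes = annual_water_tonnes(cabinets=1000, kw_per_cabinet=4.0, wue_l_per_kwh=1.8)
print(f"Annual water use: {tonnes:,.0f} tonnes")
print(f"Per household if spread across 300 homes: {tonnes / 300:,.0f} tonnes/year")
```

Doubling the cabinet density or moving to the top of the WUE range scales the result linearly, which is why high-density evaporative sites attract regulatory attention quickly.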
Over the next decade, WUE caps are likely to displace PUE caps as the binding regulatory constraint in water-scarce regions. Operators that built their PUE strategy around evaporative cooling will find themselves forced to rethink.
Part 5: Liquid Cooling, the AI-Driven Transition
The liquid cooling market existed for decades as a niche serving mainframes and HPC clusters. The AI buildout has turned it into the fastest-growing category in the data center industry.
Lead times for coolant distribution units (CDUs) have stretched to 6–12 months in 2025–2026, with allocation favoring hyperscaler customers. For a hyperscale AI build, a missing CDU now causes the same kind of delay that a missing chiller caused in 2020.
Part 6: Aisle Containment, the Highest-ROI Improvement
Of all the cooling-side investments a data center can make, hot or cold aisle containment is consistently the highest-return: small CAPEX, fast payback, immediate PUE improvement, and minimal added operational complexity.
In an uncontained data hall, cold air supplied to the front of racks mixes with hot air exhausted at the back before being recaptured by the computer room air conditioner (CRAC). This mixing reduces cooling efficiency because the CRAC has to overcool the entire hall to compensate.
Cold Aisle Containment (CAC): Roof panels and end-doors enclose the cold aisle. Cold air supplied to the contained aisle goes only into the front of racks. Hot exhaust flows freely into the rest of the hall and returns to the CRAC.
Hot Aisle Containment (HAC): The hot aisle is enclosed instead. Hot exhaust is channeled directly back to the CRAC return without mixing with the rest of the hall.
For most operating data centers, the question is not whether to add containment but why it has not been done already. The capital cost is small relative to the PUE savings, and the technology is mature, well understood, and low-risk to retrofit.
Part 7: ASHRAE Guidelines and the 27°C Inlet Revolution
ASHRAE, the American Society of Heating, Refrigerating and Air-Conditioning Engineers, publishes the technical guidelines that the global data center industry follows for inlet temperature, humidity, and environmental classes.
For decades, data center operators ran their facilities at IT-equipment inlet temperatures around 18–22°C, cool to the touch and often described by visitors as “like a refrigerator.” ASHRAE’s recommendation has since shifted upward; the modern industry guideline is to run facilities at the high end of the recommended range, around 27°C.
The PUE saving is significant: raising inlet temperature from 22°C to 27°C reduces PUE by roughly 0.05 to 0.10. The saving requires no capital spending, because it is only a setpoint change. Server reliability impact, within the recommended range, is statistically negligible.
A typical legacy facility running at 22°C inlet temperature has a no-cost PUE-improvement lever sitting on the wall. The fact that it is rarely pulled is not a technology issue; it is an organizational one.
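To put the 0.05 to 0.10 PUE reduction in energy terms, the sketch below converts a PUE delta into annual facility energy. The 1 MW IT load and the baseline PUE of 1.50 are hypothetical inputs for illustration, and the calculation assumes the IT load itself does not change with the setpoint.

```python
HOURS_PER_YEAR = 8760


def annual_energy_saved_kwh(it_load_kw: float, pue_before: float, pue_after: float) -> float:
    """Facility energy saved per year when PUE drops at constant IT load."""
    return it_load_kw * HOURS_PER_YEAR * (pue_before - pue_after)


it_load_kw = 1000.0  # hypothetical 1 MW IT load
baseline_pue = 1.50  # hypothetical legacy facility

for reduction in (0.05, 0.10):
    saved = annual_energy_saved_kwh(it_load_kw, baseline_pue, baseline_pue - reduction)
    print(f"PUE reduced by {reduction:.2f}: about {saved:,.0f} kWh saved per year")
```

For these inputs the quoted range corresponds to roughly 440 to 880 MWh per year, which scales linearly with IT load.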
Part 8: Humidification and the Wet-Film Quiet Champion
Data centers need humidification because very dry air lets static charge build up, and the resulting electrostatic discharge can damage IT equipment. The typical target is 40–60% relative humidity.
| Humidifier type | Power draw | Maintenance | Scale buildup |
| --- | --- | --- | --- |
| Electrode-type | ~760 W | Medium | Heavy (heats water to 100°C) |
| Infrared | ~1,000 W | Medium | Heavy |
| Wet-film | ~50 W | Simple | Light |
The wet-film humidifier relies on simple evaporation from a wetted medium at room temperature. It draws roughly 5–7% of the power of the electrode or infrared alternatives, yet many legacy facilities still use electrode humidifiers because that was the standard choice in the 2000s.
Data center energy efficiency is, in the end, the accumulation of many small optimizations. Replacing electrode humidifiers with wet-film is the kind of unglamorous change that, multiplied across a thousand cabinets, saves a meaningful amount of electricity every year.
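The humidifier comparison is easy to quantify from the power figures above. The sketch below computes the power ratios and a per-unit annual saving. The continuous duty cycle is a simplifying assumption, since real humidifiers cycle with demand, so the saving is an upper bound.

```python
HOURS_PER_YEAR = 8760

# Approximate power draw per unit, in watts, from the comparison above.
HUMIDIFIER_WATTS = {"electrode": 760.0, "infrared": 1000.0, "wet_film": 50.0}


def power_ratio(candidate: str, baseline: str) -> float:
    """Power draw of candidate as a fraction of baseline."""
    return HUMIDIFIER_WATTS[candidate] / HUMIDIFIER_WATTS[baseline]


def annual_saving_per_unit_kwh(old: str, new: str, duty_cycle: float = 1.0) -> float:
    """Energy saved per unit per year; duty_cycle=1.0 assumes continuous running."""
    delta_w = HUMIDIFIER_WATTS[old] - HUMIDIFIER_WATTS[new]
    return delta_w * HOURS_PER_YEAR * duty_cycle / 1000.0


print(f"Wet-film vs electrode: {power_ratio('wet_film', 'electrode'):.1%}")
print(f"Wet-film vs infrared:  {power_ratio('wet_film', 'infrared'):.1%}")
print(f"Electrode -> wet-film: {annual_saving_per_unit_kwh('electrode', 'wet_film'):,.0f} kWh/year per unit")
```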
Part 9: CFD Simulation, the Design-Time Tool
Computational Fluid Dynamics (CFD) is the simulation technique that lets engineers model airflow inside a data hall before it is built — predicting hot spots, validating containment effectiveness, and optimizing CRAC placement.
For retrofits, CFD also plays an investigative role: when an existing facility has hot spots that no one can explain, a CFD simulation built from the as-built drawings often shows airflow paths that are obvious in hindsight but invisible in the floor plan.
CLF (Cooling Load Factor) is typically 0.20–0.50, the largest single contributor to overhead. Almost every major PUE-reduction story is structurally a cooling story.
2. Cabinet density chooses the cooling architecture
Below 3 kW: room-level. 5–12 kW: in-row with containment. Above 20 kW: liquid cooling is no longer optional. AI workloads push densities to 30–120 kW, which is why liquid cooling has gone from niche to mandatory within 24 months.
3. CW dominates above 1 MW, but free cooling is the largest single lever
Water-cooled chilled-water plants are the most efficient classical architecture. Layering free cooling on top, whether direct, indirect, or refrigerant-pump, produces the largest single PUE improvement available.
4. Evaporative cooling trades PUE for WUE
The mechanism that lets dry-climate sites hit PUE 1.1 simultaneously raises water usage to 1.5–2.5 L/kWh. In water-scarce regions, WUE caps are becoming the binding regulatory constraint, displacing PUE caps.
5. Liquid cooling is no longer a future technology
NVIDIA H100, B200, and especially GB200 NVL72 have pushed cabinet densities past every air-cooling ceiling. CoolIT, Asetek, Vertiv, Submer, and LiquidStack are now mainstream procurement names rather than niche specialists.
6. Containment is the cheapest PUE improvement available
Containment costs roughly $30–$80 per square meter, pays back in 6–18 months, and improves PUE by 0.05–0.15 with minimal added operational complexity. The question for any uncontained facility is not whether to add it but why it has not been added yet.
7. ASHRAE’s 27°C inlet recommendation is a free PUE lever
Raising inlet temperature from 22°C to 27°C reduces PUE by 0.05–0.10 with no CAPEX. The reason this is rarely done is organizational, not technical.
The eighth article in this series turns from the physical infrastructure to the operational nervous system: DCIM (Data Center Infrastructure Management) platforms, the AI-driven cooling and power optimization systems built on top of them (Huawei iCooling, Google DeepMind cooling control, Schneider EcoStruxure), and the digital-twin layer that is starting to make data centers run themselves. The cooling and power chains we have spent the last two articles unpacking are increasingly controlled by software that learns from millions of telemetry points in real time. This is where the data center industry meets the AI industry, not as a customer but as a co-traveler.