7 Comments

Thank you again for your interest in our research.

I'll add the following to conclude my responses.

Your post contains multiple factual mistakes. Most notably, you repeatedly imply, or state outright, that we "overstate" our results, yet you provide no quantitative assessment apart from your claim of a "30-fold overestimation," which is itself a fundamental mistake. This factual error arises from your misinterpretation of "10% of permitted emissions," as reflected in your statement:

“If we take the preprint assumptions to mean 10% annual capacity (this assumption has been clarified in the comments on this post), equivalent to roughly 36 days of continuous use, then this starkly contrasts with industry norms and represents a 30-fold overestimation.” (Note: The sentence "(this assumption has been clarified in the comments on this post)" seems to have been added recently after you realized your factual mistake, but you still keep your misleading and irresponsible conclusion "30-fold overestimation".)

Whether this stems from a lack of knowledge in this field or from some other motivation, and however you spin your claims in later replies, your interpretation is factually wrong. As explicitly stated in our paper, we did not assume 10% of a year’s time, and the numbers we used are transparently disclosed.

In its current form, your post contains factually incorrect claims that misrepresent our research, distort the understanding of the field, and ultimately affect your own credibility.

I appreciate your continued interest in our research. However, I encourage you to engage with the findings accurately rather than spreading misinformation. Constructive discussions are always welcome, but misrepresenting technical details only demonstrates your lack of knowledge in this space and does not contribute to an informed dialogue.

1. “The actual emissions are 10% of the permitted level” refers to 10% of the emission limits set in backup generator permits—not 10% of a year’s time. While the actual emissions vary case by case, this figure serves as a reference and aligns with publicly disclosed government reports from Washington and Virginia.

2. The Berkeley Lab report was published after our study. Nonetheless, even our highest 2030 projection (519 TWh) remains within the 2028 range projected by Berkeley Lab. Thus, our estimates are on the conservative side. If you're interested in our updated estimates (to be included in our forthcoming update) based on Berkeley Lab's projection, please read: https://www.linkedin.com/posts/shaolei-ren-68557415_estimates-of-the-public-health-cost-caused-activity-7298237825433448449--pVD

3. We consider the training energy consumption, not inference. Even when incorporating TDP and accounting for PUE, our estimates remain conservative since they do not include server energy overheads, which typically add 20–50% to GPU energy consumption. For a more detailed discussion on training energy estimates, I recommend you review the literature (e.g., arXiv:2104.10350 and arXiv:2211.02001).

4. I also encourage you to read our updated paper on AI’s water footprint (arXiv:2304.03271), which has been accepted for publication in Communications of the ACM.

Thanks for your comment.

1. The paper is unclear about the basis for its calculations, and no numbers are provided, as I mentioned in my analysis. It repeatedly mentions time when discussing diesel generators, e.g., "The backup generators are assumed to emit air pollutants at 10% of the permitted levels per year." and "This trend may necessitate extended reliance on backup generators, e.g., possibly 15 days per year".

I suggested that this could be clarified by publishing the calculations behind the paper. A sensitivity analysis exploring a range of usage scenarios, particularly using publicly available emissions data from operators, could better contextualize this estimate and address variability across regions or operators.

2. It's fair to point out that the Berkeley study postdates the paper. However, 519 TWh is still at the high end of the range, and the choice is not justified. The reliance on a McKinsey “medium scenario” white paper, without transparent methodology or public access to the underlying 2023 Global Energy Perspective, leaves readers unable to assess its basis. Greater clarity on why 519 TWh was selected would strengthen the analysis.

3. I didn't discuss inference - this was in reference to training. See the link I included about the risks of using TDP, which does not accurately represent training energy consumption: https://www.devsustainability.com/p/how-useful-is-gpu-manufacturer-tdp.

I have written about AI energy consumption in the past, including referencing the literature you mention, e.g., https://www.devsustainability.com/p/expect-more-overestimates-of-ai-energy. One of the articles you reference has since been fully published at https://ieeexplore.ieee.org/document/9810097 and shows that "by 2030, total carbon emissions from training will decline."

4. Thanks for the updated reference.

Thanks for your follow-up.

1. Your implication that "we did not publish the calculations behind the paper" is still factually wrong. We clearly state at the end of Page 4 that “We show the county-level health cost and the top-10 counties in Figure 1, while deferring the details of calculations to Appendix A.3.” Please see Appendix A.3 for the details, which clearly show that 10% is not “10% of a year”! On Page 4, we also provide the permitted emission numbers, and the 10% following those numbers is clearly referring to 10% of the permitted emissions rather than 10% of a year. The title of your post is misleading at best and undermines our research.

COBRA is mostly a linear model, so 5% of the permitted emissions yields roughly half of our estimated cost and 20% yields roughly double. As Scope-1 and Scope-3 impacts are not part of our main analysis in Section 4, showing 5% or 20% emission levels adds little value.
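To make the scaling argument concrete, here is a minimal sketch that assumes the health cost is strictly proportional to the emission fraction. That is a simplification of the "mostly linear" behavior described above, not COBRA itself. The function name and the normalized baseline of 1.0 are illustrative assumptions, not values from the paper.

```python
def scaled_cost(baseline_cost, baseline_fraction, new_fraction):
    """Scale a health cost linearly with the fraction of permitted emissions.

    Assumes strict proportionality; a real COBRA run is only approximately linear.
    """
    if baseline_fraction <= 0:
        raise ValueError("baseline_fraction must be positive")
    return baseline_cost * (new_fraction / baseline_fraction)


# Hypothetical normalized baseline: cost = 1.0 at 10% of permitted emissions.
BASELINE_COST = 1.0
BASELINE_FRACTION = 0.10

for fraction in (0.05, 0.10, 0.20):
    ratio = scaled_cost(BASELINE_COST, BASELINE_FRACTION, fraction)
    print(f"{fraction:.0%} of permitted emissions -> {ratio:.2f}x the baseline cost")
```

Under this assumption, any other emission level can be read off by multiplying the baseline cost by the ratio of the two fractions.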

2. There is no valid point in your concern. Our approach was straightforward: In addition to EPRI’s, we wanted to use another projection (in this case, McKinsey’s) as an alternative scenario. Meanwhile, to keep our results clean, we simply chose the “medium growth” scenario provided by McKinsey. Now that the Berkeley Lab report is available, we are in the process of integrating its 2028 projection, as noted in my earlier LinkedIn post. Importantly, our core insights remain unchanged.

3. For training, unless special techniques such as power capping are proactively applied, the average GPU power is typically in the range of 70-85% of the TDP (see Figure 4 in “Characterizing Power Management Opportunities for LLMs in the Cloud” published in ASPLOS’24). In a server, the provisioned peak power for GPU is ~50% of the total power (Figure 3 of the ASPLOS’24 paper), while the non-GPU energy is roughly 20-50% of the total energy. Thus, the total server energy is typically even higher than the GPU energy assuming it’s running at full TDP. In our study, for simplicity, we consider energy (based on TDP power) multiplied by PUE. This is a well-accepted methodology that yields relatively conservative estimates for the training energy. In fact, considering TDP for GPU is the **official** methodology used by Meta: https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/MODEL_CARD.md
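The comparison in point 3 can be checked with a short calculation. This is a rough sketch, assuming that the "20-50%" figure is the non-GPU share of total server energy and that PUE multiplies both sides equally, so it cancels out of the ratio. The function and variable names are illustrative and do not come from the paper or the ASPLOS'24 study.

```python
def server_to_tdp_ratio(gpu_fraction_of_tdp, non_gpu_share_of_total):
    """Ratio of total server energy to GPU energy at full TDP.

    gpu_fraction_of_tdp: average GPU power as a fraction of TDP (e.g. 0.70-0.85).
    non_gpu_share_of_total: share of total server energy used by non-GPU parts
        (e.g. 0.20-0.50); assumed here to be a share of the total, not of GPU energy.
    """
    if not 0.0 <= non_gpu_share_of_total < 1.0:
        raise ValueError("non_gpu_share_of_total must be in [0, 1)")
    # total = gpu + non_gpu and non_gpu = share * total, so total = gpu / (1 - share)
    return gpu_fraction_of_tdp / (1.0 - non_gpu_share_of_total)


for gpu_fraction in (0.70, 0.85):
    for non_gpu_share in (0.20, 0.50):
        ratio = server_to_tdp_ratio(gpu_fraction, non_gpu_share)
        print(f"GPU at {gpu_fraction:.0%} of TDP, non-GPU share {non_gpu_share:.0%}: "
              f"total server energy = {ratio:.3f}x the TDP-based GPU energy")
```

A ratio above 1 means the TDP-based estimate is lower than the total server energy. Under these assumptions the ratio exceeds 1 across most of the quoted ranges and falls slightly below 1 only when both inputs sit at their low ends.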

I encourage you to stay updated on the state of the art from real systems and architectures.

Misc.: I provided arXiv links only for ease of reference. Both papers I referred to have been officially published elsewhere, but arXiv links are simple and free of paywall issues.

1. Your clarification that the 10% figure reflects permitted emission levels, not runtime, is helpful. However, the paper’s language on Page 5 remains ambiguous:

"This trend may necessitate extended reliance on backup generators, e.g., possibly 15 days per year. Such prolonged usage of diesel generators could substantially elevate AI’s scope-1 air pollution, creating even higher public health costs. Concretely, if the backup generators in northern Virginia emit air pollutants at the maximum permitted level, the total public health cost could reach $2.2-3.0 billion per year."

The juxtaposition of "15 days per year" and "maximum permitted level" muddles the picture, suggesting a linkage that may mislead readers. This likely contributed to my initial interpretation ("10% annual capacity"), though my broader analysis holds regardless. The issue is significant: this ambiguity could undermine confidence in your health cost estimates, which hinge on accurate Scope 1 emissions.

Comparing your figure to operator-published data would be useful. Recent sustainability reports from major operators show minimal Scope 1 contributions.

My critique stems from legitimate concerns about clarity and accuracy that could be addressed. The paper’s language could be sharpened to prevent misinterpretation, and the 10% assumption warrants scrutiny given the potential for overestimation—supported by industry trends and usage patterns.

2. The inaccessibility of the McKinsey report limits scrutiny of the "medium-growth" scenario. Transparency is critical in academic projections, particularly for long-term estimates. My 2022 Joule publication https://www.cell.com/joule/fulltext/S2542-4351(22)00358-0 documented a pattern of overestimation in data center energy studies due to opaque assumptions. This history amplifies the need to justify why the medium-growth scenario was chosen over the alternatives.

3. As you note, actual power draw as measured by real systems is not the maximum TDP. Using the maximum risks overestimating the results.

I am commenting on the preprint version currently available, which is what was publicized by the media. This is the problem with preprints - preliminary findings often gain traction without peer review. Later revisions rarely correct the public record because the press has moved on.

Thanks for your follow-up. It’s clear that you’re continuing to make baseless claims due to a lack of knowledge in this field, and I believe your response may not be carried out in good faith.

1. You’re entitled to any factually incorrect interpretations you choose. However, we have provided detailed calculations, and while you may argue that our explanations are unclear, that remains a subjective opinion (likely due to your lack of knowledge in the field).

Regarding corporate ESG reports, companies disclose scope-1 carbon emissions but do not report scope-1 criteria air pollutants. Backup generators typically emit CO₂ at rates similar to natural gas power plants (within the same order of magnitude) but release tens or even hundreds of times more NO₂ than a gas power plant. Thus, ESG reports provide little value for studying criteria air pollutants. Regardless, the scope-1 air pollutant impact is less than 10% of the scope-2 impact, and our Section 4 is entirely on scope 2. (The fact that you brought up ESG reports when referring to criteria air pollutants shows that you may not be familiar with the field.)

2. Your primary argument appears to be that our use of McKinsey’s 2030 medium estimate (519 TWh) is too high—yet this figure still falls within Berkeley Lab’s 2028 projected range. If 519 TWh for 2030 is excessive, then Berkeley Lab’s 2028 high estimate is even higher. Our estimates are based on existing, credible projections that we consider reasonable. Not being able to access the reference for free is not a valid reason to question its credibility.

3. You are ignoring the fundamental point: our methodology, the same as Meta’s official approach, actually results in lower estimated energy consumption than what is observed in reality. Applying the peak TDP (used by Meta) and ignoring the server non-GPU energy will yield conservative estimates for training energy. You intentionally ignored the server non-GPU energy part.

Misc.: All the discussions here are about your false claims (e.g., "30x overestimation") on the preprint.

Thanks, this made me feel less nervous about learning more on this topic, because I am just an undergrad student with little knowledge. Do you have any suggestions for learning more, Mr. Ren?
