{"id":4051,"date":"2025-07-01T13:42:48","date_gmt":"2025-07-01T13:42:48","guid":{"rendered":"https:\/\/salarydistribution.com\/machine-learning\/2025\/07\/01\/how-ai-factories-can-help-relieve-grid-stress\/"},"modified":"2025-07-01T13:42:48","modified_gmt":"2025-07-01T13:42:48","slug":"how-ai-factories-can-help-relieve-grid-stress","status":"publish","type":"post","link":"https:\/\/salarydistribution.com\/machine-learning\/2025\/07\/01\/how-ai-factories-can-help-relieve-grid-stress\/","title":{"rendered":"How AI Factories Can Help Relieve Grid Stress"},"content":{"rendered":"<div>\n\t\t<span class=\"bsf-rt-reading-time\"><span class=\"bsf-rt-display-label\"><\/span> <span class=\"bsf-rt-display-time\"><\/span> <span class=\"bsf-rt-display-postfix\"><\/span><\/span><\/p>\n<p>In many parts of the world, including major technology hubs in the U.S., there\u2019s a <a target=\"_blank\" href=\"https:\/\/www.bloomberg.com\/news\/articles\/2024-08-29\/data-centers-face-seven-year-wait-for-power-hookups-in-virginia?embedded-checkout=true\" rel=\"noopener\">yearslong wait<\/a> for AI factories to come online, pending the buildout of new energy infrastructure to power them.<\/p>\n<p><a target=\"_blank\" href=\"http:\/\/www.emeraldai.co\/\" rel=\"noopener\">Emerald AI<\/a>, a startup based in Washington, D.C., is developing an AI solution that could enable the next generation of data centers to come online sooner by tapping existing energy resources in a more flexible and strategic way.<\/p>\n<p>\u201cTraditionally, the power grid has treated data centers as inflexible \u2014 energy system operators assume that a 500-megawatt AI factory will always require access to that full amount of power,\u201d said Varun Sivaram, founder and CEO of Emerald AI. 
\u201cBut in moments of need, when demands on the grid peak and supply is short, the workloads that drive AI factory energy use can now be flexible.\u201d<\/p>\n<p>That flexibility is enabled by the startup\u2019s Emerald Conductor platform, an AI-powered system that acts as a smart mediator between the grid and a data center. In a recent field test in Phoenix, Arizona, the company and its partners demonstrated that its software can reduce the power consumption of AI workloads running on a cluster of 256 NVIDIA GPUs by 25% over three hours during a grid stress event while preserving compute service quality.<\/p>\n<p>Emerald AI achieved this by orchestrating a host of different workloads that <a target=\"_blank\" href=\"https:\/\/www.nvidia.com\/en-us\/solutions\/ai-factories\/\" rel=\"noopener\">AI factories<\/a> run. Some jobs can be paused or slowed, like the training or fine-tuning of a <a target=\"_blank\" href=\"https:\/\/www.nvidia.com\/en-us\/glossary\/large-language-models\/\" rel=\"noopener\">large language model<\/a> for academic research. 
Others, like inference queries for an AI service used by thousands or even millions of people, can\u2019t be rescheduled, but could be redirected to another data center where the local power grid is less stressed.<\/p>\n<p>Emerald Conductor coordinates these AI workloads across a network of data centers to meet power grid demands, ensuring full performance of time-sensitive workloads while dynamically reducing the throughput of flexible workloads within acceptable limits.<\/p>\n<p>Beyond helping AI factories come online using existing power systems, this ability to modulate power usage could help cities avoid rolling blackouts, protect communities from rising utility rates and make it easier for the grid to integrate clean energy.<\/p>\n<p>\u201cRenewable energy, which is intermittent and variable, is easier to add to a grid if that grid has lots of shock absorbers that can shift with changes in power supply,\u201d said Ayse Coskun, Emerald AI\u2019s chief scientist and a professor at Boston University. \u201cData centers can become some of those shock absorbers.\u201d<\/p>\n<p>A member of the <a target=\"_blank\" href=\"https:\/\/www.nvidia.com\/en-us\/startups\/\" rel=\"noopener\">NVIDIA Inception<\/a> program for startups and an <a target=\"_blank\" href=\"https:\/\/www.nventures.ai\/\" rel=\"noopener\">NVentures<\/a> portfolio company, Emerald AI today <a target=\"_blank\" href=\"https:\/\/www.prnewswire.com\/news-releases\/emerald-ai-launches-with-24-5m-seed-round-to-transform-ai-data-centers-into-grid-allies-302495064.html?tc=eml_cleartime\" rel=\"noopener\">announced more than $24 million in seed funding<\/a>. 
Its Phoenix demonstration, part of <a target=\"_blank\" href=\"https:\/\/www.energycentral.com\/intelligent-utility\/post\/unlocking-ai-potential-with-data-center-flexibility-PtPoXIAuRMzs5Ff\" rel=\"noopener\">EPRI\u2019s DCFlex data center flexibility initiative<\/a>, was executed in collaboration with NVIDIA, Oracle Cloud Infrastructure (OCI) and the regional power utility Salt River Project (SRP).<\/p>\n<p>\u201cThe Phoenix technology trial validates the vast potential of an essential element in data center flexibility,\u201d said Anuja Ratnayake, who leads EPRI\u2019s DCFlex Consortium.<\/p>\n<p>EPRI is also leading the <a href=\"https:\/\/blogs.nvidia.com\/blog\/open-power-ai-consortium\/\">Open Power AI Consortium<\/a>, a group of energy companies, researchers and technology companies \u2014 including NVIDIA \u2014 working on AI applications for the energy sector.<\/p>\n<h2><b>Using the Grid to Its Full Potential<\/b><\/h2>\n<p>Electric grid capacity is typically underused except during peak events like hot summer days or cold winter storms, when there\u2019s a high power demand for cooling and heating. 
That means, in many cases, there\u2019s room on the existing grid for new data centers, as long as they can temporarily dial down energy usage during periods of peak demand.<\/p>\n<p>A recent Duke University study <a target=\"_blank\" href=\"https:\/\/nicholasinstitute.duke.edu\/publications\/rethinking-load-growth\" rel=\"noopener\">estimates<\/a> that if new AI data centers could flex their electricity consumption by just 25% for two hours at a time, less than 200 hours a year, they could unlock 100 gigawatts of new capacity to connect data centers \u2014 <a target=\"_blank\" href=\"https:\/\/www.cfr.org\/blog\/america-may-not-need-massive-energy-build-out-power-ai-revolution\" rel=\"noopener\">equivalent to over $2 trillion in data center investment<\/a>.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter size-medium wp-image-82912\" src=\"https:\/\/blogs.nvidia.com\/wp-content\/uploads\/2025\/07\/pullquote_emeraldai-960x384.png\" alt=\"Quote from article\" width=\"960\" height=\"384\"><\/p>\n<h2><b>Putting AI Factory Flexibility to the Test<\/b><\/h2>\n<p>Emerald AI\u2019s recent trial was conducted in the Oracle Cloud Phoenix Region on NVIDIA GPUs spread across a multi-rack cluster managed through Databricks MosaicML.<\/p>\n<p>\u201cRapid delivery of high-performance compute to AI customers is critical but is constrained by grid power availability,\u201d said Pradeep Vincent, chief technical architect and senior vice president of Oracle Cloud Infrastructure, which supplied cluster power telemetry for the trial. 
\u201cCompute infrastructure that is responsive to real-time grid conditions while meeting the performance demands unlocks a new model for scaling AI \u2014 faster, greener and more grid-aware.\u201d<\/p>\n<p>Jonathan Frankle, chief AI scientist at Databricks, guided the trial\u2019s selection of AI workloads and their flexibility thresholds.<\/p>\n<p>\u201cThere\u2019s a certain level of latent flexibility in how AI workloads are typically run,\u201d Frankle said. \u201cOften, a small percentage of jobs are truly non-preemptible, whereas many jobs such as training, batch inference or fine-tuning have different priority levels depending on the user.\u201d<\/p>\n<p>Because Arizona is among the top states for data center growth, SRP set challenging flexibility targets for the AI compute cluster \u2014 a 25% power consumption reduction compared with baseline load \u2014 in an effort to demonstrate how new data centers can provide meaningful relief to Phoenix\u2019s power grid constraints.<\/p>\n<p>\u201cThis test was an opportunity to completely reimagine AI data centers as helpful resources to help us operate the power grid more effectively and reliably,\u201d said David Rousseau, president of SRP.<\/p>\n<p>On May 3, a hot day in Phoenix with high air-conditioning demand, SRP\u2019s system experienced peak demand at 6 p.m. 
During the test, the data center cluster reduced consumption gradually with a 15-minute ramp down, maintained the 25% power reduction over three hours, then ramped back up without exceeding its original baseline consumption.<\/p>\n<p>AI factory users can label their workloads to guide Emerald\u2019s software on which jobs can be slowed, paused or rescheduled \u2014 or, Emerald\u2019s AI agents can make these predictions automatically.<\/p>\n<figure id=\"attachment_82915\" aria-describedby=\"caption-attachment-82915\" class=\"wp-caption aligncenter\"><img decoding=\"async\" loading=\"lazy\" class=\"size-medium wp-image-82915\" src=\"https:\/\/blogs.nvidia.com\/wp-content\/uploads\/2025\/07\/image2-960x540.jpg\" alt=\"Dual chart showing GPU cluster power and SRP load over time in Phoenix on May 3, 2025, alongside a bar chart comparing job performance across flex tiers.\" width=\"960\" height=\"540\"><figcaption id=\"caption-attachment-82915\" class=\"wp-caption-text\">(Left panel): AI GPU cluster power consumption during SRP grid peak demand on May 3, 2025; (Right panel): Performance of AI jobs by flexibility tier. Flex 1 allows up to 10% average throughput reduction, Flex 2 up to 25% and Flex 3 up to 50% over a six-hour period. Figure courtesy of Emerald AI.<\/figcaption><\/figure>\n<p>Orchestration decisions were guided by the Emerald Simulator, which accurately models system behavior to optimize trade-offs between energy usage and AI performance. 
Historical grid demand data from provider Amperon confirmed that the AI cluster performed correctly during the grid\u2019s peak period.<\/p>\n<figure id=\"attachment_82918\" aria-describedby=\"caption-attachment-82918\" class=\"wp-caption aligncenter\"><img decoding=\"async\" loading=\"lazy\" class=\"size-medium wp-image-82918\" src=\"https:\/\/blogs.nvidia.com\/wp-content\/uploads\/2025\/07\/image3-960x540.jpg\" alt=\"Line graph showing power usage over time on May 2, 2025, for simulator, AI cluster and individual jobs.\" width=\"960\" height=\"540\"><figcaption id=\"caption-attachment-82918\" class=\"wp-caption-text\">Comparison of Emerald Simulator prediction of AI GPU cluster power with real-world measured power consumption. Figure courtesy of Emerald AI.<\/figcaption><\/figure>\n<h2><b>Forging an Energy-Resilient Future<\/b><\/h2>\n<p>The International Energy Agency projects that electricity demand from data centers globally <a target=\"_blank\" href=\"https:\/\/www.iea.org\/news\/ai-is-set-to-drive-surging-electricity-demand-from-data-centres-while-offering-the-potential-to-transform-how-the-energy-sector-works\" rel=\"noopener\">could more than double by 2030<\/a>. In light of the anticipated demand on the grid, the state of Texas passed a law that requires data centers to ramp down consumption or disconnect from the grid at utilities\u2019 requests during load shed events.<\/p>\n<p>\u201cIn such situations, if data centers are able to dynamically reduce their energy consumption, they might be able to avoid getting kicked off the power supply entirely,\u201d Sivaram said.<\/p>\n<p>Looking ahead, Emerald AI is expanding its technology trials in Arizona and beyond \u2014 and it plans to continue working with NVIDIA to test its technology on AI factories.<\/p>\n<p>\u201cWe can make data centers controllable while assuring acceptable AI performance,\u201d Sivaram said. 
\u201cAI factories can flex when the grid is tight \u2014 and sprint when users need them to.\u201d<\/p>\n<p>Learn more about <a target=\"_blank\" href=\"https:\/\/www.nvidia.com\/en-us\/startups\/\" rel=\"noopener\">NVIDIA Inception<\/a> and explore <a target=\"_blank\" href=\"https:\/\/www.nvidia.com\/en-us\/industries\/energy\/power-utilities\/\" rel=\"noopener\">AI platforms designed for power and utilities<\/a>.<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>https:\/\/blogs.nvidia.com\/blog\/ai-factories-flexible-power-use\/<\/p>\n","protected":false},"author":0,"featured_media":4052,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[3],"tags":[],"_links":{"self":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/4051"}],"collection":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/comments?post=4051"}],"version-history":[{"count":0,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/4051\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media\/4052"}],"wp:attachment":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media?parent=4051"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/categories?post=4051"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/tags?post=4051"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}