Training a Neural Network Can Emit More Than 600,000 Pounds of CO2. But Not for Long.

A new technique for training and running a neural network, proposed by researchers at MIT, has a much smaller carbon footprint

Training and running A.I. at scale requires data centers filled with thousands of servers. Powering these operations takes a massive amount of electricity and, in most cases, produces a lot of carbon emissions.

In June 2019, researchers from the University of Massachusetts, Amherst found, for instance, that training a natural language processing A.I. model (one used to process and manipulate human speech and text) can emit more than 626,000 pounds of carbon dioxide. That’s almost five times the amount the average car emits during its lifetime. Now, a new paper proposes a way to reduce those emissions.

The paper, published by researchers from the Massachusetts Institute of Technology (MIT) in April, outlines a new technique for training and running a neural network, a set of algorithms loosely modeled on the human brain that is used to perform natural language processing and interpret other kinds of data. They say their method produces about 1/1,300 of the carbon emissions it takes to train and run the neural networks in use today.

“I was pretty surprised by the amount of CO2 emissions that modern deep neural networks [a specific type of neural network] have to [use],” Song Han, PhD, assistant professor of electrical engineering and computer science at MIT, tells OneZero.

One reason for the high CO2 emissions is that while nearly every modern device has a computer chip running it — from refrigerators to smartphones to data center servers — these chips are all different, with varying computing power and use cases.

To get A.I. algorithms onto each of these devices, software makers have to build different versions of the same algorithm to handle the idiosyncrasies of each piece of hardware. Training the same algorithm to perform the same task on device after device is incredibly inefficient in terms of power consumption and produces more CO2.

The MIT team’s approach is more of a Swiss army knife. Their algorithm can adapt to its hardware, whether that’s a refrigerator or a smartphone. In tests, the researchers got a single algorithm to perform similarly across different phones and different types of processors, including CPUs, GPUs, and chips specialized for servers. Previously, each device would have needed an algorithm custom-built for it. Using the new system, the carbon dioxide emitted to train A.I. can drop from hundreds of thousands of pounds to hundreds of pounds, the researchers say. The old way is a “huge waste of energy,” explains Han, one of the authors of the paper outlining the more energy-efficient neural network.
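The once-for-all idea described above can be sketched in a few lines: train one large “supernet” a single time, then carve out the best specialized subnetwork that fits each device’s speed budget, with no retraining per device. Everything below is an illustrative assumption, not MIT’s actual code: the configuration space, the toy latency and accuracy models, and the function names are all made up for the sketch.

```python
from itertools import product

# The supernet supports many architectural configurations (elastic depth,
# width, and kernel size). Each combination is one deployable subnetwork.
DEPTHS = [2, 3, 4]
WIDTHS = [0.5, 0.75, 1.0]   # fraction of channels kept
KERNELS = [3, 5, 7]

def estimated_latency_ms(depth, width, kernel):
    """Toy latency model: deeper, wider, bigger-kernel subnets run slower."""
    return depth * width * kernel * 2.0

def estimated_accuracy(depth, width, kernel):
    """Toy accuracy proxy: bigger subnets score higher."""
    return 50 + 10 * depth + 20 * width + kernel

def specialize(latency_budget_ms):
    """Pick the most accurate subnetwork that fits the device's budget.
    No retraining happens at this step; that is the energy saving."""
    best = None
    for d, w, k in product(DEPTHS, WIDTHS, KERNELS):
        if estimated_latency_ms(d, w, k) <= latency_budget_ms:
            candidate = (estimated_accuracy(d, w, k), d, w, k)
            if best is None or candidate > best:
                best = candidate
    return best

# One trained supernet serves very different hardware:
for device, budget in [("smart fridge", 12.0),
                       ("smartphone", 30.0),
                       ("server GPU", 100.0)]:
    acc, d, w, k = specialize(budget)
    print(f"{device}: depth={d}, width={w}, kernel={k} (est. acc {acc:.0f})")
```

The point of the sketch is the shape of the workflow: the expensive training happens once, and per-device specialization is just a cheap search over subnetwork configurations.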

“[Teaching] a neural network to perform machine translation… not only costs a lot of CO2 emissions, but, if you convert to U.S. dollars, it costs between $1 million and $3 million,” he says. “Both economically and environmentally this is pretty expensive and it’s not sustainable.”

The MIT system, called the “Once-For-All Network,” has attracted the attention of other computer scientists in the field for its energy efficiency. In February, it won first place in the object detection and image classification categories at the Low-Power Image Recognition Challenge, a competition sponsored by the Institute of Electrical and Electronics Engineers that aims to accelerate the development of energy-efficient A.I.

Yiran Chen, PhD, an associate professor of electrical and computer engineering at Duke University and a co-organizer of the challenge, says he’s enthusiastic about the approach.

“It’s certainly a very interesting idea,” Chen tells OneZero. “Just training [a network] once and then you’re able to deploy to the different devices. If it can really do this, it will save a lot of time and energy, that’s for sure.”

Next, the MIT researchers will hone the new system to run faster on mobile devices and expand its capabilities so it can train networks that run on microcontrollers and microprocessors.

A.I. development does have the potential to aid in a response to climate change. In a paper published in November 2019, an international team of researchers suggested 13 ways that A.I. can help people adapt to climate change and mitigate its impacts. For example, the researchers claimed A.I. can help us forecast supply and demand for renewable energies or plan large-scale carbon sequestration projects.

But, according to the Amherst study, A.I. is likely speeding up climate change overall by increasing our carbon footprint. The need for more energy-efficient methods only grows as the world creates more data, because certain types of A.I. are used to structure and categorize all of that information.

A 2018 white paper from International Data Corporation predicted that there will be 175 zettabytes of data in the world by 2025, about 175 times as many bytes as there are stars in the observable universe. All of that data needs to be processed, and all of that processing requires energy.
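The comparison above can be sanity-checked with a line of arithmetic. Note the star count used here is an assumption implied by the “175 times” figure, not a number from the white paper; published estimates of the number of stars vary by orders of magnitude.

```python
# A zettabyte is 10**21 bytes, so IDC's forecast is 175 * 10**21 bytes.
# The "175 times" comparison implies roughly 10**21 stars in the observable
# universe; that star count is an assumption, and estimates vary widely.
ZETTABYTE = 10**21                   # bytes
forecast_2025 = 175 * ZETTABYTE      # IDC's predicted global data volume
assumed_stars = 10**21

print(forecast_2025 // assumed_stars)  # -> 175
```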

“A larger amount of data means a larger amount of training that we need to do… and there could be a huge amount of resources required for that training,” Han says. “So, I think it’s very crucial that we look at the environmental sustainability perspective of A.I.”

Drew Costley is a Staff Writer at FutureHuman covering the environment, health, science and tech. Previously @ SFGate, East Bay Express, USA Today, etc.
