In a major validation of their commitment to responsible technology, Mila, BCG GAMMA, Haverford College, and Comet.ml today released CodeCarbon, an open source software package to estimate the location-dependent CO2 footprint of computing. AI can benefit society in many ways, but the energy needed to support the massive computing behind it can come at a high cost to the environment.
Jointly developed by Mila, a world leader in AI research based in Montreal; GAMMA, BCG’s global data science and AI team; Haverford College in Pennsylvania; and Comet.ml, a leading MLOps solution provider, CodeCarbon is a lightweight software package that integrates seamlessly into a Python codebase. It estimates the amount of carbon dioxide (CO2) produced by the computing resources used to execute the code, giving developers an incentive to make their code more efficient. It also advises developers on how they can reduce emissions by hosting their cloud infrastructure in regions that use lower-carbon energy sources.
Yoshua Bengio, Mila founder and Turing Award recipient, said of the software, “AI is a powerful technology and a force for good, but it’s important to be conscious of its growing environmental impact. The CodeCarbon project aims to do just that, and I hope that it will inspire the AI community to calculate, disclose, and reduce its carbon footprint.”
Sylvain Duranton, a managing director and senior partner at Boston Consulting Group (BCG) and global head of BCG GAMMA, said, “If recent history is any indicator, the use of computing in general, and AI computing in particular, will continue to expand exponentially around the world. As this happens, CodeCarbon can help organizations make sure their collective carbon footprint increases as little as possible.”
Why Organizations Need This Tool Now
Training a powerful machine-learning algorithm can require running multiple computing machines for days or weeks. The fine-tuning required to improve an algorithm by searching through different parameters can be especially intensive. For recent state-of-the-art architectures like VGG, BERT, and GPT-3, which have millions or even billions of parameters and are trained on multiple GPUs (graphics processing units) for several weeks, this can mean a difference of hundreds of kilograms of CO2eq.
Helping Organizations Live Up to Their Carbon Promises
The tracker records the amount of power being used by the underlying infrastructure, whether it is hosted by a major cloud provider or in a private on-premises data center. Drawing on publicly available data sources, it estimates the CO2 emissions produced by referring to the carbon intensity of the energy mix of the electric grid to which the hardware is connected. The tracker logs the estimated CO2 equivalent produced by each experiment and stores the emissions across projects and at an organizational level. This gives developers greater visibility into the emissions generated by training their models. A user-friendly dashboard makes those emissions tangible by expressing them in easily understood equivalents such as automobile miles driven, hours of TV watched, and daily energy consumed by an average US household.
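The estimate described above comes down to multiplying the energy a run consumes by the carbon intensity of the local grid. The following is a minimal Python sketch of that arithmetic, not CodeCarbon's own implementation. The `Run` record, the function names, the region labels, and the intensity values are all hypothetical placeholders chosen for illustration, not measured data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """One tracked experiment (hypothetical record, not CodeCarbon's schema)."""
    name: str
    avg_power_watts: float  # average power drawn by the hardware
    duration_hours: float   # wall-clock duration of the run


# Placeholder grid carbon intensities in kg CO2eq per kWh.
# Illustrative values only; real figures come from published grid data.
GRID_INTENSITY = {
    "region_hydro": 0.03,
    "region_mixed": 0.40,
    "region_coal": 0.80,
}


def energy_kwh(run: Run) -> float:
    """Energy consumed by the run: watts x hours, converted to kWh."""
    return run.avg_power_watts * run.duration_hours / 1000.0


def emissions_kg(run: Run, region: str) -> float:
    """Estimated emissions: energy (kWh) x grid intensity (kg CO2eq/kWh)."""
    return energy_kwh(run) * GRID_INTENSITY[region]


def lowest_carbon_region() -> str:
    """Region whose grid has the lowest carbon intensity in the table."""
    return min(GRID_INTENSITY, key=GRID_INTENSITY.get)


# The same hypothetical workload, compared across the placeholder regions.
run = Run("fine_tuning", avg_power_watts=2000.0, duration_hours=100.0)
for region in sorted(GRID_INTENSITY, key=GRID_INTENSITY.get):
    print(f"{region}: {emissions_kg(run, region):.1f} kg CO2eq")
```

Because the energy term is the same for every region, the comparison depends only on grid intensity, which is why the same workload can have a very different footprint depending on where it runs.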
Open Source and Community-Based
The ability to track CO2 emissions represents a significant step forward in developers’ ability to use energy resources wisely and, therefore, reduce the impact of their work on an increasingly fragile environment. The developers expect that CodeCarbon will also help introduce greater transparency into the developer community, enabling developers to measure and then report emissions created by an array of computing experiments.
Jonathan Wilson, Associate Professor of Environmental Studies at Haverford College, said, “Computing’s carbon footprint depends on where the computations are performed, how much power is consumed, and whether fossil fuels or low-carbon sources generate that electricity. CodeCarbon will show you where to run your code to minimize your carbon footprint.”
Niko Laskaris, data scientist, Comet.ml, said, “Our community needs to innovate more responsibly, and that starts with tracking and optimizing your model. With CodeCarbon, data scientists and teams can keep building great models, but with a new parameter: the carbon footprint of their work.”
The team creating this open source tool has also said they look forward to developers and researchers using it and contributing to it by enhancing it with new capabilities. To increase awareness of the environmental impact of computing, they strongly recommend that users report the CO2eq of their experiments in research papers, articles, and tech blogs.
The climate damage caused by greenhouse gas emissions is evident. The developers of CodeCarbon hope that a tool that measures the environmental impact of artificial intelligence computing will be one way to help reduce its carbon footprint.
More information about CodeCarbon can be found here.