NeuCLIR

Official website for the NeuCLIR track at TREC 2024.

Tracking Carbon Emissions for your NeuCLIR Submission

NeuCLIR aims to be the first carbon-neutral track at TREC. To that end, we encourage all participants to track and report the carbon emissions they generate while preparing their submissions.

We also encourage participants to buy offsets for the emissions they produced; it is more affordable than you think!

On this page, we provide resources that explain both how to track emissions and how to buy carbon offsets. For example, 2 weeks of compute on a p4d.24xlarge AWS instance in us-east-1 emits ~250 kg CO2 eq., which can be offset for as little as 4 USD by donating to the UN’s Efficient Cook Stove Programme in Kenya.

Background

Most aspects of submitting a run to a TREC track generate carbon emissions: from the resources used to train neural models, to the power required to transfer data over the internet, to the energy used by the device you are using right now to read this page! However, the energy used for model training and parameter search is typically responsible for the bulk of the carbon emissions generated (Strubell et al., 2020).

Carbon emissions created during model training depend primarily on three factors (Luccioni 2021):

  1. Type of energy used: depending on the region where the infrastructure used to train a model is located, emissions can vary by more than 30 times (Lacoste et al., 2019); this is due to the mix of sustainable and fossil-fuel-derived energy in each region. Models trained on devices that use renewable energy will have a smaller impact on the environment.
  2. How long a model is trained for: models that are trained for a longer time require more energy, thus leading to higher emissions.
  3. Hardware: different types of hardware accelerators (e.g., GPUs, TPUs) will have different power requirements. Recent architectures are typically more efficient, thanks to the inclusion of AI-specific components (e.g., NVIDIA’s Tensor Cores).

When possible, we recommend that participants choose the location and hardware with the lowest footprint. The calculator on this page can help assess the best option.
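To make the three factors above concrete, here is a minimal sketch of the usual back-of-the-envelope estimate: energy used (power times time) multiplied by the carbon intensity of the local grid. The function name `estimate_emissions_kg` and every number below (power draw, training time, regional intensities) are hypothetical values chosen for illustration; they are not measurements and not the method used by any particular calculator.

```python
def estimate_emissions_kg(
    power_watts: float, hours: float, intensity_g_per_kwh: float
) -> float:
    """Estimate emissions as energy used (kWh) times grid carbon intensity."""
    energy_kwh = power_watts / 1000 * hours
    # Convert grams of CO2 eq. to kilograms.
    return energy_kwh * intensity_g_per_kwh / 1000


# Hypothetical inputs, for illustration only.
POWER_WATTS = 300.0       # assumed average draw of one accelerator
TRAINING_HOURS = 100.0    # assumed total training time
REGIONS = {               # assumed grid intensities, in g CO2 eq. per kWh
    "low-carbon region": 30.0,
    "high-carbon region": 900.0,
}

for name, intensity in REGIONS.items():
    kg = estimate_emissions_kg(POWER_WATTS, TRAINING_HOURS, intensity)
    print(f"{name}: {kg:.1f} kg CO2 eq.")
```

With the same hardware and training time, the estimate scales linearly with the grid intensity, which is why the choice of region can matter as much as the choice of model.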

Tracking Emissions

We recommend that participants use CodeCarbon, a Python package designed to track emissions for machine learning experiments. It can be installed via:

pip install codecarbon

Once installed, the impact of an experiment can be estimated as follows:

from codecarbon import EmissionsTracker

tracker = EmissionsTracker(output_dir='co2')
tracker.start()

# Training loop and other gpu-intensive
# code goes here...

tracker.stop()

The code above will save a CSV file called emissions.csv in the directory co2. For more information about how to use CodeCarbon, please check out the official documentation.
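If you run several tracked experiments, you may want a single total to report. The sketch below sums one column of the CSV file using only the standard library. It assumes the file has a header row and that the per-run figure is stored in a column named `emissions`, in kg CO2 eq.; please check the header of your own file and the CodeCarbon documentation, and change the `column` argument if yours differs. The helper name `total_emissions_kg` is ours, not part of CodeCarbon.

```python
import csv
from pathlib import Path


def total_emissions_kg(csv_path: str, column: str = "emissions") -> float:
    """Sum one numeric column over all rows of a CSV file with a header row."""
    with open(csv_path, newline="") as handle:
        rows = csv.DictReader(handle)
        # Skip rows where the column is missing or empty.
        return sum(float(row[column]) for row in rows if row.get(column))


if __name__ == "__main__":
    # Path matches the output_dir used in the tracking snippet above.
    path = Path("co2") / "emissions.csv"
    if path.exists():
        print(f"Total: {total_emissions_kg(str(path)):.3f} kg CO2 eq.")
```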

In case experiments cannot be tracked with the package above, or participants have trouble setting up CodeCarbon, we recommend using this ML CO2 impact calculator.

Buying Carbon Offsets

Inspired by CILVR (2021), we recommend that participants consult the United Nations Carbon Offsets Platform to learn more about how to buy carbon offsets. This program highlights plenty of initiatives around the planet aimed at reducing carbon emissions; each initiative offers a donation rate per tonne of CO2 emitted. After each donation, an official certificate of voluntary cancellation is issued via email.
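Because rates are quoted per tonne while tracked emissions are usually reported in kilograms, a quick conversion helps when budgeting a donation. The sketch below is illustrative only: the helper `offset_cost_usd` is hypothetical, and the rate of 16 USD per tonne is simply what the example earlier on this page implies (4 USD for ~250 kg CO2 eq.), not an official or current price for any initiative.

```python
def offset_cost_usd(emissions_kg: float, usd_per_tonne: float) -> float:
    """Cost of offsetting emissions at a per-tonne donation rate."""
    return emissions_kg / 1000 * usd_per_tonne


# Figures taken from, or implied by, the example earlier on this page.
EXAMPLE_EMISSIONS_KG = 250.0
ASSUMED_USD_PER_TONNE = 16.0  # assumption: implied by 4 USD for 250 kg

cost = offset_cost_usd(EXAMPLE_EMISSIONS_KG, ASSUMED_USD_PER_TONNE)
print(f"Offsetting {EXAMPLE_EMISSIONS_KG:.0f} kg CO2 eq. costs about {cost:.2f} USD")
```

Replace the assumed rate with the one listed for the initiative you choose on the platform.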

References