CodeCarbon is a popular open-source Python library for measuring carbon emissions. One of the things I like about CodeCarbon is its dashboard. It estimates emissions by measuring the power consumption of the GPUs, CPUs, and RAM, then applying the carbon intensity of the electricity in your cloud provider's region, or in your country if you are running on a local machine or an on-premises cluster.
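To make the estimation idea concrete, here is a minimal sketch of the arithmetic: energy used (kWh) multiplied by the carbon intensity of the electricity (kg CO2eq per kWh). The function name `estimate_emissions_kg` and all the numbers are hypothetical, chosen only for illustration. This is not CodeCarbon's internal implementation, which measures power on the actual hardware.

```python
from typing import Dict


def estimate_emissions_kg(power_watts: Dict[str, float],
                          hours: float,
                          intensity_kg_per_kwh: float) -> float:
    """Estimate emissions as energy (kWh) times carbon intensity (kg CO2eq/kWh)."""
    total_watts = sum(power_watts.values())       # combined draw of all components
    energy_kwh = total_watts * hours / 1000.0     # watt-hours converted to kWh
    return energy_kwh * intensity_kg_per_kwh


# Illustrative values only (assumptions, not measurements)
power = {"gpu": 250.0, "cpu": 100.0, "ram": 50.0}
emissions = estimate_emissions_kg(power, hours=2.0, intensity_kg_per_kwh=0.4)
print(f"{emissions:.3f} kg CO2eq")
```

With these assumed numbers, 400 W for 2 hours is 0.8 kWh, which at 0.4 kg CO2eq per kWh gives roughly 0.32 kg CO2eq. The same workload in a region with cleaner electricity would give a lower figure, which is why the regional carbon intensity matters.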
A typical ML workflow involves uploading a dataset, splitting it into training and test data, training a model, and evaluating its accuracy. So what if I want to see the carbon emissions of these tasks? The following lines of code show how. I have uploaded the code here. I ran the code on Watson Studio, but the same approach works in any environment. I am using the German credit risk dataset to train an ML model based on scikit-learn's SGDClassifier.
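# import codecarbon library to calculate emissions
from codecarbon import EmissionsTracker
from codecarbon import OfflineEmissionsTracker
# imports needed by the ML code below
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder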
# initiate tracker
tracker = OfflineEmissionsTracker(country_iso_code="USA")
tracker.start()
# ML code (data_df is the credit risk dataset, already loaded as a pandas DataFrame)
train_data, test_data = train_test_split(data_df, test_size=0.2)
features_idx = np.s_[0:-1]
all_records_idx = np.s_[:]
first_record_idx = np.s_[0]
string_fields = [type(fld) is str for fld in train_data.iloc[first_record_idx, features_idx]]
ct = ColumnTransformer([("ohe", OneHotEncoder(), list(np.array(train_data.columns)[features_idx][string_fields]))])
# loss='log' was renamed to 'log_loss' in scikit-learn 1.1 and removed in 1.3
clf_linear = SGDClassifier(loss='log_loss', penalty='l2', max_iter=1000, tol=1e-5)
pipeline_linear = Pipeline([('ct', ct), ('clf_linear', clf_linear)])
risk_model = pipeline_linear.fit(train_data.drop('Risk', axis=1), train_data.Risk)
# stop the tracker and print emissions (in kg of CO2-equivalent)
emissions: float = tracker.stop()
print(emissions)
Once I ran the above code, here's the output.
1. Start tracker
A tracker is an object that must be created and started before emissions are tracked. All code that runs after the tracker starts, until it is stopped, is tracked for emissions. The following output shows a tracker being initiated with the required parameters. You create it with the EmissionsTracker class for online mode, or the OfflineEmissionsTracker class for offline mode (when not connected to the internet). Offline mode can also be used if you want to supply parameters manually and calculate carbon emissions using a specific region's data.
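The choice between the two modes can be sketched as a small helper. The names `tracker_choice` and `make_tracker` are hypothetical and not part of CodeCarbon. Only the `EmissionsTracker` and `OfflineEmissionsTracker` classes and the `country_iso_code` parameter come from the code above. Requiring a country code in offline mode is an assumption of this sketch.

```python
from typing import Any, Dict, Optional, Tuple


def tracker_choice(online: bool,
                   country_iso_code: Optional[str] = None) -> Tuple[str, Dict[str, Any]]:
    """Return the codecarbon class name and keyword arguments for the chosen mode."""
    if online:
        # Online mode: no region is passed in this sketch
        return "EmissionsTracker", {}
    if not country_iso_code:
        # Assumption of this sketch: offline mode is given an explicit country
        raise ValueError("offline mode needs a 3-letter ISO country code, e.g. 'USA'")
    return "OfflineEmissionsTracker", {"country_iso_code": country_iso_code}


def make_tracker(online: bool, country_iso_code: Optional[str] = None):
    """Build the tracker; codecarbon is imported only when this function is called."""
    import codecarbon
    class_name, kwargs = tracker_choice(online, country_iso_code)
    return getattr(codecarbon, class_name)(**kwargs)


# Typical use, mirroring the code above (requires codecarbon to be installed):
#   tracker = make_tracker(online=False, country_iso_code="USA")
#   tracker.start()
#   ...  # code whose emissions you want to track
#   emissions = tracker.stop()
```

Keeping the mode decision in one place means the rest of the notebook only ever calls `start()` and `stop()`, whichever tracker class is in use.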