
Executing a rollback

Instructions for rolling back a code change

During an incident, we may want to roll the node software back to the previous release. If the issue is not urgent (the nodes are still generally functional), it is recommended to fix forward rather than roll back, because of the caveat listed at the bottom of this document. If you really do need to roll back, these are the steps to take.

  1. Find the sha256 digest of the previous release's Docker image. It is recorded in the logs of previous GitHub Actions runs: for dev rollbacks, look at a run on the dev branch; for production rollbacks, look at a run on the main branch. The log line appears in the Deploy step of the "Push to Registry And Deploy" action and will look something like: Successfully pushed xmtp/node-go@sha256:e1f10d98411141b173ef7a5b41e2a7b16a8ab95eda74b559bea2d2a43e477fdf
  2. Go to the Variables page of the Terraform Workspace for the environment you are trying to roll back (either dev or production)
  3. Edit the xmtp_node_image variable and paste the full image name from step 1
  4. Click the Actions button and select Start new run
  5. Go to the Runs tab of the Workspace and approve the run
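
If you have saved the Deploy step's log to a file or copied it to the clipboard, a small script can pull out the image reference for step 3 and avoid copy errors in the long digest. The following is only a sketch: it assumes the log line has exactly the "Successfully pushed <image>@sha256:<digest>" shape shown in step 1, and the function name extract_image is hypothetical, not part of any existing tooling.

```python
import re
from typing import Optional

# Assumption: the Deploy step logs a line shaped like the example in step 1,
# i.e. "Successfully pushed <repo>@sha256:<64 hex characters>".
IMAGE_RE = re.compile(r"Successfully pushed (\S+@sha256:[0-9a-f]{64})")


def extract_image(log_text: str) -> Optional[str]:
    """Return the last pushed image reference found in log_text, or None.

    Hypothetical helper: the last match is used in case the log contains
    more than one push line.
    """
    matches = IMAGE_RE.findall(log_text)
    return matches[-1] if matches else None


if __name__ == "__main__":
    sample = (
        "Successfully pushed xmtp/node-go@sha256:"
        "e1f10d98411141b173ef7a5b41e2a7b16a8ab95eda74b559bea2d2a43e477fdf"
    )
    # Prints the value to paste into the xmtp_node_image variable.
    print(extract_image(sample))
```

The regular expression requires a full 64-character hex digest, so a truncated or mangled line yields None instead of a partial image name.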

Instructions for rolling back an Infrastructure change

A change to the infrastructure repo may also need to be rolled back. These issues must be fixed forward: either revert the last PR or create a new patch PR, merge it to main, and approve the run in Terraform Cloud.

Caveats

  • Executing a rollback through Terraform will not freeze future deploys. If another PR is later merged to main in xmtp-node-go, it will overwrite your rollback and push whatever code is in main back to the deployed environment. Because of this, any rollback will likely need to be paired with a revert PR, or the problem will return with the next deploy.