Auto-deploy / self-healing mechanism for whitby (b/109 AI)

Opened by tazjin at 2021-04-09T19·02+00

For auto-deploy, I propose something like:

For emergencies / development workflows:

  1. tazjin updated the body of this issue at 2021-04-09T19·02+00
  2. I think we should also have some sort of post deploy smoke tests that trigger an automatic rollback - could start out with something as simple as failed systemd units and go from there

    grfn at 2021-05-22T18·05+00

  3. I'm going to start working on this

    grfn at 2021-05-23T15·29+00

  4. Start of a script is up at cl/3145, pausing there for review

    grfn at 2021-05-23T16·58+00