nix: build user fix leads to occasional deadlock of some kind

#185
Opened by tazjin at 2022-06-24T11·17+00

In our backported build user fix (i.e. wait for build users instead of hard-failing), we see occasional deadlocks in situations where the daemon gets many simultaneous builds from different clients.

This happens in CI, for example on channel bumps.
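
For illustration, here is a minimal sketch of the general "wait for a build user instead of hard-failing" pattern, with a bounded wait so that a stuck waiter eventually gives up. This is not the Nix daemon's code (Nix is C++ and manages build users differently). The `BuildUserPool` class, its methods, and the user names are hypothetical and only show the shape of the idea.

```python
import threading
import time


class BuildUserPool:
    """Hypothetical fixed pool of build users (not Nix's real data structure)."""

    def __init__(self, users):
        self._free = list(users)
        self._cond = threading.Condition()

    def acquire(self, timeout=None):
        """Wait for a free build user.

        Returns the user name, or None if `timeout` seconds pass first.
        With timeout=None this waits indefinitely.
        """
        deadline = None if timeout is None else time.monotonic() + timeout
        with self._cond:
            # Loop to guard against spurious wakeups and lost races.
            while not self._free:
                remaining = None if deadline is None else deadline - time.monotonic()
                if remaining is not None and remaining <= 0:
                    return None
                self._cond.wait(remaining)
            return self._free.pop()

    def release(self, user):
        """Return a build user to the pool and wake one waiter."""
        with self._cond:
            self._free.append(user)
            self._cond.notify()
```

A caller that gets `None` back can fail the build with a clear error instead of hanging, which would at least turn a silent stall into a visible failure.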

Next steps:

[1:24:56 pm] <sterni> tazjin: like left better i think
[1:27:04 pm] <sterni> tazjin: btw worrying this build got stuck for ever https://buildkite.com/tvl/depot/builds/14234#01818804-e8d5-40a7-9d1f-a14b357e0906
[1:27:10 pm] <sterni> not sure if nix or some inner build
[1:27:13 pm] <sterni> we'll find out I guess
[1:34:50 pm] <tazjin> sterni: that's a bug in the waiting for build users thing
[1:35:02 pm] <tazjin> sterni: some internal mutex ends up poisoned somehow
[1:35:15 pm] <sterni> 🤠👍
[1:35:17 pm] <tazjin> it's actually the same thing that happens in Nix >=2.4, so we backported that :p 
[1:35:43 pm] <sterni> is it fixed in some higher nix version?
[1:35:46 pm] <tazjin> sterni: I was thinking we should probably have a default timeout for all our targets that can be overridden
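
A rough sketch of what "a default timeout for all our targets that can be overridden" could look like on the CI side. Everything here is an assumption made for illustration: the one-hour default, the `overrides` mapping, and the function names are not taken from depot or Buildkite.

```python
import subprocess

# Assumed default; the issue does not propose a concrete value.
DEFAULT_TIMEOUT_SECONDS = 3600


def effective_timeout(target, overrides):
    """Return the per-target override if present, else the default."""
    return overrides.get(target, DEFAULT_TIMEOUT_SECONDS)


def run_target(target, command, overrides):
    """Run `command` for `target`.

    Returns True on exit code 0, False on a non-zero exit or on timeout.
    subprocess.run kills the child process when the timeout expires.
    """
    try:
        result = subprocess.run(command, timeout=effective_timeout(target, overrides))
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0
```

With something like this in place, a build that hangs on the build user wait would be cut off once its timeout expires instead of staying stuck forever.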