tvix/store/grpc: gRPC backend panics due to nested async runtimes

#304
Opened by cbrewster at 2023-09-10T18·45+00

If you use a gRPC server as the backend for the blob service when running the daemon, blob service requests cause a panic, because block_on is called on a thread that is already being used by an async runtime.

STR:

thread 'tokio-runtime-worker' panicked at 'Cannot start a runtime from within a runtime. This happens because a function (like `block_on`) attempted to block the current thread while the thread is being used to drive asynchronous tasks.', store/src/blobservice/grpc.rs:141:33

An inelegant solution, but one that gets it working with minimal changes, is to wrap all these calls in block_in_place, which forces all other async tasks to move to a different thread. This then allows the use of block_on.

Flokli mentioned today that switching the store service traits to async may be feasible, since it's more isolated from the eval system now, which would make this a non-issue.

  1. cl/9293

    cbrewster at 2023-09-10T19·08+00

  2. Decided to take this in a different direction and use spawn_blocking consistently when calling (potentially slow) sync code from an async context. This not only fixes this issue but also makes the async runtime's performance more predictable, since slow sync calls will no longer lock up some of the runtime's threads.

    cl/9294

    cbrewster at 2023-09-10T21·40+00

  3. I opened b/306 regarding the async trait situation.

    flokli at 2023-09-11T15·31+00

  4. This should be covered with https://cl.tvl.fyi/c/depot/+/9329.

    flokli at 2023-09-15T14·29+00

  5. flokli closed this issue at 2023-09-18T14·30+00