Recover from lost GPU device

I am attempting to get my web browser to gracefully handle and recover from a “GPU connection lost” error. I have been trying various implementations similar to the following, but have not yet landed on something that works.

const device = tf.backend().device;
device.lost.then(async (info) => {
  console.error(`WebGPU device was lost: ${info.message}`, info);
  device.destroy();
  tf.disposeVariables();
  // Attempt to force a fresh copy of the tfjs module to load.
  tf = null;
  await import(`${Math.random()}`);
  await tf.setBackend("webgpu");
  // Renamed so it does not shadow the outer `device` used above.
  const newDevice = tf.backend().device;
});

The issue is that, no matter what I do, the device that is returned appears to be disconnected and is immediately “lost”. I suspect the browser cache may be preventing me from re-initializing the tfjs module. Is this the right approach? Is there some undocumented way to reconnect an existing tfjs instance to the GPU? I am hoping somebody out there might be able to point me in the right direction.

I am using TensorFlow.js with the WebGPU backend in the Chrome browser.

This is an interesting question. If the browser itself has bailed out, for example because it ran out of memory, then my thought is that a page refresh may be the only way to recover?
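Short of a full refresh, one thing that may be worth trying is to request a brand-new adapter and a brand-new device after each loss, rather than reusing anything tied to the old one. As I understand the WebGPU spec, an adapter that has already produced a device (or has expired) may either fail or hand back a device that is already lost, which would at least match the symptom described. Below is a minimal sketch of that idea at the plain WebGPU level. `watchDevice`, `onDevice`, and `retries` are hypothetical names, and what happens inside `onDevice` (for example, rebuilding the tfjs backend on the new device) is left open, since I am not sure tfjs supports that.

```javascript
// Sketch only: acquire a fresh adapter + device, and do it again if the device is lost.
// `gpu` defaults to navigator.gpu; it is a parameter so it can be swapped out for testing.
// `onDevice` is a hypothetical callback where the app would rebuild its GPU state.
async function watchDevice(onDevice, gpu = globalThis.navigator?.gpu, retries = 3) {
  if (!gpu) throw new Error("WebGPU is not available");

  // Always start from a new adapter instead of reusing the old one.
  const adapter = await gpu.requestAdapter();
  if (!adapter) throw new Error("No GPU adapter available");
  const device = await adapter.requestDevice();

  device.lost.then((info) => {
    console.error(`WebGPU device was lost: ${info.message}`);
    // reason "destroyed" means we called device.destroy() ourselves, so do not retry.
    // `retries` is a crude cap so a permanently broken GPU does not loop forever.
    if (info.reason !== "destroyed" && retries > 0) {
      watchDevice(onDevice, gpu, retries - 1).catch(console.error);
    }
  });

  await onDevice(device);
  return device;
}
```

In a page this would be called as `watchDevice(async (device) => { /* rebuild GPU state here */ })`. If the new device is also lost immediately, that would suggest the browser's GPU process has not recovered and a refresh really is needed.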

Thanks for the suggestion. I am hoping to avoid the refresh. But if it is unavoidable, I am thinking I could try loading the tfjs instance into a worker thread, where I could refresh just that part of the app and keep everything else loaded. However, I don’t know enough about this to be sure that it is going to work.
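To make the worker idea concrete, here is a rough sketch of the page-side supervisor I have in mind: it owns the worker, and throws the whole worker away and starts a new one when the worker reports a lost device. Everything here is an assumption on my part: `superviseWorker` and `createWorker` are hypothetical names, the `"device-lost"` message type is a convention the worker script would have to implement itself (for example by posting it from its own `device.lost` handler), and I have not confirmed that a fresh worker actually gets a working GPU device after a loss.

```javascript
// Sketch only: run tfjs inside a worker and replace the whole worker on device loss.
// `createWorker` is a hypothetical factory, e.g.
//   () => new Worker("tf-worker.js", { type: "module" })
// The { type: "device-lost" } message is an assumed convention, not a tfjs feature.
function superviseWorker(createWorker, onMessage, maxRestarts = 3) {
  let worker = null;
  let restarts = 0;

  const start = () => {
    worker = createWorker();
    worker.onmessage = (event) => {
      if (event.data?.type === "device-lost") {
        // Discard the worker, and with it the tfjs instance it loaded.
        worker.terminate();
        if (restarts < maxRestarts) {
          restarts += 1;
          start();
        } else {
          worker = null; // give up; the page may need a real refresh
        }
      } else {
        onMessage(event.data);
      }
    };
  };

  start();
  return {
    // Messages are dropped silently once the supervisor has given up.
    post: (message) => worker?.postMessage(message),
    restarts: () => restarts,
  };
}
```

The rest of the app would only ever talk to the object returned by `superviseWorker`, so it would not need to know that the worker behind it had been replaced. Any model weights or tensors held by the old worker would be gone and would need to be reloaded.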

Understood. One of the joys of working at the forefront of technology is that you may be the first one to address such issues. In the past, out-of-memory issues were less common, but as more folks try to bring LLMs and diffusion models to the browser, they are finding new limits and need to check available memory before trying things. I would love to hear how you get on with this one, and whether you end up making a generalizable library to help with such issues within the Web AI community.