TensorFlow plugin vs XLA backend

Hi,

I have been working on a PluggableDevice, following this tutorial to integrate a new accelerator.
I just discovered the entry for developing an XLA backend in the TensorFlow documentation, and I am now a bit confused, as these seem to be two different ways to integrate a new device.

What are the different use cases?
What method is recommended/should be privileged?

Can anyone explain to me the difference?

Below are the replies to my post:

One important difference is that the TensorFlow device plugin is tied to TF only, while XLA/OpenXLA is multi-frontend (JAX, TF, PyTorch, …).


Thank you @Bhack for the quick answer!

What are the different use cases?

I can give more details:

  1. Framework coverage: As @Bhack said, PluggableDevice is a TF-specific solution, while OpenXLA supports multiple frontends: TF, JAX, PyTorch, and possibly more.
  2. Modularity:
    • PluggableDevice is modular, meaning that you can work in your own repository and release your own plugin package that works with the standard TF package.
      • For example, users call pip install my-tf-plug-in on top of normal pip install tensorflow.
      • See more details on advantages of modularity in the PluggableDevice blog post
    • XLA currently doesn’t have C API for modular backends yet, so you will have to add code to XLA, and build and release your own “framework+XLA” binaries.
      • For example, users will have to call pip install tensorflow-with-my-xla-backend instead of the standard pip install tensorflow.
      • You might also need to ask PyPI to increase the standard package limit size for you since TensorFlow’s binary size is already over the standard limit by itself. (Your custom TF package would have the size of TF + your device backend.)
      • Device C API for XLA could be a discussion topic in SIG OpenXLA in the future.
  3. Op fusion: XLA can support more aggressive op fusion compared to PluggableDevice.
    • Vanilla TensorFlow performs only limited op fusion (fused matmul and conv) in the Grappler remapper pass.
    • PluggableDevice could support new fusion patterns through custom graph rewrite pass and custom op kernels, but it would be hard to match XLA’s subgraph clustering.
  4. Op coverage:
    • You may have to add more C API for PluggableDevice to support ops with DT_VARIANT data type, e.g., Variable, TensorList, AddNVariant & ZerosLikeVariant ops, etc.
    • You don’t have this problem with XLA because it will be in the same build.
  5. Development status:
    • PluggableDevice is in semi-maintenance mode: We only add new features when necessary (e.g., to expand support for important ops).
    • OpenXLA has been under active development. XLA will eventually be moved out of TensorFlow to its own repository, but we will try to minimize disruptions caused by the move.
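To make the op-fusion point above concrete, here is a minimal sketch (not from the thread; the function name is hypothetical, and it assumes a recent TensorFlow build with XLA available) showing how a user opts a function into XLA compilation. The multiply and add below are candidates for XLA to fuse into a single kernel, whereas vanilla TF would dispatch them as separate ops:

```python
# Hedged sketch: opting into XLA compilation with jit_compile=True.
# Assumes TensorFlow >= 2.5 with XLA support for the target device.
import tensorflow as tf

@tf.function(jit_compile=True)  # compile this function with XLA
def scale_and_shift(x, w, b):
    # Elementwise multiply + add: XLA can fuse these into one kernel.
    return x * w + b

x = tf.constant([1.0, 2.0])
w = tf.constant([2.0, 2.0])
b = tf.constant([3.0, 3.0])
out = scale_and_shift(x, w, b)  # tf.Tensor([5.0, 7.0])
```

A PluggableDevice could reach a similar result for this specific pattern via a custom graph rewrite pass plus a fused kernel, but XLA discovers such fusions automatically over whole clustered subgraphs.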

What method is recommended/should be privileged?

It depends on the trade-offs listed above. If your goal is to support multiple frameworks in one go, I think an XLA backend would be the best choice. (JAX doesn’t have custom kernels and only uses XLA.)
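As a rough illustration of the JAX point (not from the thread; assumes JAX is installed): JAX traces Python functions and lowers them straight to XLA, so a new XLA backend is picked up by JAX with no separate kernel-registration path. The function name here is hypothetical:

```python
# Hedged sketch: everything under jax.jit goes through XLA, so JAX
# runs on whatever XLA backend is available (CPU, GPU, TPU, ...).
import jax
import jax.numpy as jnp

@jax.jit  # compiled via XLA for the active backend
def scale_and_shift(x):
    return x * 2.0 + 3.0

out = scale_and_shift(jnp.array([1.0, 2.0]))  # [5.0, 7.0]
```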


@penporn I suppose a third theoretical alternative is to contribute another backend to the new TF runtime, but I don’t think we are ready to accept contributions in this repository:


@Bhack The TFRT backend support is still experimental/tentative. We recommend adding support for a new device using PluggableDevice or by building out an XLA backend. Both approaches will have better long-term support from the TF ecosystem. If you’re comfortable with compiler tooling, XLA has the advantage that it can also integrate with other frameworks such as JAX & PyTorch.


Thank you for the detailed answer.

Is it possible to get a rough time estimate yet for OpenXLA moving out of TensorFlow, and for XLA gaining modular backends?