Deformable convolution and other custom ops

Recently we had a refresh of a Deformable Convolution WIP PR in Addons.

I’ve cherry-picked this as an example because it requires us to maintain almost 3k lines of new code in the repository.

This maintainership overhead is quite similar to what we have with other custom-kernel PRs.

As Addons is one of the few Ecosystem repositories to support custom (C++) ops and the related CI infrastructure, it is quite normal that we receive this kind of PR.

But since the code ownership of these components is generally not stable over time, we would prefer, where possible, not to merge these custom-op PRs, also because we want broader hardware coverage.

What are the alternatives? How could we collaborate when a compositional implementation has huge performance gaps?
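To make the gap concrete, here is a minimal sketch of the bilinear-sampling core that a compositional deformable convolution needs (my own names and shapes, not the code from the PR). Every sampled point costs four `tf.gather_nd` lookups, and that scattered memory traffic is roughly where the pure-TF form loses to a fused CUDA kernel:

```python
import tensorflow as tf

def bilinear_sample(feature_map, coords):
    """Sample a [H, W, C] feature map at fractional (y, x) locations.

    coords is a [N, 2] float tensor; returns a [N, C] tensor.
    """
    h = tf.shape(feature_map)[0]
    w = tf.shape(feature_map)[1]
    y, x = coords[:, 0], coords[:, 1]
    y0, x0 = tf.floor(y), tf.floor(x)
    y1, x1 = y0 + 1.0, x0 + 1.0

    def gather(yy, xx):
        # Clip to the valid index range, then do a scattered lookup.
        yy = tf.clip_by_value(tf.cast(yy, tf.int32), 0, h - 1)
        xx = tf.clip_by_value(tf.cast(xx, tf.int32), 0, w - 1)
        return tf.gather_nd(feature_map, tf.stack([yy, xx], axis=-1))

    # Interpolation weights, broadcast over the channel axis.
    wy1, wx1 = (y - y0)[:, None], (x - x0)[:, None]
    wy0, wx0 = 1.0 - wy1, 1.0 - wx1

    return (wy0 * wx0 * gather(y0, x0) + wy0 * wx1 * gather(y0, x1) +
            wy1 * wx0 * gather(y1, x0) + wy1 * wx1 * gather(y1, x1))
```

A deformable convolution then predicts per-position offsets with a regular convolution, runs this sampling for every kernel tap, and finishes with a matmul; the custom op fuses all of that into one kernel.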

Often this kind of issue is shared across the “extend” ecosystem, e.g. for EmbeddingBag:

EmbeddingBag op and layer by Rocketknight1 · Pull Request #2352 · tensorflow/addons · GitHub (1k lines)
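For reference, the compositional form of EmbeddingBag is tiny: roughly one gather plus one reduction. A sketch with my own names (this is not the PR code):

```python
import tensorflow as tf

def embedding_bag(params, ids, combiner="sum"):
    """EmbeddingBag as a composition of existing ops.

    params: [vocab, dim] embedding table; ids: [batch, bag_size] indices.
    """
    gathered = tf.gather(params, ids)           # [batch, bag_size, dim]
    if combiner == "sum":
        return tf.reduce_sum(gathered, axis=1)  # [batch, dim]
    return tf.reduce_mean(gathered, axis=1)
```

The 1k-line custom kernel is attractive mainly because this composition materializes the whole [batch, bag_size, dim] intermediate, while a fused kernel can accumulate each bag in place.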

Thanks,
Stefano


@kristen Is the MLIR team registered on this Discourse instance, or are they only on the LLVM MLIR Discourse instance?

Because generally we don’t have TF-specific threads in the LLVM MLIR instance.


They have been invited here too


OK, I’ve cross-posted in the LLVM MLIR forum instance.

I hope that at least some TF-MLIR team members are subscribed to their tags and subcategory there.

/cc @Jacques_Pienaar let me know if you want to move this to another category, or if you want to use only the XLA tag.

Hey Stefano,

Here is fine, thanks (all necessary tags). I’m pinging a couple of folks who have been looking at interfacing/third-party backends, as I don’t think they’ve seen this yet.

Best,

Jacques

[I’ll speculate based on previous conversations while we wait]

One of the parts we have discussed is “keeping” multiple levels of abstraction around, enabling backends to hook/match at the appropriate level to enable the “mega” op while exposing the decomposed forms where there is no support. It is also true that the compositional representation has been too rigid and hasn’t composed as well (“just rewrite your computation as convolutions if you want performance” being, in effect, the indirect suggestion) and should be revised (which is happening, albeit slowly). These are great examples to highlight: a common problem is that folks find a case where the compositional form does poorly, special-case a transformation, and then move on; without such overarching examples it is easy to miss that the underlying problem isn’t being addressed.
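One existing TF mechanism in this direction is the `experimental_implements` attribute on `tf.function`: the body carries the decomposed, portable form, while the attribute names the higher-level op, so a backend that recognizes it can match the whole function and substitute a fused kernel. A sketch (the attribute string here is illustrative, not a registered name):

```python
import tensorflow as tf

# The body is the decomposed form every backend can lower; a backend that
# knows "addons.embedding_bag_sum" can match the whole function and swap
# in a fused implementation instead.
@tf.function(experimental_implements="addons.embedding_bag_sum")
def embedding_bag_sum(params, ids):
    return tf.reduce_sum(tf.gather(params, ids), axis=1)
```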


IMHO this is exactly the point.
And I think it is why some specific reusable components (keras-nlp, keras-cv, tf-addons) that serve e2e models, including our selected models in Model Garden, could be one of the drivers for understanding what we expect from the compiler stack.

Just take a look at our current threshold in TF Addons:
we require more than 50 citations on the paper behind a proposed feature, so it is never something totally brand new.

If we need a custom C++ op to reach good-enough performance for a new layer, but then the code owner disappears after one or two months, or people ask to use it in Colab or on Google Cloud TPUs (where custom kernels are not available), isn’t it better to bring these use cases directly to the compiler-stack team? That way we can work out how to handle our end-to-end performance requests, and properly evaluate an alternative to maintaining a large custom op with only partial hardware coverage.

Just my 2¢

We can see the same in Keras, now that it is again a Python-only repo.

@Jacques_Pienaar Any news? I would like to keep this thread alive :wink:

/cc @yarri-oss @thea

Not yet (I have a meeting soon that is semi-relevant, but higher level and a couple next week where I could raise it again). There are a few efforts I’m aware of, but they are at various stages.

I do like driving these with specific components. I would also ideally have it be such that the compiler team need not be a bottleneck here, as that doesn’t scale either. And I believe separable convolutions have been on your list for a long time :slight_smile:
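Separable convolutions are actually a handy illustration, since Keras already ships both the decomposed and the “mega” form. A quick sketch (independent random weights, so only the shapes line up):

```python
import tensorflow as tf

# Decomposed form: depthwise filtering, then a 1x1 pointwise mix.
depthwise = tf.keras.layers.DepthwiseConv2D(kernel_size=3, padding="same")
pointwise = tf.keras.layers.Conv2D(filters=64, kernel_size=1)

# The "mega" form a backend could hook/match instead.
fused = tf.keras.layers.SeparableConv2D(filters=64, kernel_size=3, padding="same")

x = tf.random.normal([1, 32, 32, 16])
assert pointwise(depthwise(x)).shape == fused(x).shape
```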


Thank you; help me keep this thread alive.

Just a keep-alive message for this thread.

Can we find someone on the TF or MLIR team who can give us some feedback, a roadmap, or just a rough outlook on this topic?

Thanks