Dummy Pluggable Device/API Reference for implementing OpenCL backend

I want to add OpenCL support to TF.

I have a working set of DL operators (cuDNN/MIOpen-like) and a mini-framework with relatively good performance: GitHub - artyom-beilis/dlprimitives: Deep Learning Primitives and Mini-Framework for OpenCL

It achieves performance comparable to TF/CUDA (about 75% for training and 90% for inference) and vastly outperforms existing OpenCL solutions such as PlaidML and Caffe-OpenCL: DLPrimitives Blog

I discovered that there is a mechanism called PluggableDevice. However, I'm looking for an API reference, and ideally some basic dummy device I could start from (as you had written).

Is there anything like that?