Suggestions regarding a `tf.data` pipeline

I am currently using the RandAugment class from tf-models (from official.vision.beta.ops import augment). RandAugment().distort(), however, does not accept batched inputs, and it is computationally expensive as well (especially with more than two augmentation operations).

So, following suggestions from this guide, I wanted to be able to map RandAugment().distort() after my dataset is batched. Any workaround for that?

Here’s how I am building my input pipeline for now:

import tensorflow as tf
from official.vision.beta.ops import augment

AUTO = tf.data.AUTOTUNE

# The RandAugment paper recommends num_layers (N) = 2, magnitude (M) = 9
augmenter = augment.RandAugment(num_layers=3, magnitude=10)

dataset = load_dataset(filenames)
dataset = dataset.shuffle(batch_size * 10)
# distort() expects a single image, so it has to be mapped before batching
dataset = dataset.map(augmenter.distort, num_parallel_calls=AUTO)
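One workaround I can think of (a sketch I have not benchmarked, reusing augmenter, AUTO, batch_size, and load_dataset from above, and assuming distort() takes a single [height, width, channels] image tensor): batch first, then fan the batch back out per image with tf.map_fn inside the mapped function:

def batched_distort(images):
    # tf.map_fn applies the per-image distort() along the batch dimension.
    # It still runs once per image, so this keeps the batch -> map ordering
    # but does not actually vectorize the augmentation ops.
    return tf.map_fn(augmenter.distort, images)

dataset = load_dataset(filenames)
dataset = dataset.shuffle(batch_size * 10)
dataset = dataset.batch(batch_size)
dataset = dataset.map(batched_distort, num_parallel_calls=AUTO)

tf.vectorized_map might be faster if every op inside distort() can be vectorized, but the random branching inside RandAugment often cannot, so that would need to be verified.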

Yes, part of the issue is that we also seem to have duplicated ops: cutout, for example, is unbatched in the official.vision namespace but batched in TFA.
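To illustrate the duplication (a sketch: tfa.image.random_cutout is the batched TFA op, while the per-image augment.cutout signature is assumed from the original AutoAugment reference code, so double-check it):

import tensorflow as tf
import tensorflow_addons as tfa
from official.vision.beta.ops import augment

image = tf.zeros([224, 224, 3], dtype=tf.uint8)      # single image
images = tf.zeros([8, 224, 224, 3], dtype=tf.uint8)  # batch of images

# Per-image cutout in tf-models (signature assumed: cutout(image, pad_size, replace))
single = augment.cutout(image, pad_size=16, replace=0)

# Batched cutout in TensorFlow Addons; mask_size values must be even
batched = tfa.image.random_cutout(images, mask_size=(32, 32), constant_values=0)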

These are the origins of the current status:


So, currently, no workaround, right?


My opinion is that we just need to decide how we want to standardize our image-processing ops across the ecosystem. I think these duplicates are going to create confusion.
