Apparent incompatibility between tf.data and tf.feature_column

Hello!

I’m learning about neural networks from the book “Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow” and the official documentation. I’m currently reading about tf.data and tf.feature_column, which are used to load and preprocess large amounts of data. The guide “tf.data: Build TensorFlow input pipelines  |  TensorFlow Core” says I can construct a Dataset from data stored in memory or in one or more files, using tf.data.TextLineDataset(), for example. Once it is created, I can iterate over it and I see a list of tensors. OK.

Now, reading “tf.feature_column.numeric_column  |  TensorFlow Core v2.8.0”, I see that one of the arguments is “key”, and the example uses a dict like: data = {'a': [15, 9, 17, 19, 21, 18, 25, 30], 'b': [5.0, 6.4, 10.5, 13.6, 15.7, 19.9, 20.3, 0.0]}. This way I can call the function like: a = tf.feature_column.numeric_column('a').

As you can see, in the official example the values of the dict are lists of numbers, not tensors, whereas when I create a Dataset I get a list of tensors. My question is: is there an official way (package, class, function) to transform the items of a Dataset into a dict so I can use numeric_column() or categorical_column_with_identity(key), or do I have to do that myself?

Apparently there is an incompatibility between them, or maybe they are not supposed to work together. I don’t know. As I said, I’m still learning.

Hi @Antonio_Caipora,

Welcome to the TensorFlow Team!

The tf.feature_column.numeric_column API is deprecated and should no longer be used. Please use the Keras API tf.keras.utils.FeatureSpace instead; it offers a range of feature types and handles the dict mapping on a dataset as required.
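To make this concrete, here is a minimal sketch rather than an official recipe. It assumes a TensorFlow release that ships tf.keras.utils.FeatureSpace (2.12 or later, to my knowledge). It shows that a Dataset can yield dict elements directly, that lines from a TextLineDataset can be mapped into dicts, and that FeatureSpace then consumes those dicts. The file name example.csv, the helper parse_line, and the choice of float_normalized for both columns are illustrative assumptions, not something taken from your setup.

```python
import os
import tempfile

import tensorflow as tf

# The dict from the numeric_column documentation example.
data = {
    "a": [15, 9, 17, 19, 21, 18, 25, 30],
    "b": [5.0, 6.4, 10.5, 13.6, 15.7, 19.9, 20.3, 0.0],
}

# Case 1: data in memory. Slicing a dict yields one dict of tensors per row.
dict_ds = tf.data.Dataset.from_tensor_slices(data)
first_row = next(iter(dict_ds))

# Case 2: data in a text file. Write a small CSV (hypothetical file name)
# so the example is self-contained.
csv_path = os.path.join(tempfile.mkdtemp(), "example.csv")
with open(csv_path, "w") as f:
    for a, b in zip(data["a"], data["b"]):
        f.write(f"{a},{b}\n")


def parse_line(line):
    """Turn one CSV line into a dict keyed by column name (illustrative helper)."""
    a, b = tf.io.decode_csv(line, record_defaults=[0.0, 0.0])
    return {"a": a, "b": b}


file_ds = tf.data.TextLineDataset(csv_path).map(parse_line).batch(4)

# Declare how each key should be preprocessed. Normalizing both columns is
# an assumption made for this sketch; other feature types are available.
feature_space = tf.keras.utils.FeatureSpace(
    features={
        "a": tf.keras.utils.FeatureSpace.float_normalized(),
        "b": tf.keras.utils.FeatureSpace.float_normalized(),
    },
    output_mode="concat",
)

# Learn the normalization statistics from the data, then apply them.
feature_space.adapt(file_ds)
preprocessed = file_ds.map(lambda x: feature_space(x))

first_batch = next(iter(preprocessed))
print(first_batch.shape)
```

The same pattern should carry over to labelled data: if the dataset yields (features, label) pairs, adapt on a features-only view of it and map with lambda x, y: (feature_space(x), y).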

Thank you.