Help with continuous + categorical features

Hey all!

This is my first project using TensorFlow, so I'm a total beginner. I am trying to predict the price of a car listed on Craigslist based on a few variables. Some of those variables are continuous (like mileage), and some are categorical (like the title status).

I am following the basic regression tutorial because it is doing something nearly identical.

My problem is that when the tutorial sets up the normalization layer, it seems to apply the normalization to the categorical variables as well. This doesn’t seem right, because the true/false variables probably shouldn’t be normalized.

I realize I am new enough to tf that I might not be asking the right question - but is there a way to only normalize some of the columns? Do I have to add a different layer that does no modification, but if so, how do I get the input size of the model to line up?

If anyone could point me in the right direction, it would be much appreciated!

It is good practice to normalize features that use different scales and ranges because the features are multiplied by the model weights. So, the scale of the outputs and the scale of the gradients are affected by the scale of the inputs.
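To make the point about gradient scale concrete, here is a minimal sketch in plain Python. The mileage and price numbers are made up for illustration, and the one-weight model `y_hat = w * x` is a deliberate simplification, not the tutorial's model. It compares the gradient of the mean squared error with respect to the weight for raw mileage and for standardized mileage.

```python
from statistics import mean, pstdev

# Hypothetical values, for illustration only.
mileage = [12000.0, 85000.0, 150000.0, 40000.0]
price = [18.0, 6.0, 3.5, 12.0]  # in thousands of dollars


def grad_wrt_weight(xs, ys, w=0.0):
    """Gradient of mean squared error w.r.t. w for the model y_hat = w * x."""
    return mean(2.0 * (w * x - y) * x for x, y in zip(xs, ys))


# Standardize mileage to zero mean and unit variance.
mu, sigma = mean(mileage), pstdev(mileage)
standardized = [(x - mu) / sigma for x in mileage]

raw_grad = grad_wrt_weight(mileage, price)
std_grad = grad_wrt_weight(standardized, price)

print(f"gradient with raw mileage:          {raw_grad:.2f}")
print(f"gradient with standardized mileage: {std_grad:.2f}")
```

Because the input multiplies the error term in the gradient, the raw-mileage gradient comes out several orders of magnitude larger than the standardized one, which is why a single learning rate tends to behave poorly when features live on very different scales.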

There is no advantage to normalizing the categorical (i.e., one-hot) features; the tutorial does it only for simplicity. For your use case, please refer to the "Classify structured data using Keras preprocessing layers" tutorial. Thank you.
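As a rough sketch of one way to normalize only some columns, the example below gives the model two inputs: the continuous columns go through a `Normalization` layer, the already one-hot columns pass through untouched, and a `Concatenate` layer joins them so the sizes line up. The data, column choices, input names, and layer sizes are all assumptions made up for illustration, not taken from the tutorial.

```python
import numpy as np
import tensorflow as tf

# Hypothetical toy data: two continuous columns (mileage, year) and a
# three-way one-hot encoding of title status.
numeric = np.array(
    [[12000.0, 2018.0], [85000.0, 2010.0], [150000.0, 2005.0], [40000.0, 2015.0]],
    dtype="float32",
)
one_hot = np.array(
    [[1, 0, 0], [0, 1, 0], [1, 0, 0], [0, 0, 1]], dtype="float32"
)
price = np.array([[18000.0], [6000.0], [3500.0], [12000.0]], dtype="float32")

# Learn per-column mean and variance from the continuous columns only.
normalizer = tf.keras.layers.Normalization(axis=-1)
normalizer.adapt(numeric)

# One input per feature group; only the numeric group is normalized.
numeric_in = tf.keras.Input(shape=(2,), name="numeric")
one_hot_in = tf.keras.Input(shape=(3,), name="title_status")

# Concatenating gives a single feature vector of width 2 + 3 = 5.
features = tf.keras.layers.Concatenate()([normalizer(numeric_in), one_hot_in])
hidden = tf.keras.layers.Dense(16, activation="relu")(features)
output = tf.keras.layers.Dense(1)(hidden)

model = tf.keras.Model(inputs=[numeric_in, one_hot_in], outputs=output)
model.compile(optimizer="adam", loss="mean_absolute_error")

# Train briefly and predict, passing the two arrays in input order.
model.fit([numeric, one_hot], price, epochs=2, verbose=0)
predictions = model.predict([numeric, one_hot], verbose=0)
```

With a pandas DataFrame, the two arrays would come from selecting the continuous columns and the one-hot columns separately before calling `adapt` and `fit`. No pass-through layer is needed for the categorical group; it simply feeds into the concatenation as-is.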
