One Hot Encoding

Hey community, I hope you’re doing fine. I have a DataFrame with more than 7,000 rows and 4 columns. The label column contains categorical labels, and I want to convert them to numerical labels, but I don’t know how to do it.
Is there any TensorFlow function that can do this?
Any help would be appreciated.

You’d have a couple of options here. Since you’re working in a DataFrame, there’s a Pandas function called get_dummies that will one-hot encode the column for you.
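As a rough sketch of the get_dummies route (the DataFrame, the column names "feature" and "label", and the color values here are made up for illustration, not taken from the original data):

```python
import pandas as pd

# Hypothetical stand-in for the real DataFrame; only the "label" column matters here.
df = pd.DataFrame({
    "feature": [0.1, 0.5, 0.9, 0.3],
    "label": ["red", "green", "blue", "green"],
})

# One-hot encode the label column. dtype=int asks for 0/1 integers
# instead of the boolean columns that newer pandas versions return by default.
one_hot = pd.get_dummies(df["label"], dtype=int)

# Columns come out in sorted order of the distinct label values.
print(one_hot)
```

Each distinct label becomes its own column, so the result can be used as a one-hot target alongside a loss such as categorical_crossentropy.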

There’s also a Keras function called to_categorical, though it seems to be limited to integer class values. Those integers could be indexes into a list of class name strings if you want to convert back to meaningful names.

Here are some links to the documentation pages for those functions.
https://pandas.pydata.org/docs/reference/api/pandas.get_dummies.html
https://www.tensorflow.org/api_docs/python/tf/keras/utils/to_categorical

Hey @Jeff_Corpac, thank you for your reply, but the to_categorical function does not accept categorical classes such as (red, green, blue…), and that’s my case! :frowning:

Do you want something like this?

You might want to try the Pandas function instead. You mentioned using a DataFrame in your original post so I figured you were using Pandas.

You can also set up a list of class names like ['red', 'green', 'blue'] and use each name’s index (0 = red, 1 = green, etc.) as the integer class for the to_categorical function. Since the categorical column is your label, you could also use sparse_categorical_crossentropy as your loss function and pass the index values directly as your labels, which skips the to_categorical function altogether.
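Here is a minimal sketch of that index mapping, assuming the labels are plain strings. The sample labels and the variable names are hypothetical, and NumPy’s eye is used as a stand-in to show the one-hot matrix so the snippet runs without TensorFlow:

```python
import numpy as np

# Class names; the position in this list is the integer class index.
class_names = ["red", "green", "blue"]
name_to_index = {name: i for i, name in enumerate(class_names)}

# Hypothetical label column values.
labels = ["green", "red", "blue", "green"]

# Integer labels: these could be fed directly to a model compiled with
# loss="sparse_categorical_crossentropy".
indices = np.array([name_to_index[label] for label in labels])

# One-hot version, built with NumPy for illustration. With TensorFlow installed,
# tf.keras.utils.to_categorical(indices, num_classes=3) should produce the same values.
one_hot = np.eye(len(class_names))[indices]

# Converting back to meaningful names is just a list lookup.
decoded = [class_names[i] for i in indices]
```

Keeping class_names around is what makes the round trip possible: a predicted index from the model can be turned back into its color name the same way.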

Thank you so much, you guys really saved my life!
I read the article shared by @Bhack, then applied the advice given by @Jeff_Corpac, and it worked, yaaay :grin: