How to Sort a Dataset

I have a dataset that contains features and a probability. (I created the dataset by zipping my test dataset with the probabilities predicted by my binary classification model, thereby adding a probability “column” to the test dataset.)

I want to sort this dataset in descending order by probability. Can I do so directly, without having to convert the dataset to numpy or a pandas dataframe?

If you want to do visualisation, I’d suggest doing it with numpy (using something like dataset.as_numpy_iterator()); that will be the easiest path.

Thanks, but is there really no way of working more directly on a dataset and thereby maintaining the lazy evaluation, caching, and consistency that datasets afford?

Are your sorting needs for balancing like:

My needs resemble those described in the thread you referenced but are more centered around sorting. Using a pandas dataframe, I can do:

top_scored_test_data = scored_test_data.sort_values(by = 'prediction', ascending = False)[:10]

Being able to do something similar with a dataset would be convenient and would potentially not require loading all the data into memory at the same time (or not require loading and sorting it until it is actually used).
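
For the top-10 case specifically, one possible workaround is to stream over the scored examples and keep only the best k seen so far. The sketch below is not a dataset-native sort. It assumes the dataset is unbatched and yields (features, probability) pairs when iterated (for example via dataset.as_numpy_iterator()). The names top_k_by_probability and scored_examples are hypothetical, and the generator is only a stand-in for the real dataset.

```python
import heapq
from typing import Any, Iterable, Iterator, List, Tuple

# Assumed element layout: (features, probability). This is an assumption
# about how the zipped dataset is structured, not something the API enforces.
ScoredExample = Tuple[Any, float]


def top_k_by_probability(
    examples: Iterable[ScoredExample], k: int = 10
) -> List[ScoredExample]:
    """Return the k examples with the highest probability, highest first."""
    # heapq.nlargest makes a single pass over the iterable and keeps at most
    # k candidates in memory, so the full dataset is never materialized.
    return heapq.nlargest(k, examples, key=lambda example: float(example[1]))


def scored_examples() -> Iterator[ScoredExample]:
    """Hypothetical stand-in for iterating over the scored test dataset."""
    yield {"x": 1.0}, 0.20
    yield {"x": 2.0}, 0.95
    yield {"x": 3.0}, 0.55
    yield {"x": 4.0}, 0.80
    yield {"x": 5.0}, 0.10


top_3 = top_k_by_probability(scored_examples(), k=3)
for features, probability in top_3:
    print(features, probability)
```

This still reads every element once, so it does not preserve lazy evaluation, but memory use stays proportional to k rather than to the size of the dataset.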