Help understanding EfficientDet-lite output classes

I think I might misunderstand EfficientDet-lite output. I’ve been working with the salad detector example and have been able to train it to recognize my object. However, it also detects a bunch of other things and classifies them as my object. For example, if I show it a picture of my object and a person, it will outline the person, their hand, and my object, and all three are classified as my object. I would have expected the person to be classified as a person.

Now I’m starting to think I was wrong, and that since I trained it with only my object, that is the only object it understands. I’m not sure if I’m right about this, though. So I went back and ran the salad detector example from the beginning with no changes, then ran that model against my person + object photo. This time it recognized only the person’s shirt, and the only classes shown were parts of a salad.

So I guess maybe I misunderstand how EfficientDet-lite is supposed to work? Maybe I need to train it with a lot more data about my object so I don’t get so many false positives?
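
To make the question concrete, here is a minimal sketch of what I suspect is happening. It is plain Python with made-up detections, not the actual Model Maker or TFLite API; `Detection`, `label_detections`, the scores, and the 0.5 threshold are all hypothetical. If the label map holds only my one class, every box the model reports can only be given that label, and a score threshold is the only thing that trims the weak ones.

```python
from typing import List, NamedTuple, Tuple


class Detection(NamedTuple):
    """Hypothetical stand-in for one detector output: a class index and a score."""
    class_id: int
    score: float


def label_detections(
    detections: List[Detection],
    label_map: List[str],
    score_threshold: float = 0.5,  # assumed value, not a library default
) -> List[Tuple[str, float]]:
    """Drop detections below the threshold and map class indices to label names."""
    return [
        (label_map[d.class_id], d.score)
        for d in detections
        if d.score >= score_threshold
    ]


# A model trained on one class has a one-entry label map, so "person" is never an option.
label_map = ["my_object"]

# Made-up scores for three boxes (say, my object, a person, and a hand).
raw = [Detection(0, 0.91), Detection(0, 0.42), Detection(0, 0.35)]

unfiltered = label_detections(raw, label_map, score_threshold=0.0)
filtered = label_detections(raw, label_map)
```

With no threshold, all three boxes come back labeled `my_object`; with the assumed 0.5 threshold, only the highest-scoring one survives. I don’t know if real scores for the false positives would actually be this low, which is part of what I’m asking.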

Thanks!