I’m new here, but I’m working with a group trying to detect specific animal sounds in the wild using ML on ESP32 devices.
Initially I thought that repurposing the micro_speech TFLite Micro example might be a good approach. It has a version optimized for the ESP32, and it uses a trained model to recognize specific sounds. However, the more I’ve researched, the more it appears that the micro_speech example is (naturally enough) very specific to keyword detection, and that the FFT code in the example contains optimizations for human speech.
I’d appreciate it if anyone could suggest a better starting point or approach for this particular application.