Deep learning for a signal processing problem

Hi there,

I am working on a loudspeaker that generates audible sound waves in a special medium, a fluid gel. Since the equations for the signal processing part are not completely understood yet, I want to use a deep learning script to help. The task itself is not complicated; the only problem is that I am very new to TensorFlow and don't really have an idea of how to implement it, although I have been working on it for a week.

So my problem is this: the loudspeaker has to generate modulated ultrasonic signals; the medium is nonlinear and therefore demodulates them, so the signal becomes audible again inside the medium. I can generate many samples of random signals and record the results, or hook the PC up to the loudspeaker and microphone so that TensorFlow can "learn" the mapping itself.

I just have no clue how to do it. Are there any sample projects you know of? I am very thankful for any help.


I don’t know if you could find this useful for your project:


Hi @Georgschmied

I'm not an expert, but maybe this tutorial can give you some ideas to play with: Simple audio recognition: Recognizing keywords | TensorFlow Core

The main idea is: you take the sound waves from the medium, convert them to a spectrogram, use a CNN to extract features, and then do the classification.
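A minimal sketch of that pipeline (all shapes and hyperparameters here are placeholders, assuming fixed-length 1-second mono clips at 16 kHz):

```python
import tensorflow as tf

# Waveform -> magnitude spectrogram -> small CNN classifier.
SAMPLE_RATE = 16000   # placeholder: 1-second clips at 16 kHz
NUM_CLASSES = 8       # placeholder

def to_spectrogram(waveform):
    # Short-time Fourier transform, then magnitude; add a channel axis for the CNN.
    stft = tf.signal.stft(waveform, frame_length=255, frame_step=128)
    return tf.abs(stft)[..., tf.newaxis]

inputs = tf.keras.Input(shape=(SAMPLE_RATE,))         # raw waveform
x = tf.keras.layers.Lambda(to_spectrogram)(inputs)    # (frames, freq_bins, 1)
x = tf.keras.layers.Conv2D(32, 3, activation="relu")(x)
x = tf.keras.layers.MaxPooling2D()(x)
x = tf.keras.layers.Conv2D(64, 3, activation="relu")(x)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(NUM_CLASSES)(x)        # class logits

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam",
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=["accuracy"])
# model.fit(train_waveforms, train_labels, ...) with waveforms of shape (N, SAMPLE_RATE)
```

The tutorial linked above builds essentially this, step by step, on the Speech Commands dataset.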


For learning about waveform/signal processing, you could start with the Sound of AI channel, which teaches signal processing with Python and deep learning for audio with TensorFlow and Librosa. Some hands-on examples (in separate playlists):

  1. 16- How to Implement a CNN for Music Genre Classification - YouTube

  2. How to Implement Autoencoders in Python and Keras || The Encoder - YouTube

  3. How to Extract Spectrograms from Audio with Python - YouTube

GitHub repos by the same person:

  1. GitHub - musikalkemist/DeepLearningForAudioWithPython: Code and slides for the "Deep Learning (For Audio) With Python" course on TheSoundOfAI Youtube channel.
  2. GitHub - musikalkemist/generating-sound-with-neural-networks: Code and slides for the "Generating Sound with Neural Network" series on The Sound of AI Youtube channel.
  3. GitHub - musikalkemist/AudioSignalProcessingForML: Code and slides of my YouTube series called "Audio Signal Proessing for Machine Learning"

To learn to generate sounds using the input from your loudspeaker, you might start with the second playlist mentioned above (generating sounds with neural networks).
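If you do end up recording paired data with the loudspeaker and microphone (drive signal in, recorded signal out), one bare-bones supervised framing is a 1D convolutional regression from one fixed-length waveform segment to the other. This is only a placeholder sketch, not a tested recipe; the segment length and layer sizes are assumptions:

```python
import tensorflow as tf

# Sketch: learn a waveform-to-waveform mapping, e.g. from the ultrasonic drive
# signal to the demodulated signal recorded in the medium (or the inverse,
# if the goal is to predistort the drive signal). All sizes are placeholders.
SEGMENT_LEN = 4096

inputs = tf.keras.Input(shape=(SEGMENT_LEN, 1))
x = tf.keras.layers.Conv1D(32, 9, padding="same", activation="relu")(inputs)
x = tf.keras.layers.Conv1D(32, 9, padding="same", activation="relu")(x)
outputs = tf.keras.layers.Conv1D(1, 9, padding="same")(x)   # predicted waveform

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="mse")
# model.fit(input_segments, target_segments, ...)
# where both arrays are time-aligned with shape (num_segments, SEGMENT_LEN, 1).
```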

If you're interested in music source separation, for example, there's an open-sourced model called Spleeter that you can try on your machine: GitHub - deezer/spleeter: Deezer source separation library including pretrained models (it is written in TensorFlow). (D3Net is one of the current state-of-the-art models: https://arxiv.org/abs/2010.01733.)

More material:


Thanks for your valuable replies, I appreciate them very much.


I looked at this option. It seems like this solution is too complex for what I need to do with my project. I am sure that it has all the functionality I need, but I am just trying to process XY data for the number of peaks and where they are. I could spend a week understanding all the details of the Magenta project, but the time would be wasted if, at the end, I cannot get what I need.

If you just need peak detection on a simple audio signal, there are probably easier off-the-shelf solutions, like:
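For example, scipy.signal.find_peaks already gives you peak positions and heights with no training at all (a minimal sketch on a toy signal; the height/prominence thresholds are placeholders to tune against your noise floor):

```python
import numpy as np
from scipy.signal import find_peaks

# Toy XY data: two peaks on a noisy baseline.
x = np.linspace(0, 10, 1000)
y = (np.exp(-(x - 3) ** 2)
     + 0.5 * np.exp(-(x - 7) ** 2)
     + 0.02 * np.random.randn(x.size))

# Peak indices plus their properties; thresholds are placeholders.
peaks, props = find_peaks(y, height=0.1, prominence=0.1)
print("number of peaks:", len(peaks))
print("peak positions:", x[peaks])
print("peak heights:", props["peak_heights"])
```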

No, we need something better than that. We have developed our own modules that detect the peak and noise-floor features based on Principal Component Analysis; that is actually the code being used to label the dataset. We need a DNN that can:

  1. Identify Harmonics, in the signal wave.
  2. Identify Non-Harmonics in the signal wave.
  3. Find Bessel functions in the signal wave.
  4. Identify carrier frequencies in the signal wave.
  5. Compare two signals to find differences.

The biggest issue is that we can identify these parts using simple signal processing in the laboratory, but this has to go out into production and ultimately customer environments, where there will be no hand-holding. A neural network is the best solution because of its reliability. So we need a DNN that is trained to find these features.

I wanted to base this on audio processing (since so much work has already been done there). But it seems that ML for this kind of work only looks at a few kinds of input:

  1. XY data (none of the solutions I have seen look at signal data this way).
  2. Images of plots. (This appears more promising, but it is hard to see how we could train a network to "look" at plots successfully.)
  3. Spectrograms. All the audio processing examples I have seen look at audio files this way. I looked over Magenta and a few examples online. I don't think we can turn our data into spectrograms; we have power vs. frequency plots (the data has already undergone an FFT).

Is there a DNN effort that has looked at ways to process power vs. frequency plots of radio emissions?
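For concreteness, a rough sketch of the kind of model that would fit this data: a 1D CNN that consumes the power-vs-frequency vector directly (no spectrogram step) and makes one multi-label prediction per feature. The bin count, layer sizes, and label names below are all placeholders; comparing two signals (item 5 above) would need a different arrangement (e.g., a twin encoder) and is left out:

```python
import tensorflow as tf

# Multi-label classifier on a power-vs-frequency vector (already FFT'd):
# one sigmoid output per feature of interest. Everything here is a placeholder.
NUM_BINS = 2048
LABELS = ["harmonics", "non_harmonics", "bessel_sidebands", "carrier"]

inputs = tf.keras.Input(shape=(NUM_BINS, 1))            # power spectrum as a 1D sequence
x = tf.keras.layers.Conv1D(32, 7, activation="relu")(inputs)
x = tf.keras.layers.MaxPooling1D(4)(x)
x = tf.keras.layers.Conv1D(64, 7, activation="relu")(x)
x = tf.keras.layers.GlobalMaxPooling1D()(x)
outputs = tf.keras.layers.Dense(len(LABELS), activation="sigmoid")(x)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=[tf.keras.metrics.BinaryAccuracy()])
# model.fit(spectra, label_matrix, ...)
# spectra: (num_examples, NUM_BINS, 1); label_matrix: (num_examples, len(LABELS)) of 0/1,
# which the existing PCA-based labeling code could generate.
```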

Poking around the web I found this company in VA: https://www.deepsig.ai/. Looks like we have competition.

There was a nice survey for radio signals at:

https://arxiv.org/abs/2008.08264


@Georgschmied In case you haven’t seen this: Spotify open-sourced a TensorFlow 2 library called Realbook: GitHub - spotify/realbook: Easier audio-based machine learning with TensorFlow.

Realbook is a Python library for easier training of audio deep learning models with TensorFlow, made by Spotify's Audio Intelligence Lab. Realbook provides callbacks (e.g., spectrogram visualization) and well-tested Keras layers (e.g., STFT, ISTFT, magnitude spectrogram) that we often use when training. These functions have helped standardize consistency across all of our models, and we hope Realbook will do the same for the open-source community.

Realbook contains a number of layers that convert audio data (i.e., waveforms) into various spectral representations (e.g., spectrograms). For convenience, the amount of memory required for the most commonly used layers is listed in the repo's README.
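(Realbook's own class names and exact API may differ from this; purely as a plain-TensorFlow sketch in the same spirit, a magnitude-spectrogram layer can be written with tf.signal.stft like so, with the frame sizes as placeholders:)

```python
import tensorflow as tf

# Plain-TensorFlow sketch of a magnitude-spectrogram Keras layer, in the spirit
# of the layers Realbook provides. This is NOT Realbook's API; frame_length and
# frame_step are placeholders.
class MagnitudeSpectrogram(tf.keras.layers.Layer):
    def __init__(self, frame_length=1024, frame_step=256, **kwargs):
        super().__init__(**kwargs)
        self.frame_length = frame_length
        self.frame_step = frame_step

    def call(self, waveform):
        # waveform: (batch, samples) float tensor
        stft = tf.signal.stft(waveform,
                              frame_length=self.frame_length,
                              frame_step=self.frame_step)
        return tf.abs(stft)  # (batch, frames, freq_bins)

# Usage:
# spec = MagnitudeSpectrogram()(tf.random.normal([2, 16000]))
```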