Is it possible to distinguish individual people's voices using machine learning?

Is there any existing research on distinguishing one person's voice from another's? For example:

Input data -
person A and person B, each saying the same group of words

Output data -
a voice pattern for person A and a voice pattern for person B

The model would then use these patterns to determine whether a test word was spoken by person A or by person B.
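
To make the idea concrete, here is a rough sketch of the enroll-then-compare workflow described above. It is only an illustration under heavy assumptions: the "recordings" are synthetic tones standing in for real audio, the features are a crude spectrum at a few hand-picked frequencies, and `synth_voice`, `features`, `enroll`, and `identify` are hypothetical helper names I made up, not part of any library.

```python
import math
import random

RATE = 8000  # assumed sample rate in Hz (illustrative only)

def synth_voice(pitch_hz, rng, n=400, noise=0.05):
    """Toy stand-in for a recording: a tone at the speaker's pitch plus noise."""
    return [math.sin(2 * math.pi * pitch_hz * t / RATE) + rng.gauss(0, noise)
            for t in range(n)]

def features(signal, freqs=(100, 150, 200, 250)):
    """Signal magnitude at a few probe frequencies (a very crude spectrum)."""
    feats = []
    for f in freqs:
        re = sum(s * math.cos(2 * math.pi * f * t / RATE) for t, s in enumerate(signal))
        im = sum(s * math.sin(2 * math.pi * f * t / RATE) for t, s in enumerate(signal))
        feats.append(math.hypot(re, im))
    return feats

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def enroll(recordings):
    """Average one person's feature vectors into a single 'voice pattern'."""
    vectors = [features(r) for r in recordings]
    return [sum(col) / len(col) for col in zip(*vectors)]

def identify(signal, patterns):
    """Return the name whose stored pattern is most similar to the signal."""
    feats = features(signal)
    return max(patterns, key=lambda name: cosine(feats, patterns[name]))

rng = random.Random(42)  # fixed seed so the toy data is reproducible

# Input data: several recordings per person (here: tones near 100 Hz and 200 Hz).
patterns = {
    "A": enroll([synth_voice(p, rng) for p in (98, 100, 102)]),
    "B": enroll([synth_voice(p, rng) for p in (198, 200, 202)]),
}

# Test data: new recordings that were not used for enrollment.
test_a = synth_voice(105, rng)
test_b = synth_voice(195, rng)

print(identify(test_a, patterns))
print(identify(test_b, patterns))
```

With real recordings, the synthetic tones would be replaced by actual audio and the hand-picked frequencies by learned or standard audio features, so please treat this as a picture of the workflow rather than a working solution.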

So far I have only built two simple image classification models with neural networks, so I wonder whether a similar approach can be applied to this problem.