Checking my basic understanding of computer vision

Hi there,

I just started learning ML and have been watching some videos on TensorFlow. I wanted to write down what I have learned to make sure I understand the concepts correctly. I would appreciate feedback on my thought process.

In the YouTube video "Basic Computer Vision with ML (ML Zero to Hero - Part 2)" with Laurence Moroney, we explored classifying images.

The general idea, as I understood it, is as follows:

  1. We take a dataset and associated labels
  2. We feed a point of data (in this case, an image) into a neural network
  3. In this case, we use 128 functions to process the input data and try to arrive at the
    data point's associated label. The example used was an image of an ankle boot with
    the number 9 as its label.
  4. The functions start off with random parameters and output some number given the
    input data. That output is compared with the associated label (the job of the loss
    function), and the result is passed to the optimizer.
  5. The optimizer decides how to change the parameters of the functions before starting another epoch
    and trying again.
  6. The process repeats until all epochs have elapsed, with the model ideally getting more accurate each time.
  7. The model's output has 10 values, one for each of the 10 clothing classes (labels 0-9).
    Each value is a probability between 0 and 1, and the 10 values add up to 1.
  8. Once the model is trained, we can feed in new images with model.predict(my_images), and it
    gives us the probability that each image belongs to each class (although I’m still not
    sure how this method works).
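
Steps 4 to 8 can be sketched in plain Python. This is a minimal illustration under stated assumptions, not what Keras does internally. It uses one linear function per class instead of the 128-unit layer from the video, plain gradient descent instead of the optimizer used there, and a made-up toy dataset of four "pixels" per sample. The names softmax, predict and train are hypothetical helpers written for this sketch.

```python
import math
import random

def softmax(scores):
    """Convert raw scores into probabilities in [0, 1] that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict(weights, biases, x):
    """Forward pass: one raw score per class, then softmax."""
    scores = [sum(w * xi for w, xi in zip(row, x)) + b
              for row, b in zip(weights, biases)]
    return softmax(scores)

def train(data, labels, num_classes, epochs=200, lr=0.5, seed=0):
    """Fit one linear function per class with plain gradient descent."""
    rng = random.Random(seed)
    n_features = len(data[0])
    # Step 4: the parameters start at small random values.
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_features)]
               for _ in range(num_classes)]
    biases = [0.0] * num_classes
    history = []  # average loss per epoch
    for _ in range(epochs):  # step 6: repeat for every epoch
        total_loss = 0.0
        for x, y in zip(data, labels):
            probs = predict(weights, biases, x)
            # Loss: how far the output is from the label (cross-entropy).
            total_loss += -math.log(probs[y])
            # Step 5: nudge each parameter against its gradient.
            for c in range(num_classes):
                grad = probs[c] - (1.0 if c == y else 0.0)
                for j in range(n_features):
                    weights[c][j] -= lr * grad * x[j]
                biases[c] -= lr * grad
        history.append(total_loss / len(data))
    return weights, biases, history

# Toy stand-in for images (an assumption): 4 "pixels" per sample, 3 classes.
data = [[1.0, 0.0, 0.0, 0.0], [0.9, 0.1, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0], [0.0, 0.9, 0.1, 0.0],
        [0.0, 0.0, 1.0, 1.0], [0.0, 0.0, 0.9, 1.0]]
labels = [0, 0, 1, 1, 2, 2]

weights, biases, history = train(data, labels, num_classes=3)

# Steps 7 and 8: the output is one probability per class, and the
# predicted label is the index of the largest probability.
probs = predict(weights, biases, [0.0, 0.0, 1.0, 0.9])
predicted_label = probs.index(max(probs))
print(probs, predicted_label)
```

In this sketch, predict only runs the forward pass with the learned parameters, and nothing is updated. As far as I know, model.predict(my_images) in Keras plays the same role: it returns one row of class probabilities per image when the last layer uses softmax, and taking the index of the largest value in a row gives the predicted label.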

Thanks for reading. Any help is appreciated.