Ground truth in this case means images that carry labels of some sort on parts of the image or on objects in it, not that the images themselves are real. So a photo of a cat with a label or mask on the cat would be "ground truth" for which part of the image the cat is in, or simply for the fact that there is a cat.
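To make that concrete, here's a rough sketch of what a "ground truth" annotation can look like in code. Everything in it is made up for illustration (the tiny 4x4 image, the names `cat_mask`, `has_cat`, and `bbox`); real datasets each have their own formats, so treat this as one possible shape rather than any standard.

```python
import numpy as np

# Hypothetical 4x4 RGB "photo"; the pixel values are placeholders.
image = np.zeros((4, 4, 3), dtype=np.uint8)

# Ground-truth mask: True wherever an annotator marked a pixel as "cat".
# The region below is invented for the example.
cat_mask = np.zeros((4, 4), dtype=bool)
cat_mask[1:3, 1:4] = True

# Image-level ground truth: is there a cat at all?
has_cat = bool(cat_mask.any())

# Region-level ground truth: bounding box around the labeled pixels,
# as (row_min, col_min, row_max, col_max), inclusive.
rows, cols = np.nonzero(cat_mask)
bbox = (int(rows.min()), int(cols.min()), int(rows.max()), int(cols.max()))

print(has_cat, bbox)
```

Note that nothing here depends on whether `image` came from a camera or a renderer; the "truth" is in the labels, not the pixels.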
Haha, just pointing out how the term is used in the industry. If you really want to play with semantics, then I would argue that even a photo of a real cat isn’t a real cat and thus not ground truth. But while that might have the awesome consequence of clearing the shelters, it’s not practical, so we just call any labeled pixels that represent a cat with good enough fidelity for our purposes "ground truth". ;)