I say this as someone who has been in deep learning for over a decade now: this is pretty wrong, both on the merits (data obviously lives on a manifold) and in its application to deep learning (cf. Chris Olah's blog as an example from 2014, which is linked in my post -- https://colah.github.io/posts/2014-03-NN-Manifolds-Topology/). Embedding spaces are called 'spaces' for a reason. GANs, VAEs, contrastive losses -- all of these are about constructing latent manifolds that you can 'walk' to produce different kinds of data.
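To make 'walking' concrete, here is a minimal sketch (numpy only; the names here are hypothetical, and `decode()` stands in for any trained GAN generator or VAE decoder) of the usual spherical interpolation between two latent codes. Each intermediate point decodes to a plausible sample, which is the sense in which the latent space behaves like something you can traverse:

```python
import numpy as np

def slerp(z0, z1, t):
    """Spherical interpolation between two latent vectors.

    Straight lines can leave the high-density shell of a Gaussian
    prior, so GAN/VAE latent walks are often done along the sphere.
    """
    z0_n = z0 / np.linalg.norm(z0)
    z1_n = z1 / np.linalg.norm(z1)
    omega = np.arccos(np.clip(np.dot(z0_n, z1_n), -1.0, 1.0))
    if np.isclose(omega, 0.0):
        return (1.0 - t) * z0 + t * z1  # vectors nearly parallel: fall back to lerp
    return (np.sin((1.0 - t) * omega) * z0 + np.sin(t * omega) * z1) / np.sin(omega)

rng = np.random.default_rng(0)
latent_dim = 128                          # hypothetical latent size
z_start = rng.standard_normal(latent_dim)
z_end = rng.standard_normal(latent_dim)

# Walk the latent space in 8 steps between the two codes.
path = [slerp(z_start, z_end, t) for t in np.linspace(0.0, 1.0, 8)]
# images = [decode(z) for z in path]      # decode() = your trained generator/decoder
```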
If data actually lived on a manifold contained in the ambient space -- e.g. images in R^{n^2} -- then it wouldn't have thickness or branching, but real data has both. The manifold is an imperfect approximation to help think about it. Using mathematical language is not the same as applying mathematics (and the use of the word 'space' there is not about topology).
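One way to make this objection precise (a rough sketch, assuming 'manifold' means an embedded k-dimensional submanifold of pixel space, which is not spelled out in the comment above):

```latex
% Sketch: why a strict manifold has neither branching nor thickness.
% Assume $M \subset \mathbb{R}^{n^2}$ is an embedded $k$-dimensional manifold, $k < n^2$.
\begin{itemize}
  \item Locally Euclidean: every $x \in M$ has a neighborhood homeomorphic to
        $\mathbb{R}^k$; a branch point (several sheets meeting) admits no such
        neighborhood, so branching is excluded.
  \item Measure zero: $k < n^2$ implies $\lambda_{n^2}(M) = 0$, so $M$ occupies no
        volume ("thickness") in pixel space, whereas noisy real data does.
\end{itemize}
```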
You're citing a guy who never went to college (has no math or physics degree), has never published a paper, etc. I guess that actually tracks pretty well with how strong the whole "it's deep theory" claim is.