I call shenanigans. Find me a place where that algebra, which lacks standard rules of common algebra, makes sense.
Consider king + woman = queen + man. It looks neat, but it is not a universal truth. The answer could just as well be concubine, for example.
So, is queen + man also concubine?
Again, I'm glad this works for some things. But it really just shows which words are often used together. It does not give any good rationale for their meanings. Compare math, where 1 + 1 equals 2, possibly in different encodings, and not merely by the convention of the symbols often being used together.
The answers you are looking for are Nobel-worthy. I have not formally studied linguistics, but you might wanna start there. Regarding the king equation, I think "king" is actually a vector in this context, as with the other objects. If so, then it is using the vector space formalism.
That said, I definitely agree with you. An English speaker may find any of these reasonable:
1. King - man = expensive clothing
2. King - man = prince
3. King - man = queen
What is "king"? What is "-"? What is "man"?
If a king is a dressed up wealthy man, and you remove the man, you have wealthy clothes? Or, does removing the man mean degrading the king back into a boy? Or, does removing a man mean adding a woman? Wait -- what if a king is more than a dressed up wealthy man? Should we include his home? Do we need to subtract the home? How do you subtract a home? Is the king minus a man a prince if the king was a beggar when he was young? ... death of the universe ...
Like you said, there's a combinatorial explosion here. Maybe this example is akin to trying to model each and every trajectory of all 10^23 particles in a gas. It looks like these scientists are stepping back and looking at the big picture instead, trying to find something more akin to PV = NkT.
Precisely. Models which use this space do not propose strong equality (==). Rather, they would output a series of probabilities, and choose the most likely. Stating king + woman = queen + man is somewhat disingenuous; what should be said (mathematically) is something like the following: the word lying closest to the vector vec('king') + vec('woman') - vec('man') is 'queen'.
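To make that last statement concrete, here is a minimal sketch of the nearest-word lookup. The three-dimensional vectors and their axis meanings are made up for illustration (real embeddings are learned and have far more dimensions), and `EMBEDDINGS`, `cosine`, and `rank_candidates` are hypothetical names, not any library's API. Skipping the query words themselves is a common convention in analogy lookups and is assumed here.

```python
import math

# Toy embeddings, invented for illustration only.
# Hypothetical axes: (royalty, maleness, femaleness).
EMBEDDINGS = {
    "king":      [0.9, 0.8, 0.1],
    "queen":     [0.9, 0.1, 0.8],
    "man":       [0.1, 0.9, 0.1],
    "woman":     [0.1, 0.1, 0.9],
    "prince":    [0.6, 0.8, 0.1],
    "concubine": [0.4, 0.1, 0.9],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_candidates(target, exclude=()):
    """Return (similarity, word) pairs, most similar first."""
    scored = [
        (cosine(target, vec), word)
        for word, vec in EMBEDDINGS.items()
        if word not in exclude
    ]
    return sorted(scored, reverse=True)


# vec('king') + vec('woman') - vec('man'), computed component-wise.
target = [
    k + w - m
    for k, w, m in zip(EMBEDDINGS["king"], EMBEDDINGS["woman"], EMBEDDINGS["man"])
]

ranked = rank_candidates(target, exclude=("king", "woman", "man"))
for score, word in ranked:
    print(f"{word:10s} {score:.3f}")
```

With these toy numbers 'queen' ranks first and 'concubine' second. The output is a ranking by similarity, not an equality, so a different corpus or different vectors could easily reorder it.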
To suggest that a NN can't learn something about the meaning of words from a large corpus of text is unsubstantiated, I believe. The statement above suggests they do, I would say. I would not be too surprised if a sufficiently complex NN could 'learn' the concept of gender with a corpus of English text to a decently high degree of accuracy, simply based on vestigial features left from French and Old English.
It makes more sense to sort of factorise each thing into its different components. King is a masculine monarch. Man is a masculine person. Queen and woman are the feminine equivalents.
So king + woman = queen + man is better described as:
Masculine monarch + feminine person = feminine monarch + masculine person
A bit of reordering of adjectives and it is exactly the same. Even monarch is the wrong word, because you seem to be getting hung up on nouns, when these are all actually a bunch of chained adjectives. Perhaps "regality + nobility + rulery". English is a bad language to describe this, because we tend to noun and verb our adjectives regularly.
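The factorisation above can be sketched as adding bags of features. The feature labels and the `FEATURES` table below are assumptions chosen to match the comment, not something a trained model would expose by name, and `combine` is a hypothetical helper.

```python
from collections import Counter

# Hand-written feature bags matching the factorisation in the comment.
# These labels are illustrative assumptions, not learned features.
FEATURES = {
    "king":  Counter(["masculine", "monarch"]),
    "queen": Counter(["feminine", "monarch"]),
    "man":   Counter(["masculine", "person"]),
    "woman": Counter(["feminine", "person"]),
}


def combine(*words):
    """Add up the feature counts of the given words."""
    total = Counter()
    for word in words:
        total += FEATURES[word]
    return total


lhs = combine("king", "woman")   # masculine + monarch + feminine + person
rhs = combine("queen", "man")    # feminine + monarch + masculine + person

print(lhs == rhs)  # True: same features, just reordered
```

Under this reading the "equation" holds exactly, but only because the features were chosen by hand so that it would.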