In theory, I like your idea, but in practice, I think corporations large enough to run experiments like that are not able to objectively evaluate anything they do, at least not officially. It's too political.
I've been in dozens of interviews as an interviewer and worked with at least a dozen people after seeing how they performed in interviews, so that's what I go on.
The big software firm I worked at (not FAANG, but we competed with them for talent) absolutely tweaked its interview processes based on long-run outcomes.
FAANGs evolve their process over time too.
One clear example: at FAANGs, even if the manager likes you, it doesn't matter. You have to be evaluated by a committee of people who are indifferent to that manager, aren't under the gun to fill a slot, and are less likely to be swayed by a one-time positive conversation.
Do you think this process came out of nowhere? That's just one example of them settling on something that worked, I'm sure after trying other, less structured approaches and tracking the results.