As models improve, human preference will become a worse proxy measurement (e.g. as model capabilities surpass the human's ability to judge correctness at a glance). This can be due to more raw capability, or to more persuasion / charisma.
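
To make the concern concrete, here is a minimal toy simulation (not from the source): a judge's preference is modeled as a noisy blend of an answer's true correctness and its persuasiveness, and as the persuasion share of the judgment grows, preference labels agree less and less with actual correctness. All names and numbers (`persuasion_weight`, noise level, sample counts) are illustrative assumptions.

```python
# Toy sketch, not a real evaluation: shows how preference labels drift away
# from true correctness as persuasiveness dominates the human judgment.
import random

random.seed(0)

def judge_prefers_a(corr_a, corr_b, pers_a, pers_b, persuasion_weight, noise=0.1):
    """Simulated human judgment: a noisy blend of correctness and persuasiveness."""
    score_a = (1 - persuasion_weight) * corr_a + persuasion_weight * pers_a
    score_b = (1 - persuasion_weight) * corr_b + persuasion_weight * pers_b
    score_a += random.gauss(0, noise)
    score_b += random.gauss(0, noise)
    return score_a > score_b

def agreement_with_correctness(persuasion_weight, n_pairs=20000):
    """Fraction of pairs where the preferred answer is also the more correct one."""
    agree = 0
    for _ in range(n_pairs):
        corr_a, corr_b = random.random(), random.random()
        pers_a, pers_b = random.random(), random.random()
        prefers_a = judge_prefers_a(corr_a, corr_b, pers_a, pers_b, persuasion_weight)
        if prefers_a == (corr_a > corr_b):
            agree += 1
    return agree / n_pairs

for w in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"persuasion weight {w:.2f}: agreement with correctness = "
          f"{agreement_with_correctness(w):.3f}")
```

With the persuasion weight at 0 the preference label tracks correctness almost perfectly; as the weight approaches 1 the agreement falls toward chance, which is the "worse proxy" effect described above under these toy assumptions.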
> Three features: Prompt Shields, which blocks prompt injections or malicious prompts from external documents that instruct models to go against their training; Groundedness Detection, which finds and blocks hallucinations; and safety evaluations, which assess model vulnerabilities, are now available in preview on Azure AI.