As models improve, human preference will become a worse proxy measurement (e.g. as model capabilities surpass the human's ability to judge correctness at a glance). This can be due to more raw capability, or to more persuasion / charisma.
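
To make the concern concrete, here is a minimal toy simulation (not from the source): a judge's preference is modeled as a noisy blend of an answer's true correctness and its persuasiveness, and as the persuasion share of the judgment grows, preference labels agree less and less with actual correctness. All names and numbers (`persuasion_weight`, noise level, sample counts) are illustrative assumptions.

```python
# Toy sketch, not a real evaluation: shows how preference labels drift away
# from true correctness as persuasiveness dominates the human judgment.
import random

random.seed(0)

def judge_prefers_a(corr_a, corr_b, pers_a, pers_b, persuasion_weight, noise=0.1):
    """Simulated human judgment: a noisy blend of correctness and persuasiveness."""
    score_a = (1 - persuasion_weight) * corr_a + persuasion_weight * pers_a
    score_b = (1 - persuasion_weight) * corr_b + persuasion_weight * pers_b
    score_a += random.gauss(0, noise)
    score_b += random.gauss(0, noise)
    return score_a > score_b

def agreement_with_correctness(persuasion_weight, n_pairs=20000):
    """Fraction of pairs where the preferred answer is also the more correct one."""
    agree = 0
    for _ in range(n_pairs):
        corr_a, corr_b = random.random(), random.random()
        pers_a, pers_b = random.random(), random.random()
        prefers_a = judge_prefers_a(corr_a, corr_b, pers_a, pers_b, persuasion_weight)
        if prefers_a == (corr_a > corr_b):
            agree += 1
    return agree / n_pairs

for w in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"persuasion weight {w:.2f}: agreement with correctness = "
          f"{agreement_with_correctness(w):.3f}")
```

With the persuasion weight at 0 the preference label tracks correctness almost perfectly; as the weight approaches 1 the agreement falls toward chance, which is the "worse proxy" effect described above under these toy assumptions.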
> Three features: Prompt Shields, which blocks prompt injections or malicious prompts from external documents that instruct models to go against their training; Groundedness Detection, which finds and blocks hallucinations; and safety evaluations, which assess model vulnerabilities, are now available in preview on Azure AI.