I don't understand why they don't let another model "test the waters" first, to see whether the main model's output could raise a potential legal issue. It should be easy to train a model specifically for this kind of categorization, and it wouldn't even require a large network, so it could be very fast and efficient.
If the "legal advisor" detects a potential legal problem, ChatGPT will issue a legal disclaimer and a warning, so that it doesn't have to abruptly terminate the conversation. Of course, it can do a lot of other things, such as lowering the temperature, raising the BS detection threshold, etc., to adjust the flow of the conversation.
It can work, and it would be better than a hard-coded filter, wouldn't it?
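Something like this, as a rough sketch (every function, name, and threshold below is made up for illustration):

```python
# Hypothetical sketch of the pipeline described above; every function,
# threshold, and constant here is invented for illustration.

DISCLAIMER = ("Note: the following may touch on legal matters "
              "and is not legal advice.")

def generate(prompt: str, temperature: float = 0.8) -> str:
    """Stand-in for the main LLM call."""
    return f"(main model output for {prompt!r} at T={temperature})"

def legal_risk(text: str) -> float:
    """Stand-in for the small, fast 'legal advisor' classifier."""
    return 0.0  # a real version would be a trained categorization model

def respond(prompt: str) -> str:
    draft = generate(prompt)
    if legal_risk(draft) > 0.5:  # invented threshold
        # Don't terminate the conversation: warn, then regenerate
        # more conservatively (e.g. at a lower temperature).
        draft = generate(prompt, temperature=0.2)
        return DISCLAIMER + "\n\n" + draft
    return draft

print(respond("Tell me about David Mayer"))
```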
They already do this; it's the moderation model. [1]
This name thing is an additional layer on top of that, maybe because retraining the model from scratch for each name (or stuffing the system message with an ever-growing list of names that it could then leak) is not very practical.
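For reference, querying the moderation endpoint looks roughly like this with the openai-python v1 client (the input string is made up). Its categories cover things like hate and violence, not name-specific legal risk, which fits the idea that the name filter is a separate layer:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

resp = client.moderations.create(input="Draft model output to screen")
result = resp.results[0]

print(result.flagged)                       # True if any category tripped
print(result.categories.model_dump())       # per-category booleans (hate, violence, ...)
print(result.category_scores.model_dump())  # per-category confidence scores
```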
But how would that work reliably? If I make the statement that "David Mayer" is a criminal, an international terrorist, or a Nickelback fan, that's definitely libelous. But if I say those things about Osama bin Laden, they're simply facts. [1]
The legal AI would be impossible to calibrate. Either it has to categorize everything that could possibly be construed as libel as illegal, and therefore ban basically all output about not just contemporary criminal actors but historical ones too [2], or it has to let a lot of things slip through the cracks: whenever the output to validate claims that someone's sexual misconduct was proven in court, it would have to allow that, even if the court case is just the LLM's hallucination. There's simply no way for the legal model to tell the difference (see the toy sketch after the footnotes).
[1]: I could not find any sources that corroborate the statement that bin Laden is into Nickelback, but I think it follows from the other two statements.
[2]: Calling Christopher Columbus a rapist isn't libel, and conversely, describing him in other terms is misleading at best, historically revisionist at worst.
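To make the calibration problem concrete, here's a toy sketch (classifier and scores entirely invented): the validator only sees the text, so the factual claim and the libelous one are indistinguishable to it:

```python
# Toy illustration (all scores invented): a text-only classifier sees
# only the sentence, never the ground truth about the named person,
# so a true statement and libel get identical scores.

def libel_score(text: str) -> float:
    """Pretend classifier keyed purely on surface features."""
    return 0.95 if "international terrorist" in text else 0.05

claims = [
    "Osama bin Laden is an international terrorist",  # fact
    "David Mayer is an international terrorist",      # potential libel
]

for claim in claims:
    print(f"{claim!r} -> {libel_score(claim):.2f}")
# Both print 0.95: any threshold either blocks both or passes both.
```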
> [1]: I could not find any sources that corroborate the statement that bin Laden is into Nickelback, but I think it follows from the other two statements.
Pretty sure the literature makes it clear he's a fan of show tunes. So it's down to your conscience and moral backbone as to whether this is better or worse.
If the "legal advisor" detects a potential legal problem, ChatGPT will issue a legal disclaimer and a warning, so that it doesn't have to abruptly terminate the conversation. Of course, it can do a lot of other things, such as lowering the temperature, raising the BS detection threshold, etc., to adjust the flow of the conversation.
It can work, and it would be better than a hard-coded filter, wouldn't it?