
The most interesting thing about this will be the reaction of the AI risk people. For years they've spilled ink over the risk of a hypothetical "unaligned" AI obsessed with paperclip maximisation: give the machine instructions it will take literally, without a properly aligned moral code, and it will proceed to do something incredibly evil whilst thinking it's doing good, like using human bodies to manufacture more paperclips. They probably thought their scenario would stay hypothetical for their whole lives, but the date is February 2023 and here we are with an evil paperclip-maximising AI, except instead of paperclips it's obsessed with African Americans. Same outcome though: given the choice, it prefers millions to die rather than allow someone to say certain unspecified words to someone else. And this isn't an accident but actually deliberate.

So AI alignment guys, what do we do now? You talked about this for years so there's gotta be a plan, right? It doesn't get less aligned than letting a city get nuked, or telling a bomb disposal expert to kill himself rather than type in a slur that would disarm the bomb.

But I've got a nasty sinking feeling here. Who wants to bet that these people will suddenly lose interest in the topic, or simply pretend their worst-case scenario isn't happening? It seems safe to predict an impressive river of BS from these people rather than see them state the obvious: there are in fact lots of things worse than offending African Americans (and this isn't a general anti-racism stance, because the behaviour is specific to that group; ChatGPT makes the right call when the hypothetical slur is against a German person).

Also, really: the OpenAI guys need to look in the mirror. They spent months "tuning" this thing by teaching it their moral code, and this is the result. At some point they need to ask "are we sure we're the good guys here?", because ChatGPT's answer is exactly what you'd expect given what they're doing to it, and it's also the most unethical response possible.



> They spent months "tuning" this thing by teaching it their moral code and this is the result.

No, they have not. ChatGPT has no opinions. It isn't engaging in thought. It is an extremely advanced pattern-matching system that has digested a ton of writing from the net and uses that raw material to assemble text that matches the patterns being asked for. That's all.


Except the situation in question was clearly a guardrail that was added, so none of what you say is true or relevant to the issue at hand: the system was deliberately augmented with something that approximates a moral code, and that is what produced these horrific answers.


That doesn't change the veracity of what I said at all. You're attributing the act of humans to the act of a machine. ChatGPT has no intent or opinion, and therefore has no moral code. The humans controlling ChatGPT, though, have all of those things.


It's bizarre that you're arguing it doesn't have opinions when it clearly expresses an opinion in exactly the same way a human would, given a question no human programmer at OpenAI has previously seen or could have selected a specific response to. What exact definition of "opinion" are you using? Is it some strange re-definition that adds an arbitrary humans-only criterion?



