IIRC the ChatGPT announcement post actually says the verbosity is an unintended effect of the human raters preferring longer/more detailed answers.
Long answers from GPT are unusually obnoxious because of the way the decoder works: it emits words at a much more constant rate of perplexity than human text does (this is how GPT-vs-human detectors work), which makes it sound stuffy and monotone.
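For the curious, here's roughly what that detection signal looks like in code: score a text with a language model and measure how spread out the per-token surprisal is. This is a minimal sketch, assuming GPT-2 via Hugging Face `transformers` as the scoring model; the model choice and the interpretation of the score are illustrative assumptions, not any particular detector's implementation.

```python
# Sketch of the perplexity-flatness idea: machine text tends to have a
# more uniform per-token surprisal than human text does.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def token_surprisals(text: str) -> torch.Tensor:
    """Per-token negative log-likelihood under the scoring model."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    # Shift so each position's logits predict the *next* token.
    log_probs = torch.log_softmax(logits[:, :-1], dim=-1)
    next_ids = ids[:, 1:]
    return -log_probs.gather(2, next_ids.unsqueeze(-1)).squeeze()

def surprisal_spread(text: str) -> float:
    """Std-dev of per-token surprisal; a flatter (lower) value is taken
    as a weak signal of machine-generated text in this sketch."""
    return token_surprisals(text).std().item()
```

Real detectors combine a signal like this with others, but the core observation is the same: model-generated text keeps the scoring model's surprise suspiciously level from token to token.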
I read in a DeepMind blog post that language models give more reliable and correct answers if they are forced into some kind of chain-of-thought prompt, especially for questions that involve math.
Something like "Think about it first, before you give the answer."
Could it be that ChatGPT is being forced in this direction?
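Here's a minimal sketch of what that prompting difference looks like, with `ask_model` as a hypothetical stand-in for whatever completion API you're calling (not a real function); the bat-and-ball question is the classic example where a direct prompt tends to produce the intuitive-but-wrong answer and a chain-of-thought prompt tends to do better.

```python
# Two prompt variants for the same question; only the second asks the
# model to reason before answering. Wording is illustrative.
question = (
    "A bat and a ball cost $1.10 together. The bat costs $1.00 more "
    "than the ball. How much does the ball cost?"
)

direct_prompt = f"{question}\nAnswer:"

cot_prompt = (
    f"{question}\n"
    "Think about it step by step first, then give the answer.\n"
    "Reasoning:"
)

# ask_model(direct_prompt) often yields the intuitive-but-wrong "$0.10";
# ask_model(cot_prompt) is more likely to reach the correct "$0.05".
```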
> The model is often excessively verbose and overuses certain phrases, such as restating that it’s a language model trained by OpenAI. These issues arise from biases in the training data (trainers prefer longer answers that look more comprehensive) and well-known over-optimization issues.[1][2]
> [1] Stiennon, Nisan, et al. “Learning to summarize with human feedback.” Advances in Neural Information Processing Systems 33 (2020): 3008-3021.
> [2] Gao, Leo, John Schulman, and Jacob Hilton. “Scaling Laws for Reward Model Overoptimization.” arXiv preprint arXiv:2210.10760 (2022).