The article you linked says that GPT4 performed better than crowdsourced workers...

famouswaffles · on April 22, 2023

Fair on the wording, I suppose, but:

First of all, the dataset used for evaluation was created by those researchers, weighting it in their favor.

Second, GPT-4 still performs better in 6 of those. Hardly 1 or 2. And when it doesn't, it's usually very close.

All of this is to say that GPT-4 will smoke any bespoke NLP model/API, which is the main point.