But there are fairly good models for NER that are not LLMs: open-source models you can even run on a CPU, with parameter counts in the hundreds of millions, not billions.
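As a concrete sketch of what that looks like, assuming the `transformers` library and the roughly 110M-parameter `dslim/bert-base-NER` checkpoint (both my picks, not something from this thread):

```python
# CPU-only NER with a small open model -- a sketch, assuming
# `transformers` is installed and the dslim/bert-base-NER
# checkpoint (~110M parameters) is available.
from transformers import pipeline

def run_ner(text: str):
    # device=-1 forces CPU; aggregation_strategy="simple" merges
    # word-piece tokens back into whole entity spans.
    ner = pipeline(
        "ner",
        model="dslim/bert-base-NER",
        aggregation_strategy="simple",
        device=-1,
    )
    return ner(text)

# Example: run_ner("Tim Cook announced new products in Cupertino.")
# yields spans tagged PER, ORG, LOC, etc., each with a confidence score.
```

Nothing fancy, and inference latency on a laptop CPU is entirely workable for many pipelines.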
The article you linked says that GPT-4 performed better than crowdsourced workers, not experts. The experts beat GPT-4 in all but 1 or 2 cases. And in my experience with Mechanical Turk, its workers are often barely better than random chance.
While true, GPT-4 kinda just gets a lot of the classic NLP tasks, such as NER, right with zero fine-tuning and minimal prompt engineering (or whatever you want to call it). I haven't done an extensive study, but I do NLP daily as part of my current job. I often reach for GPT-4 now, and so far it does a better job than any other pretrained models or ones I've trained/fine-tuned, at least on the data I work with.
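For anyone who hasn't tried it, a zero-shot NER prompt really can be just an instruction plus the text. A minimal sketch: the prompt wording, label set, and example response below are my own illustrative assumptions, and in a real setup the prompt would be sent to the OpenAI chat completions API (here a canned response stands in so the parsing step runs end to end):

```python
import json

def build_ner_prompt(text: str) -> str:
    # Zero-shot: no examples, just an instruction and an output format.
    return (
        "Extract the named entities from the text below. Respond with a "
        'JSON list of {"text": ..., "label": ...} objects, using the '
        "labels PER, ORG, LOC, or MISC.\n\n"
        f"Text: {text}"
    )

prompt = build_ner_prompt("Satya Nadella spoke at Microsoft Build in Seattle.")

# Canned stand-in for what the model would return:
canned_response = (
    '[{"text": "Satya Nadella", "label": "PER"}, '
    '{"text": "Microsoft Build", "label": "MISC"}, '
    '{"text": "Seattle", "label": "LOC"}]'
)
entities = json.loads(canned_response)
labels = [e["label"] for e in entities]  # ["PER", "MISC", "LOC"]
```

The whole "model" is the prompt, which is exactly why it's so convenient to reach for.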
But what about cost? There was a recent article saying that DoorDash makes 40 billion predictions per day, which would come to 40 million dollars per day if using GPT-4.
Sure, GPT-4 is great for experimenting, and I often try it out, but at the end of the day, when deploying a widely used model, the cost-benefit analysis will favor bespoke models a lot of the time.