Speaking as a (former) ChatGPT plugins developer: our business has absolutely tanked due to GPTs. Discoverability is nonexistent because search is just buried by spam. At least with plugins, there were only a few hundred to sort through, and most had some unique API they would plug into.
For context, we built ChatOCR, an OCR tool that lets users extract text from photos and PDFs. We made roughly $20k from 39,000 users over 6 months on the plugins catalog.
There's a conflict in value proposition here, which is more drastic than the discovery issue. That might help you better contextualize the failure, and avoid similar risks in the future.
Namely, GPT is now a multimodal LLM capable of doing OCR over PDFs and images. Given the accessibility of that feature, we expect that users no longer try to discover OCR plugins - they feel no need to.
Out of curiosity, what was the contingency plan in the case that OpenAI did this? What rationales did you use to estimate the likelihood and severity of that risk? Were there good reasons to discount that risk?
We knew the time would come, but we built ChatOCR in a week. If we had overthought the time-horizon problem, we'd have had 0 users and $0.
But also, GPT-4-Vision is multimodal but does not specifically use OCR. Our tool is mostly used to extract text from documents and load it into context, and we still saw growth after OpenAI built this feature into ChatGPT.
You can't get a large organization to move fast and ship something in a week. So this was either a solo developer endeavor, or at most a team in the low single digits.
Thank you for sharing your risk thinking here. I like that you are risk-hungry, not prone to hesitation, and that you achieved revenue on a shoestring budget.
I'm curious about the idea that your OCR-generated context might give better recall than whatever GPT uses to parse PDFs into context. That would support the idea that it is discoverability, and not feature parity, that is killing the product. Can we discuss benchmarks that show conclusively that your solution does better on QA accuracy with respect to a PDF, compared to GPT-4's own PDF mode?
Also, I'm curious about the trajectory of active users after the feature dropped from OAI. Was there an inflection point where net new active users per month went negative? How long did it take before that inflection happened? Do you still have enough users to keep the service up and gathering revenue, or do you plan to mothball? Do you have any plans to pivot the offering to other markets?
My analysis of the platform risk of the ChatGPT plug-in store depends on:
1) Risk concepts like failure modes and risk hedging. The outcome in the plug-in store is an example of a well-known failure mode of platforms that grow very fast.
2) Remaining up to date with trends in the LLM literature, so we can, for instance, predict that major vendors would release multimodal LLMs.
3) Being aware of trends in tech platform dynamics. Specifically the trend toward "platform enshittification" - a term coined by Cory Doctorow - which is the tendency for tech monopolies to exploit their own users and vendors once they can.
I'll talk about each of these a little bit more in depth.
General Risk Analysis
First I recommend reading academic business and cyber risk analysis textbooks. Not just popular business books like "Antifragile" or "The Black Swan" (which are good), but also the academic books that introduced Bayesian networks and Failure Mode and Effects Analysis. You're not going to get away from some math, but thankfully there are techniques that let you draw situations as network diagrams, which I personally find helps with my natural difficulty with mathematics and symbolic processing. You can do pretty decent risk analysis using just "napkin math" - rough graphical and numerical estimates that approximate risks and can sharpen your intuition.
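To make "napkin math" concrete, here is a minimal sketch of an expected-loss table; all the numbers are made-up illustrations, not figures from anyone's actual business:

```python
# Napkin-math risk estimate: expected annual loss per scenario.
# All probabilities and impacts below are invented for illustration.
scenarios = {
    # name: (probability per year, revenue at risk if it happens)
    "platform ships the feature natively": (0.6, 20_000),
    "platform changes revenue terms":      (0.3,  8_000),
    "platform shuts the store down":       (0.2, 20_000),
}

for name, (p, impact) in scenarios.items():
    expected_loss = p * impact  # crude expected value, no discounting
    print(f"{name}: ~${expected_loss:,.0f}/yr expected loss")
```

Even estimates this rough make it obvious which risks deserve a contingency plan.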
I like to recommend cyber risk analysis too, since it makes sense to be able to accurately assess your organizational cyber risks as an entrepreneur. It shows us how beautiful and interdisciplinary risk management can be, and ideas like adversarial modeling, the attack kill chain, attack sequences, and the attack graph are valuable in ordinary business risk management, like coping with insider threats or adversarial competitors.
Technical Observation
Observing the development of the technology is also important, and with AI we have the advantage that most of the development is happening in publicly accessible research papers on arXiv.
By following the AI literature and evaluating LLMs and their capabilities, I made these two observations:
1. Multimodal and OCR-assisted LLMs were hot in the AI literature early last year, and that approach was very helpful for reducing the hallucination/bullshitting problem - AI's biggest usability problem to date. Some of these papers were from OpenAI themselves.
From this we could assume that the major AI vendors would work very hard to integrate this feature, and suppose that third party OCR solutions would get kicked out of the market very quickly once they did. We have seen that with GPT, Gemini, Claude, and more.
2. GPT-4 continues to beat the pants off of every single competitor, in the hundred-plus benchmark papers I've read, in public benchmarks like Chatbot Arena, and in my own personal and business use-case benchmarking. There's basically no contest on quality, and the result is essentially 100% capture of the market of serious users who depend on GPT-4 quality and are sophisticated enough to evaluate AI quality.
From this we can assume that OpenAI is well positioned to do everything it can to retain this advantage, even to the point of undercutting developers who think it is safe to use the GPT Store to build general improvements for GPT's weaknesses.
To gain this kind of technical insight, I highly recommend reading many AI papers, internalizing their insights, and, if possible, working hands-on with their experimental techniques. Being aware of AI's current weaknesses and benchmarks, and being able to name papers working on overcoming specific problems, allows us to predict what might be coming next from the big vendors.
Platform Risk Analysis
To gain insight into why large corporations tend to hurt their own developer communities and users when they gain the opportunity, I highly recommend Cory Doctorow's 2023 article, "The Enshittification of TikTok". It covers a good number of other examples besides - including Apple's and Google's app stores, which are relevant to understanding the outcome for the GPT plug-in store. And it does a very good analysis of the user- and vendor-exploiting strategies on TikTok.
Doctorow also wrote a freely available book/audiobook, "The Internet Con: Seizing the Means of Computation", where he analyzes the issue and its contributing factors, and proposes a solution based on regulation. While his solution - heavy regulation - doesn't seem that realistic, it's a good analysis, especially for entrepreneurs who might want to build on top of platforms that carry this form of risk.
Finally, here's a paper that builds on this idea and analyzes how platforms are able to use their algorithmic control over users' attention to increasingly disadvantage a variety of stakeholders, including users, vendors, and advertisers:
I recommend a variety of sources, including literature on risk management itself, literature about the technology across a large variety of application domains, and staying aware of market trends and forces, like the pressure to take advantage of non-financial stakeholders once a platform reaches market dominance.
My general AI business strategy is to focus on my sub-niche of technical work where AI has not really penetrated - automating 2D spline processing operations that have been ignored by AI researchers due to their focus on point clouds, radiance fields, bitmaps, and meshes, which are more important and applicable to a huge variety of domains. I focus on providing geometric processing services directly to manufacturers dealing with 2D vector data from CMM systems.
The CMM vendor I'm specializing in has been lax in adopting automation technology, and the customers I'm selling to are very reluctant to invest in improved CAD software, even when it could reduce their own labor by a factor of three or more. So between those two gaps I have a niche which I have exploited for 3 years now, and which I expect to exist for at least another 3 years. I plan to turn my CAD automation tools into an attractive product by that time, and I also hope to advance the state of the art in AI techniques for reverse engineering 2D splines from CMM data.
I hope that lets me hold on to the niche until 2030. But I also would not be very surprised if my niche gets disrupted by one of the major CAD or CMM vendors in that time.
My worst-case outcome seems to be that I would have to join a more solidly established corporation as a research manager, strategic analyst, or software engineer. That could be an improvement in my quality of life, given that my current yearly revenues are still under 60k. However, I'm going to fight that with all I've got, because I love owning my own company so much! I am an autistic entrepreneur, and owning my own company has been a huge breakthrough in my independence and ability to avoid burnout. I can get by on as few as 10 hours of labor a week, which would be very hard to find in a corporation.
Anyways, I hope that this message has been beneficial to you even though it grew a bit long. I find Risk Management very fun and rewarding to study and apply, and hope you will as well! I wish you luck in all your endeavors!
> Is OpenAI going to use the Amazon playbook of compete with the more popular products?
They overtly target universal capability for the model itself. It'd be surprising if they didn't, within that, prioritize functionality that has demonstrated demand for use alongside the model's other existing functionality, which is exactly what a successful GPT or plugin demonstrates.
I've heard it called "Platform risk". Also "Playing in someone else's walled garden", or something along those lines. I realize that's not a term for the inevitable rugpull, but that's the closest I can think of.
That one is a little different. That's when Apple clones your app into their OS as a core feature, thereby completely killing your market.
It usually implies Apple deliberately studied your specific app or replicated it based on details revealed in B2B licensing/acquisition meetings, similar to what MS pulled with Stacker back in the 90s.
Do end users have to pay you each time or buy access or credits or something, or do you just get a cut of ChatGPT paid subscriptions when a paying user uses your plugin?
We use a third party plugins manager called pluginlab.ai. It manages auth and subscriptions for plugins users by prompting them to sign in when they hit a paywall.
One would think that a company creating cutting-edge AI tools would dog-food that AI bigly. Like using it for discoverability here, or having a chat-with-docs system, like groq does. But that doesn't seem to be the case. Though they do use it for prompt generation.
I wonder... the prompt generators can't really go wrong. I mean, they can look better or worse, subjectively. But a Q&A bot answering questions about docs can be objectively wrong. Is this calculated to improve the optics and avoid a lot of hallucination complaints?
Not sure what you mean: OpenAI's App Store doesn't charge and has discoverability. (hence the viability of the spam, hence TFA)
OP is likely affected by the change from plugins to GPTs
Now, users have to manually select an app. Whereas before, you'd have a list of, say, 5 apps and the AI would attempt to intelligently determine which app to use.
OpenAI said plugins didn't find product-market fit, and that's why they moved on.
It's worth opining that there's an air of rushing around doing nothing in their product development. Instead of asking "what should it be?", building it, and sustaining it, it's a pastiche of every startup shibboleth you've ever heard: sprints; if 10-20% of the user base / 10^7 users aren't using it within 3-5 months, it's time to Pivot (throw away the working solution we invested in).
That makes sense when you're at 100 users in an uncertain market but at their scale, it reminds me more of how Google ended up with the reputation it has.
* forgive me for not air-quoting app and app store, these aren't __apps__ per se, but it was immensely distracting, with an air of condescension, when I air-quoted everything
> Not sure what you mean: OpenAI's App Store doesn't charge and has discoverability. (hence the viability of the spam, hence TFA)
I think the point was that the app store charge acts as an effective filter for spam, enough so as to determine the sustainability of the app store ecosystem.
I could very well be mistaken though, that was just how I interpreted it.
Almost immediately after OpenAI opened its API there were a raft of services to send auto-generated sales and marketing emails. Sure, there are other ways to generate unsolicited commercial emails. There are even services that will spam on your behalf. But, there's obviously a huge difference between an empty text editor and a service that generates plausibly phrased emails with virtually no effort.
Similarly, the worst kind of SEO grifters are now providing services which can mass-generate thousands of longform "articles" on your behalf based on nothing but a handful of keywords, or even scrape a competitor's site and bulk-generate articles based on their content. SEO spam isn't new, but automating it on this level is unprecedented.
A recent example of this was "theresawikiforthat", which generated sites like ocamlwiki.com and juliawiki.com that were full of very low-quality information. They are offline now, but the ocamlwiki was consistently toward the top of Google search results for a while last year.
I would like to write a bot for these kinds of websites, but subvert the whole thing, have no ads, no waffle, high quality info. Get to the point as fast as possible.
The danger is that we rely almost entirely on Google to solve this.
The monopoly they have as a search engine means there is effectively sole responsibility on them to maintain a usable internet.
I genuinely feel sorry for the people that now have to figure out how search engines work in a world where anybody can generate reams of passable (albeit not unique) content in minutes.
While the Google Search people are scrambling to figure out how to sift through genAI slop, another team on the other side of the Google campus is building a platform for publishers to generate that slop. By joining the AI arms race they've put themselves in the awkward position of contributing to the problem that's killing their own core product.
Yup, I saw Google's official YouTube channel comment on an SEO channel saying these were good tips, and the tips were basically "let AI generate a bunch of content for you."
Google is out of ideas and they are lost as a company. They will take a long time to die out but die out they will.
Like all automation tools, LLMs are very effective at getting more done.
They're not content-neutral (no violence, no sex, etc., which some people complain about very loudly), but for anything not forbidden, gpt-3.5-turbo-0125 can output more for about $4233.60 than a human record holding stenographer can type in a working lifetime.
When it doesn't have to be amazing, and I think spam is one of many things in that category, this is absolutely a huge deal compared to a text editor.
Similarly, the Haber–Bosch process is no more a tool to produce crop fertilizer than feeding corn to a cow is.
Seriously though - the ability to industrialize and mass-produce that both LLMs and the industrial ammonia process have given us, for the bad and the good, is the real game-changer here. Treating two tools the same because both can give you the same singular result is an error if they achieve those results at dramatically different speeds - by several orders of magnitude.
> These findings, combined with earlier results on synthetic imagery, audio, and video, imply that technologies are reducing the cost of generating fake content and waging disinformation campaigns.
That paragraph comes from this [1] now-deleted blog post from OpenAI in 2019 when they decided not to release GPT-2 due to "concerns about malicious applications of the technology". It's hard to argue that GPT and friends are not "tools to generate spam" when the researchers themselves argued that point years ago.
I think the majority of explosives are used for fireworks and mining, but I honestly have no idea how to get an accurate count of explosives set off in mining vs. explosives set off in warfare.
I'm pretty sure that the majority of explosives, by equivalent TNT or by dollar value, are used in nuclear warheads for strategic deterrence, and thankfully have never been set off. But it would be an interesting challenge to estimate all of these, indeed.
I think you’ll appreciate the origins of the Nobel Prize.
Alfred Nobel was born on 21 October 1833 in Stockholm, Sweden, into a family of engineers.[11] He was a chemist, engineer, and inventor. In 1894, Nobel purchased the Bofors iron and steel mill, which he made into a major armaments manufacturer. Nobel also invented ballistite. This invention was a precursor to many smokeless military explosives, especially the British smokeless powder cordite. As a consequence of his patent claims, Nobel was eventually involved in a patent infringement lawsuit over cordite. Nobel amassed a fortune during his lifetime, with most of his wealth coming from his 355 inventions, of which dynamite is the most famous.[12]
There is a popular story about how, in 1888, Nobel was astonished to read his own obituary, titled "The Merchant of Death Is Dead", in a French newspaper. It was Alfred's brother Ludvig who had died; the obituary was eight years premature. The article disconcerted Nobel and made him apprehensive about how he would be remembered. This inspired him to change his will.[13] Historians have been unable to verify this story and some dismiss the story as a myth.[14][15] On 10 December 1896, Alfred Nobel died in his villa in San Remo, Italy, from a cerebral haemorrhage. He was 63 years old.[16]
Yes, they ARE more a tool to generate spam than a text editor is, because they can do everything a text editor does, but in addition they can also generate the text. So, strictly more spam.
It was a huge mistake to monetize GPTs. I and many other AI enthusiasts would have probably developed plenty of useful GPTs for free. But now I would never bother because nobody will see my projects under all the SEO spam and fake ratings.
That and even if they had proper reviews in place, there is no telling when they would pull the plug on the entire thing. Based on that alone I never even considered making a GPT for sale.
Having said that, I do like them because of the OpenAPI integration. I made one for myself that can send things to a private Discord channel through webhooks and can call my memos instance (https://github.com/usememo) to store or retrieve memos.
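For anyone curious what the webhook side of an action like that amounts to, it's roughly one authenticated POST. A minimal sketch (the webhook URL is a placeholder; in a GPT this call would be described by the action's OpenAPI schema rather than written as Python):

```python
import requests

# Placeholder URL; Discord webhooks accept a JSON body with a "content"
# field and return 204 No Content on success.
WEBHOOK_URL = "https://discord.com/api/webhooks/<id>/<token>"

def send_to_discord(message: str) -> None:
    resp = requests.post(WEBHOOK_URL, json={"content": message}, timeout=10)
    resp.raise_for_status()

send_to_discord("Saved from my custom GPT")
```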
When you say it's monetized, are they actually giving anyone any money? The details of that were to "to be announced" but I haven't heard any announcement.
It seems that just the 'threat' of monetizing it would cause an absolute flood of GPTs to be launched, under the assumption that they would eventually get paid.
It's the fastest growth product in number of users in the history of the internet, to think they wouldn't run the Apple store playbook as much as possible is naive.
Reminds me of an early rumor back when Sam Altman was fired (and briefly re-hired) that the old board allegedly was very much against the GPT store.
I kind of get why now. It arguably doesn't really aid in the non-profit's mission to pursue AGI, or share the benefits of AI with humanity, or any of that stuff. Doesn't even seem to be much of a worthwhile endeavor from a purely money-making perspective either.
Yeah, but does it matter? Most of their revenue comes from enterprise clients and it seems more likely that someone is going to create a killer app for AI by leveraging APIs and publishing their work on traditional distribution channels rather than creating something that only works on a relatively niche platform.
Like, hypothetically, if Anthropic took the lead in this effort, I don't think it would derail OpenAI's plans in the slightest.
A spam-filled online marketplace is almost becoming a feature. That way the top developers have to pay and outbid each other to get featured and show up in search results, and that becomes a lucrative revenue source for the platform.
As any company should. Not sure what's funny about it - I'm interested in the person. If using AI is one of required capabilities I will ask them to perform a task using it during the interview so I can review their process. In the same way I don't expect to receive an executable binary as a cover letter for a SWE position.
I had many candidates submit absolutely terrible cover letters clearly made by ChatGPT. The number one sign: it talks about their experience in exactly the order I listed it in the job post. Number two: I don't get to know anything about the person. I asked them to submit a cover letter they wrote themselves, and while not everyone's writing was as good as ChatGPT's, I got to know the person, and hired a few of those who definitely wouldn't have made the cut if I had gone by the AI-made letter.
If you can have AI generate your letter so well that I can't tell, you get a pass. Nobody has managed that yet, though (or at least nobody admitted to it, even though I ask after they're hired and point out that at that stage it would only help them if they did).
> The number one sign - it talks about their experience exactly in the order I listed it on the job post
I wouldn't attribute that to AI; it's a very natural way to write a cover letter. You want to hit the bullet points in the job posting, and the most logical way to do that is to go down the list and check boxes.
I've never seen that from an LLM; usually the error they make is that when I've asked for too many things (a quantity which varies wildly), I get a response for about 3 of those things and a note saying something to the effect of "put more here".
A very common piece of advice is to include keywords in your resume in exactly the way they are typed in the advertisement. This is because of places that use ATS systems to filter down the "best" candidates based on how closely their skill set matches the requirements. For example, if the requirement says "Must have proficiency in the Python scientific computing ecosystem, including TensorFlow and Scikit-Learn," if you subscribe to the keyword methodology, you'd make sure to sprinkle in the exact phrase "Python scientific computing ecosystem" on the off chance that some non-domain-expert in HR copied and pasted into their ATS.
I can see how someone would take that advice and include exact phrases in their cover letter in an attempt to get through the filters.
Most companies are using AI to filter out CVs these days, and cover letters written with ChatGPT seem to have a higher success ratio. I wouldn't blame anyone for using it, unfortunately.
Maybe in first rounds. I am the third/fourth round interviewer. If a cover letter like that gets to me, it's automatically a no - not because of the fact itself but simply because the cover letters are terrible.
Don’t you see how the incentives become impossible? The AI stage needs you to be as obvious and direct about your experience as possible, the human stage gets irritated when you write like the reader doesn’t have a brain.
Personally I send my resume through a GPT along with the job posting and ask if I'd be a good fit. Almost always the GPT will initially say no, because I'm using some terminology that anyone in the industry would understand is a form of the posting's requirements, but the GPT does not. But then perhaps the screening recruiter doesn't either. So why not be specific? But then anyone else at the company might think I'm a moron... ugh.
I suspect it depends on the field, but in my experience it’s often a good indicator of success. A shoddy letter betrays a lack of professionalism. If you’re not going to spend 10 minutes running a spell checker it means that either it’s not important to you, so good bye, or that your standards are really low and you will keep producing garbage if I hire you, so good bye as well.
The problem with cover letters is false positives (people who did not write the letter themselves, or who did it to a much higher standard than their usual). But then, that's what interviews are for.
I did hire a couple of people with either a subpar CV (some good people sometimes end up in dead ends or difficult situations), or cover letter (not everyone is a great writer), or reference (your issues with your former boss are not always your fault), or interview (you can have a bad day), so I would not rely on a single factor. But a combination of 2 dodgy elements is an automatic rejection. Each one tends to surface different aspects.
Consider that job seekers today sometimes have to apply to hundreds of positions before being hired. For most of those applications, they won’t hear anything back. Given that dynamic, would you spend time polishing a cover letter for each application?
In my field, certainly. We expect them to prepare a presentation for the interview as well. Our pond is not large enough, there are not hundreds of positions to apply to.
I guess my thing is that I don’t want to jump through hoops, I want to write code. You know how some people are enamored of those little ‘shit tests’ for relationships, they think they’re really clever? I have no patience for that.
Show me who you are and I'll show you who I am, and if we like each other, then let's try to get together and see what we could build together. If the entire interview process is a series of hoops that you dictate I jump through to even have a chance with you, then why should I expect my future with you to be any different? I don't want that.
I don't want to write a carefully crafted letter pretending that I want anything other than respect, money, and a chance to do interesting work from a potential employer. I just want to write code. The rest of it is extensive shit testing in my book and I have no patience for that. I don't put up with that kind of treatment from anyone else - what makes you think dangling money in front of me would change my mind about that?
You have a CV, three rounds of notes, usually a LinkedIn profile, and you're arrogant enough to dismiss people because they had to put up a crappy cover letter to get past your stupid ATS, and you still don't see how shameful your behavior is.
Lol. If all you can give me is a crap letter that says nothing about you and just lists what I wrote in the ad and how excellent you are at it, then I'm going to go with the people who excitedly told me about their actual experience and related interests. Don't see anything shameful about that.
CV is nice except that most people have a list of "Senior Software Engineer at XYZ" and that's it. Doesn't tell me anything, so again, if that's what you're going with, I'm going to prefer the person who took the time to actually talk about their related experience and interests. LinkedIn is nice but same problem as with CVs.
Seems like you think everyone has perfect detailed CV and LinkedIn profiles... Nope. It's mostly crap that says nothing. If your CV/LinkedIn can take the role of the cover letter then sure, no problem, but I don't see that often.
Not sure how the recruiter's notes about your nice friendly outgoing personality and good English level would help me understand your experience and interests related to the job, but yeah sure lol.
No crappy ATS at our company, BTW. Applications go to the inbox and are handled by real people.
I've been trying to get Gemini to write things for me. It's either way too formal, passive, and evasive, or, if you ask it to be less formal, it's like "Yo dude, give me a job!".
I find it quite hard to find a middle ground and write in a style I like.
Paid internships are common, and LLMs are so much cheaper than humans they might as well be free.
The prices on OpenAI's website are listed per million tokens. The cheaper model can read a typical book for less than the cost of that book in paperback, second hand, during a closing-down discount sale, and even the fancy model is still in the range of a cheap second-hand paperback (just not during a closing-down sale) for reading, and just about the price of a new cheap paperback for writing - cheap enough it might as well be free.
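A rough back-of-the-envelope version of that claim, with the per-token price treated as an assumption (published prices change; check the current pricing page):

```python
# Back-of-the-envelope reading cost for the cheap model.
# The price below is an assumption for illustration, not a quoted figure.
price_per_million_input_tokens = 0.50   # USD, assumed
words_in_typical_book = 90_000
tokens_per_word = 1.33                  # common rule of thumb for English

tokens = words_in_typical_book * tokens_per_word
cost = tokens / 1_000_000 * price_per_million_input_tokens
print(f"~{tokens:,.0f} tokens, ~${cost:.2f} to read the whole book")
```

Which lands in the range of a few cents - well under even a heavily discounted second-hand paperback.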
Plus, they're an intern at everything all at once. You don't get to hire someone who has a late-degree-level grasp of programming (in every major programming language at the same time) and genetics and law and physics and medicine and who has a business-level grasp of all the most common written languages on Earth — there is no such human.
(If and when we can make an AI, be it an LLM or otherwise, which can act at the level of a senior in whichever field it was trained on, even if that training takes a gigawatt-year to compute per field, and even if inference costs 100 kW just to reach human output speed, it's going to seriously mess up the whole world economy; right now it's mainly messing up the economic incentives to hire people fresh from university while boosting the amount of simple stuff that actually gets done because normal people and not just business can now also afford to hire the weird intern).
> and might even grow to good hires.
Even if they do, will you be the one to hire them?
Also, ChatGPT has so far existed for about as long as I spent on my (paid) year in industry that formed part of my degree. It wasn't called an internship, but it basically was. In that time, the models have grown and improved significantly, suggesting (but not proving, because induction doesn't do that) that they will continue to grow and improve.
It is impossible to divorce AI from spam; the two go hand in hand. The spam situation will get far, far worse than ever imagined with AI. The web will be increasingly clogged with AI-generated content (e.g. videos and articles) that can do a reasonably good job impersonating human-generated content to the undiscriminating reader. Same for AI-produced research papers, plagiarism, etc. AI-generated content can still be identified, but that is not the point. It only has to be good enough to pass muster for the average person or anyone who is not paying close attention.
One area that's getting close to hitting the mainstream is AI agents that control web browsers or desktop apps. Basically, you tell your AI what you want to do, and it does the computer work for you, showing the results the way you like. You won't have to dig through tons of useless apps anymore. It'll search, sort, and create feeds for you. This is what scares advertisers the most - when we start saying, "Just talk to my AI." God knows I'd love an AI that can fill tax forms for me.
Operating on top of the UI, it is compatible with all devices, kind of like how an android robot would fit well into all human spaces and tools. There is plenty of screen capture with narration to learn from. You can generate more training data by essentially giving free rein over a VM to an AI agent. Ask it to do office work. We can also collect this kind of data very easily.
> Basically, you tell your AI what you want to do, and it does the computer work for you, showing the results the way you like. You won't have to dig through tons of useless apps anymore. It'll search, sort, and create feeds for you. This is what scares advertisers the most
Not just advertisers. There isn't much point in creating any sort of fun content on the web if it never actually gets seen by human eyes, but instead is just regurgitated into a slurry.
Funny, I like to play an LLM game - copy-paste a whole conversation thread into an LLM and ask "Write a 500 word article based on this material". The result is well-grounded, enjoyable "regurgitated slurry".
It's a match made in heaven - people's opinions debating stuff and LLMs ability to phrase that like professional written articles. Grounding and style.
I'm afraid in the future not even our social network posts will be directly read by other people. Everything will pass through an AI extractor-reformatter.
As an experiment, copy paste a few pages of comments into a LLM and see the quality of its outputs. I often grok a new angle or make a new connection after reading the AI text.
Here's a sample, what it understood from our thread: "The discussion touches on concerns around AI interfering with traditional content creation, advertising models, and how we discover and consume information online as AI language understanding and generation capabilities advance." Pretty spot on framing.
> Here's a sample, what it understood from our thread: "The discussion touches on concerns around AI interfering with traditional content creation, advertising models, and how we discover and consume information online as AI language understanding and generation capabilities advance." Pretty spot on framing.
Could you clarify what value, exactly, you're finding in that series of words?
> God knows I'd love an AI that can fill tax forms for me.
Why? A good accountant can do it for an insignificant[1] annual fee, and if he screws up, you know where to find him.
Or if you're poor enough to really need to save a hundred bucks once a year, you probably qualify for freefile, probably have a really simple tax situation (1 or more W2s, end of story), and can file really quickly.
Have you used an e-file program lately? It, much like a human accountant, asks a bunch of questions. Not sure what an AI could do differently, except fuck up and ask the wrong questions confidently once in a while.
The US has some screwed up tax filing practices but throwing more magic at them would only make it worse. If AI does my taxes, who do I get mad at when it files it wrong and the IRS comes after me? Why are we even talking about AI to apply to a problem that really isn't particularly messy and is already easily solved by good data import and by a nest of ifs and elses maintained once a year by a team of programmers who can read tax laws?
Like, we could make tax filing unnecessary tomorrow for most people with no AI, just by passing a law that allows the IRS to use the imported W2 data they already receive to compute and file default tax returns for most people who have simple tax situations.
I recall imagining something quite like this as a kid when I first encountered the term "User Agent".
I'd be all in on this if I knew I had full control and explainability of its decisions, but am afraid that reality will be a lot more depressing than what I imagine.
For a very immediate example, I can already see a clear path to somebody coming up with an LLM service that will automatically filter out political spam texts and emails. (Maybe it already exists and I just haven't seen it yet.)
On a related note, in 2002 Paul Graham arguably invented (IIRC) the spam filtering that is now commonly in use. While email spam still exists today, it was way, way, way, way worse before Gmail introduced Bayesian spam filtering around that time. Today it's an occasional amusement. In those last days before spam filtering got a lot better, it was reaching the point of seriously making it difficult to use email, because the majority of your inbox was spam.
The crazy part is it's really just this simple Bayesian approach that solved like 95% of the problem. Simple math that already existed. Stuff that can be explained to a high schooler. No new crazy ML/AI, none of that. Just someone stepping back and pointing out "hey, I think we can use this tool here...."
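The core of that word-probability idea fits in a few lines. A toy sketch (not Graham's exact weighting, just the shape of the approach):

```python
import math
from collections import Counter

# Tiny hand-made corpora; a real filter would train on thousands of messages.
spam_docs = ["cheap pills buy now", "win money now", "cheap money offer"]
ham_docs  = ["meeting notes attached", "lunch tomorrow?", "project status update"]

def word_counts(docs):
    counts = Counter()
    for doc in docs:
        counts.update(doc.lower().split())
    return counts

spam_counts, ham_counts = word_counts(spam_docs), word_counts(ham_docs)
spam_total, ham_total = sum(spam_counts.values()), sum(ham_counts.values())

def spam_score(text):
    # Sum of per-word log-likelihood ratios, with add-one smoothing.
    score = 0.0
    for w in text.lower().split():
        p_spam = (spam_counts[w] + 1) / (spam_total + 2)
        p_ham  = (ham_counts[w] + 1) / (ham_total + 2)
        score += math.log(p_spam / p_ham)
    return score  # > 0 leans spam, < 0 leans ham

print(spam_score("buy cheap pills now"))     # positive, leans spam
print(spam_score("notes from the meeting"))  # negative, leans ham
```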
There was a lot more lower hanging fruit in those days.
Cool! :)
And thanks! As you can guess I still very pleasantly remember when the feature hit and the spam issue got so much better.
I always wonder which pieces of my code from my career I'm proud of are still running somewhere. Unfortunately, it's mostly been proprietary stuff so there's really no telling.
Seems like that would be very straightforward, to the point that I could probably do it, though maybe not in the most elegant way. I would be surprised if there wasn't a model on Hugging Face trained to identify the political leaning of some text. I bet it would be small enough to run locally; then it's a matter of creating a plugin/module or something that you add to your email client that runs on every new mail. Based on the result, move the email to your political spam folder.
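A rough sketch of what that plugin's core could look like, using an off-the-shelf zero-shot classifier in place of a purpose-trained "political spam" model (the labels and threshold here are arbitrary choices, not a recommendation):

```python
from transformers import pipeline

# Zero-shot classifier standing in for a purpose-trained political-spam model.
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

def is_political_spam(email_text: str, threshold: float = 0.7) -> bool:
    result = classifier(
        email_text,
        candidate_labels=["political fundraising", "personal", "work"],
    )
    # Labels come back sorted by score, highest first.
    return result["labels"][0] == "political fundraising" and result["scores"][0] > threshold

# The email-client hook would run this on each new message and move matches
# to a "political spam" folder.
print(is_political_spam("Chip in $5 before tonight's deadline to save our campaign!"))
```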
LLMs are just quite compute-intensive. You could send emails to the OpenAI API to classify them as SPAM or HAM, but you'd probably cry at the bill. That should explain why it doesn't exist in free-tier accounts. Worry not: even if you belong to the GPU-poor, a beefy CPU should allow you to run a small LLM on your local machine. And if you find a good dataset for it, you can train the model in a Google Colab.
You can configure Postfix to filter content based on a shell script. https://www.postfix.org/FILTER_README.html You can turn the email into plain text from your shell script using links and then ask Mistral if it's spam or political: https://justine.lol/oneliners/#url You can use the grammar flag to force Mistral to only provide a YES or NO answer, which is easy to check from your script using an IF statement. Then boom, no more spam or politics.
I recently did a Kaggle experiment and fine-tuned an LLM to classify SMS as SPAM or HAM. It was not so difficult; following the text classification example in the Hugging Face transformers docs was enough. Without trying hard at all, accuracy was above 90%. Impressive, even though even the messages marked as HAM seemed to be quite trashy. The model itself was small - a few hundred million parameters only. Models of this size run well on CPUs, and an even smaller one might have worked too. Non-neural classifiers might also do fine.
You could do something similar and unleash it on your mailbox. The tricky part is the integrations. At work, I don't even bother, since interacting with the email system there is way more tricky than with Gmail.
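For reference, the Hugging Face recipe described above is roughly this (a sketch, not the exact notebook: "sms_spam" is a public dataset on the Hub, and the DistilBERT choice is my assumption for a small model):

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Public SMS spam/ham dataset; columns are "sms" (text) and "label" (0/1).
dataset = load_dataset("sms_spam", split="train").train_test_split(test_size=0.2)

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2)

def tokenize(batch):
    return tokenizer(batch["sms"], truncation=True)

tokenized = dataset.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="sms-spam-clf", num_train_epochs=1),
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
    tokenizer=tokenizer,  # enables dynamic padding via the default collator
)
trainer.train()
print(trainer.evaluate())
```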
The Apple App Store (and Google Play Store) have a fairly large barrier to entry. To develop an app, you needed to actually code one. Learning Objective-C, Swift, or Java isn't easy, and therefore you had to have some skill to gain entry. With the GPT Store, anyone who could type out some instructions could build a GPT and submit it. There was zero barrier to entry, and the flood of spam was a foregone conclusion. They should have either vetted the GPTs, invited select developers, or charged a large developer fee for inclusion to increase the quality and reduce the spam.
And you have to pay, and there's a review process, and basically anyone who has released apps has had a submission rejected - the reviewers deny generously.
As I had written about publicly already, this doesn't surprise me a bit. Same with poe.com, GPTs can be generated in just a few clicks by anyone anywhere with no quality control in sight.
My guess is that it was always a play about having access to high-quality data.
> OpenAI’s terms explicitly prohibit developers from building GPTs that promote academic dishonesty. Yet the GPT Store is filled with GPTs suggesting they can bypass AI content detectors, including detectors sold to educators through plagiarism scanning platforms.
The use of generative AI in academic contexts--both education and research--is turning out to be a vast ethical gray area. It seems that some people regard any use of AI for writing under one's own name to be dishonest. Others are willing to allow some AI use, but where they draw the line varies a lot. Is it okay, for example, to have an LLM correct word-level grammatical mistakes, rewrite a paragraph to make the author's point clearer, write a paragraph based on the author's bullet points, write a full first draft that the author then checks and revises by hand, translate phrases or sentences that the author has written in their native language, translate an entire paragraph or paper that the author wrote, etc.?
Over the past year, I have conducted several workshops on LLMs for university faculty in Japan, and when I have polled them about the acceptability of each of those use cases, their responses have varied widely.
At the institutional level, some universities discourage their students from using AI for writing at all, while others seem to be encouraging it. Just last week, I heard about two public universities in Japan that have contracted with Microsoft to provide Copilot to all of their students starting in April, when the new academic year begins here.
I use GPTs like bookmarks. If I find an interesting or useful prompt, I bookmark / GPT it, maybe add a tool, later, and sometimes I share it, too. For me, Custom GPT is just a flashy term for bookmark and the GPT Store is just a social bookmark site.
It was doomed from the start, because monetizable computing platforms that target consumers can't succeed if they don't have high barriers to entry.
Low quality Atari games were responsible for the 1983 video game crash. Anyone could publish an Atari game. So what ended up happening was people made games about things like raping Native American women (Custer's Revenge) and they were sold just like any other Atari cartridge.
Nintendo pioneered the closed platform. No one was allowed to build games for Nintendo without Nintendo's permission. Microsoft and Apple did things a bit differently, by making their platforms as difficult to develop for as possible. Win32 is byzantine. Making iPhone apps when the App Store was launched required learning Objective-C, which was an entirely new language to most developers at the time.
Every GPT I've tried has been GPT-4 plus or minus an imperceptible difference in quality. It's like Atari publishing a game, and thousands of clones show up that are functionally the same, and zero other original games.
They should scrap this failed idea and just give us MedQA, AlphaCode, etc. Specialized GPTs based on custom RAG or symbolic reasoning overlay or whatever, that actually do something that base GPT-4 can't do, in the sense of having an unmistakable step-change in quality for one type of task.
> So what really happened? Most likely little quality control, garbage copy cats running around and now there is an insurmountable amount of spam.
That, and there isn't a lot of space for real innovation within the "GPT" framework that OpenAI provides. Most of what's available in their store is literally just prepackaged prompts, and there's only so much that can accomplish. You can ask a chatbot to pretend to be an expert in a topic all day, but that won't actually make it into one.
If Apple's App Store were just a form you filled out to release versions of Flappy Bird with new graphics or whatever, that'd fail too. Developers need conceptual "space" to innovate; I'm not convinced that's possible when the interface (a chat window) and the back end (a language model) are both set in stone by the platform.
I think it has a fundamentally different problem. Any GPT I have tried, I could just ask ChatGPT-4 the same thing, so I haven't bothered exploring much.
I am still waiting to hear of a GPT that is obviously better than ChatGPT-4 by itself. Maybe it exists, but I haven't heard of it.
Short of GPT-5-powered GPTs while I can still only talk directly to GPT-4, I don't think I am going to be that impressed with anything using the ChatGPT API compared to just asking ChatGPT-4 directly or using the API for my own needs. I don't need an interface for what I want to do with the API; ChatGPT-4 can help me write that myself quite easily. I think the future looks much more like this than some app store knockoff.
Shouldn't OpenAI be able to easily remove these non-complying GPTs by putting up their own GPT for moderation with a set of rules?
They should already have something in place. And if they have, and these GPTs are still up, it can also mean these are OK as per their policy. Or it may be that they are deliberately not moderating these because money.
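Mechanically, a first-pass moderation bot could be as simple as running each listing's name and description through a model with the store policy in the prompt. A rough sketch with the openai Python SDK (the model name and rule text are placeholders, and borderline cases would still need human review):

```python
from openai import OpenAI

client = OpenAI()

# Placeholder policy text; a real check would quote the actual usage policies.
STORE_RULES = "Reject GPTs that promote academic dishonesty, impersonation, or spam."

def violates_policy(name: str, description: str) -> bool:
    resp = client.chat.completions.create(
        model="gpt-4",  # placeholder model choice
        temperature=0,
        messages=[
            {"role": "system",
             "content": f"You review GPT Store listings. Rules: {STORE_RULES} "
                        "Answer with exactly ALLOW or REJECT."},
            {"role": "user", "content": f"Name: {name}\nDescription: {description}"},
        ],
    )
    return resp.choices[0].message.content.strip().upper() == "REJECT"

print(violates_policy("EssayGhost", "Bypasses AI content detectors for your homework."))
```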