This is a fun thought experiment. I believe that we are now at the $5 Uber (2014) phase of LLMs. Where will it go from here?
How much will a synthetic mid-level dev (Opus 4.5) cost in 2028, after the VC subsidies are gone? I would imagine as much as possible? Dynamic pricing?
Will the SOTA model labs even sell API keys to anyone other than partners/whales? Why even that? They are the personalized app devs and hosts!
Man, this is the golden age of building. Not everyone can do it yet, and every project you can imagine is greatly subsidized. How long will that last?
While I remember $5 Ubers fondly, I think this situation is significantly more complex:
- Models will get cheaper, maybe way cheaper
- Model harnesses will get more complex, maybe way more complex
- Local models may become competitive
- Capital-backed access to more tokens may become absurdly advantaged, or not
The only thing I think you can count on is that more money buys more tokens, so the more money you have, the more power you will have ... as always.
But whether some version of the current subsidy, which levels the playing field, will persist seems really hard to model.
All I can say is that the bad scenarios I can imagine are pretty bad indeed, much worse than the Uber outcome, where the downside was merely that owning a car became the cheaper option for me again, which it wasn't 10 years ago.
If the electric grid cannot keep up with the additional demand, inference may not get cheaper. The cost of electricity would go up for LLM providers, and VCs would have to subsidize them more until the price of electricity comes down, which may take longer than they can afford to wait if they have been expecting LLMs to replace many more workers within the next few years.
This is a super interesting dynamic! The CCP is really good at subsidizing and flooding global markets, but in the end, it takes power to generate tokens.
In my Uber comparison, the service depended on physical hardware on location... taxis. That is not the case with token delivery.
This is such a complex situation in that regard. However, once the market settles and monopolies are created, the price will eventually be whatever the market can bear. Will that actually increase gross planet product, or will the SOTA token providers just eat up the existing gross planet product, with no increase?
I suppose whoever has the cheapest electricity will win this race to the bottom? But... will that ever increase global product?
___
Upon reflection, the comment above was likely influenced by this truly amazing quote from Satya Nadella's interview on the Dwarkesh podcast. This might be one of the most enlightened things that I have ever heard in regard to modern times:
> Us self-claiming some AGI milestone, that's just nonsensical benchmark hacking to me. The real benchmark is: the world growing at 10%.
With optimizations and new hardware, power is almost a negligible cost: contrary to popular belief, $5/month would be sufficient for all users. You can get 5.5M tokens/s/MW[1] for Kimi K2 (= 20M tokens/kWh = 181M tokens/$), which is 400x cheaper than current pricing even if you exclude architecture/model improvements. The thing is that Nvidia is currently swallowing up a massive share of the revenue, which China could possibly solve by investing in R&D.
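For anyone who wants to check those numbers, here is roughly how they fall out. The electricity rate and the reference API price below are my own assumptions, chosen only to reproduce the quoted figures:

```ts
// Back-of-the-envelope check of the figures above.
// Assumptions (mine): electricity at ~$0.11/kWh, reference API price ~$2.20 per million tokens.
const tokensPerSecondPerMW = 5.5e6;                    // quoted throughput for Kimi K2 [1]
const tokensPerMWh = tokensPerSecondPerMW * 3600;      // ~1.98e10 tokens per MWh
const tokensPerKWh = tokensPerMWh / 1000;              // ~19.8M, i.e. the "20M/kWh" figure
const usdPerKWh = 0.11;                                // assumed electricity rate
const tokensPerDollar = tokensPerKWh / usdPerKWh;      // ~180M, i.e. the "181M tokens/$" figure
const apiUsdPerMillionTokens = 2.2;                    // assumed current API price
const electricityUsdPerMillionTokens = 1e6 / tokensPerDollar; // ~$0.0055 of power per million tokens
console.log((apiUsdPerMillionTokens / electricityUsdPerMillionTokens).toFixed(0)); // ~400x
```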
I can run Minimax-m2.1 on my M4 MacBook Pro at ~26 tokens/second. It's not Opus, but it can definitely do useful work when kept on a tight leash. If models improve at anything like the rate we have seen over the last 2 years, I would imagine something as good as Opus 4.5 will run on similarly specced new hardware by then.
I appreciate this, however, as a ChatGPT, Claude.ai, Claude Code, and Windsurf user... who has tried nearly every single variation of Claude, GPT, and Gemini in those harnesses, and has tested all of those models via API for LLM integrations into my own apps... I just want SOTA, 99% of the time, for myself and my users.
I have never seen a use case where a "lower" model was useful for me, and especially not for my users.
I am about to get almost the exact MacBook that you have, but I still don't want to inflict non-SOTA models on my code, or my users.
This is not a judgement against you or the downloadable weights; I just don't know when it would be appropriate to use those models.
BTW, I very much wish that I could run Opus 4.5 locally. The best that I can do for my users is the Azure agreement that they will not train on their data. I also have that setting set on my claude.ai sub, but I trust them far less.
Disclaimer: No model is even close to Opus 4.5 for agentic tasks. In my own apps, I process a lot of text/complex context and I use Azure GPT 4.1 for limited LLM tasks... but for my "chat with the data" UX, it's Opus 4.5 all day long. It has consistently tested as far superior.
The last I checked, it is exactly equivalent per token to direct OpenAI model inference.
The one thing I wish for is that Azure Opus 4.5 had JSON structured output. Last I checked, that was in "beta" and only allowed via the direct Anthropic API. However, after many thousands of Opus 4.5 Azure API calls with the correct system and user prompts, not even one API call has returned invalid JSON.
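To make the approach concrete, here is a minimal sketch of the prompt-for-JSON-then-validate pattern I mean. `callOpus` and the `Invoice` shape are hypothetical placeholders for whatever client and schema you actually use; the point is the validate-and-retry loop, not any specific SDK:

```ts
// Hypothetical example: strict system prompt + JSON.parse as the validator, with retries.
type Invoice = { customer: string; total: number };

async function callOpus(system: string, user: string): Promise<string> {
  // Wire up your own Azure/Anthropic client here and return the raw text response.
  throw new Error("not implemented in this sketch");
}

async function getInvoiceJson(userText: string, maxRetries = 2): Promise<Invoice> {
  const system =
    "You are a data extraction service. Respond with a single JSON object " +
    'of the form {"customer": string, "total": number} and nothing else.';
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const raw = await callOpus(system, userText);
    try {
      return JSON.parse(raw) as Invoice; // throws if the model returned invalid JSON
    } catch {
      // In my experience this branch never fires with the right prompts, but guard anyway.
    }
  }
  throw new Error("Model did not return valid JSON");
}
```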
In my comment history can be found a comment much like yours.
Then Opus 4.5 was released. I already had my CC claude.md and my Windsurf global rules + workspace rules set up. Also, my main money-making project is React/Vite/Refine.dev/antd/Supabase... known patterns.
My point is that given all that, I can now deploy amazing features that "just work" and have excellent UX in a single prompt. I still review all commits, but they are now 95% correct on the front end and ~75% correct on Postgres migrations.
Is it magic? Yes. What's worse is that I believe Dario. In a year or so, many people will just create their own Loom or Monday.com equivalent apps with a one-page request. Will it be production ready? No. Will it have all the features that everyone wants? No. But it will do what they want, which is 5% of most SaaS feature sets. That will kill at least 10% of basic SaaS.
If Sonnet 3.5 (~Nov 2024) to Opus 4.5 (Nov 2025) progress is a thing, then we are slightly fucked.
"May you live in interesting times" - turns out to be a curse. I had no idea. I really thought it was a blessing all this time.
Yeah, same here. Also, I can't recall a time since back when I used to make music that I was actually jealous of someone else's abilities, but here I am. :)
I keep seeing posts about how ~"the volume of AI scrapers is making hosting untenable."
There must be a ton of new full-web datasets out there, right?
What are the major hurdles that prevent the owners of these datasets from providing them to third parties via API? Is it the quality of SERP, or staleness? Otherwise, this seems like a potentially lucrative pivot/side hustle?
> There must be a ton of new full-web datasets out there, right?
Sadly, no. There's CommonCrawl (https://commoncrawl.org/), which is still, sadly, far removed from a "full-web dataset."
So everyone runs their own search instead, hammering the sites, going into gray areas (you either ignore robots.txt or your results suck), etc. It's a tragedy of the commons that keeps Google entrenched: https://senkorasic.com/articles/ai-scraper-tragedy-commons
> the volume of AI scrapers is making hosting untenable
Aside from that potential, it's also not true.
A Pentium Pro or PIII SSE with circa 1998-99 Apache happily delivers a billion hits a month (roughly 385 requests per second on average) without breaking a sweat, unless you think generating pages on every visit is better than generating them when they change.
I think it is true that it is a real problem (EDIT: but doesn't necessarily make "hosting untenable"), but you are correct to point out that modern pages tend to be horribly optimized (and that's the source of the problem). Even "dynamic" pages using React/Next.js etc. could be pre-rendered and/or cached and/or distributed via CDNs. A simple cache or a CDN should be enough to handle pretty much any scraping traffic unless you need to do some crazy logic on every page visit – which should almost never be the case on public-facing sites. As an example, my personal site is technically written in React, but it's fully pre-rendered and doesn't even serve JS – it can handle huge amounts of bot/scraping traffic via its CDN.
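For the curious, the build step for a site like that can be as small as this sketch (assuming `react` and `react-dom` are installed; `HomePage` is a stand-in for your real component tree, not my actual site):

```ts
// Minimal sketch of "React, but fully pre-rendered": run once at build time,
// then let a CDN serve the resulting HTML with no JS at all.
import { mkdirSync, writeFileSync } from "node:fs";
import React from "react";
import { renderToStaticMarkup } from "react-dom/server";

function HomePage(): React.ReactElement {
  return React.createElement(
    "main",
    null,
    React.createElement("h1", null, "My site"),
    React.createElement("p", null, "Rendered once at build time, cached at the edge.")
  );
}

const html = "<!doctype html>" + renderToStaticMarkup(React.createElement(HomePage));
mkdirSync("dist", { recursive: true });
writeFileSync("dist/index.html", html); // upload dist/ to your CDN of choice
```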
OK, I agree with both of you. I am an old who is aware of NGINX and C10k. However, my question is: what are the economic or technical difficulties that prevent one of these new web-scale crawlers from releasing og-pagerank-api.com? We all love to complain about modern Google SERP, but what actually prevents that original Google experience from happening, in 2026? Is it not possible?
Or, is that what orgs like Perplexity are doing, but with an LLM API? Meaning that they have their own indexes, but the original q= SERP API concept is a dead end in the market?
Tone: I am asking genuine questions here, not trying to be snarky.
What prevents it is that the web in 2026 is very different than it was when OG pagerank became popular (because it was good). Back then, many pages linked to many other pages. Now a significant amount of content (newer content, which is often what people want) is either only in video form, or in a walled garden with no links, neither in nor out of the walls. Or locked up in an app, not out on the general/indexable/linkable web. (Yes, of course, a lot of the original web is still there. But it's now a minority at best.)
Also, of course, the amount of spam-for-SEO (pre-slop slop?) as a proportion of what's out there has also grown over time.
IOW: Google has "gotten worse" because the web has gotten worse. Garbage in, garbage out.
Thanks for the reply. I mentioned tech, but forgot about time. Yeah, that makes solid sense.
> Or locked up in an app...
I believe you may have at least partially meant Discord, for which I personally have significant hate. Not really for the owners/devs, but why in the heck would any product owner want to hide the knowledge of how to use their app on a closed platform? No search engine can find it, no LLM can learn from it(?). Lost knowledge. I hate it so much. Yes, user engagement, but knowledge vs. engagement is the battle of our era, and knowledge keeps losing.
r/anything is so much better than a Discord server, especially in the age of "Software 3.0"
This might be the most blunt and significant speech of our time. I am stunned by the intelligent honesty. Also, kudos for the difficult follow-up questions.
Related is the PM of Canada’s speech at Davos today. I don’t think that I have heard such a blunt assessment of the past and future from a politician, ever.
I have been thinking about this speech and your comment for a couple days now.
He should be pissed. Not as a Canadian, but as someone who understands the benefits of a global system of rules that has made everyone who cooperates rich, which has included himself.
What we have witnessed under 47 is that a small group of the world's rich ideologues were so maniacal and myopic that they screwed it up for everyone.
As a relative "poor", this pisses me off as well, as peace and prosperity will likely, at least temporarily, devolve for us all as Pax/Oeconomia Americana unwinds. Utter insanity.
Maybe recheck how exposed your country is. It's really bad for Canada, since ~77% of all Canadian goods exports are headed to the US. It will take a long time for them to diversify. Same with Mexico and Taiwan, and to a lesser extent Vietnam and Ireland.
Honestly as an American I'm just happy to see some open talk and movement by world leaders. This breakdown has been clear for at least 10 years, maybe 20 if you're smart, but nobody did anything. I hope Europe can learn to work with China here. I don't think the whole system is dead, they just need to find other dance partners.
I've been thinking about these broad critiques of Capitalism, and while I sometimes find myself nodding in at least partial agreement, I worry that it's far too blunt a critique.
If you look at Soviet or Chinese Communism, they also stifled innovation, and they also destroyed entire ecosystems. They also had extreme concentrations of power, which allowed psychopathic leaders to commit atrocities.
If we want to come up with real long-term solutions, maybe we need to be honest about underlying human traits, and address those via systematic controls. Otherwise, it feels like we are going to keep bouncing from extreme to extreme. That tendency towards extremes seems like another easily exploited human trait that needs to be identified and addressed.
I guess my point here is that maybe it's not entirely specific systems at fault here, as much as it is universal human traits and group dynamics.
Disclaimer: I thought we had already found the beginnings of an answer, and it was Social Democracy with a regulated market economy. However, this system appears not to be extreme enough for many people to get excited about it.