If you're curious to play around with it, you can use Clancy [1] which intercepts the network traffic of AI agents. Quite useful for figuring out what's actually being sent to Anthropic.
If only there were some sort of artificial intelligence that could be asked about asking it to look at the minified source code of some application.
Sometimes prompt engineering is too ridiculous a term for me to believe there's anything to it, other times it does seem there is something to knowing how to ask the AI juuuust the right questions.
Something I try to explain to people I'm getting up to speed on talking to an LLM is that specific word choices matter. Mostly it matters that you use the right jargon to orient the model. Sure, it's good at getting the semantics of what you said, but if you adjust and use the correct jargon the model gets closer faster. I also explain that they can learn the right jargon from the LLM and that sometimes it's better to start over once you've adjusted your vocabulary.
GenAI was built on an original sin of mass copyright infringement that Aaron Swartz could only have dreamed of. Those who live in glass houses shouldn't throw stones, and Anthropic may very well get screwed HARD in a lawsuit against them from someone they banned.
Unironically, the ToS of most of these AI companies should be, and hopefully is, legally unenforceable.
The 'experiment' isn't the issue. The problem is the entire culture around it. LLM tools are being shoved into everything, LLMs are soaking up trillions in investment, engineers are being told over and over that everything has changed and this garbage is making us obsolete, software quality is decreasing where wide LLM usage is being mandated (e.g. Microsoft). Gas Town does not give the vibe of a neutral experiment but rather looks to be a full-on delve into AI psychosis with the way Yegge describes it.
To be clear, I think LLMs are useful technology. But the degree of increasing insanity surrounding it is putting people off for obvious reasons.
I share the frustration with the hype machine. I just don't think a guy with a blog is an appropriate target for our frustration with corporate hype culture.
The experiment is fine if you treat it as an experiment. The problem is the state of the industry where it's treated as serious rather than silly — possibly even by Steve himself.
> Ok but this entire idea is very new. It's not an honest criticism to say no one has tried the new idea when they are actively doing it.
Not really new. Back in the day companies used to outsource their stuff to the lowest-bidder agencies in proverbial Elbonia, never looked at the code, and then panickedly hired another agency when the things visibly were not what was ordered. Case studies have abounded on TheDailyWTF for the last two decades.
Doing the same with agents will give you the same disastrous results for comparably the same money, just faster. Oh and you can't sue them, really.
Fair point on the Elbonia comparison. But we can't sue the SQLite maintainers either, and yet we trust them with basically everything. The reason is that open source developed its own trust mechanisms over decades. We don't have anything close to that with LLMs today. What those mechanisms might look like is an open question that is getting more important as AI generated code becomes more common.
> But we can't sue the SQLite maintainers either, and yet we trust them with basically everything.
But you don’t pay them any money and don’t enter into contractual relationship with them either. Thus you can’t sue them. Well, you can try, of course, but.
You could sue an Elbonian company, though, for contract breach. LLMs are like usual Elbonian quality with two middlemen but quicker, and you only have yourself to blame when they inevitably produce a disaster.
> saying that Yegge hasn't built real software is just not true
I mean... I feel like it's somewhat telling that his Wikipedia page spends half its words on his abrasive communication style, and the only thing approximating a product mentioned is a (lost) Rails-on-Javascript port, and 25 years spent developing a MUD on the side.
Certainly one doesn't get to stay a staff-level engineer at Google without writing code - but in terms of real, shipping software, Yegge's resume is a bit light for his tenure in BigTech.
Interesting direction but the 98.8% FPR in Table 1 seems like a dealbreaker. Anyone understand what's going on with the contradictory results between the text and tables?
> Empirically, CTVP attains very good detection rates with reliable false positives
A novel use of the word "reliable"? Jokes aside, either they mean the FPR as the opposite of what you'd expect, the table is not representative of their approach, or they're just... really optimistic?
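For anyone unsure why that number reads as a dealbreaker, here is a minimal sketch of the standard false positive rate definition, FP / (FP + TN). The counts below are hypothetical, picked only so the arithmetic lands on 98.8%; they are not taken from the paper's Table 1.

```python
def false_positive_rate(false_positives: int, true_negatives: int) -> float:
    """Standard FPR: the share of benign samples that get flagged anyway."""
    benign_total = false_positives + true_negatives
    if benign_total == 0:
        raise ValueError("need at least one benign sample")
    return false_positives / benign_total


# Hypothetical counts (NOT from the paper): 1000 benign samples, 988 flagged.
fpr = false_positive_rate(false_positives=988, true_negatives=12)
print(f"FPR = {fpr:.1%}")
```

Under that definition, a detector with this FPR flags nearly every benign input, which is hard to square with "reliable" unless the paper defines the metric differently.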
The article's Karim Khan example pretty deeply undercuts the thesis. Losing access to your bank account is the actual coercive power. Losing a Microsoft email is an inconvenience in comparison.
If your business has everything on GCP/AWS/Azure (which is very common) and the Americans choose to weaponise US tech against your country or business, then unless you have non-US backups you are probably dead and all of your employees unemployed. If you are a state, all of your services and functions are probably dead and you have to rebuild from nothing. That is certainly true of my company, and there are some mutterings starting internally where I am about worst-case disaster recovery if one of these suppliers suddenly just disappeared.
In this new world you cannot trust that this will not happen. As a European, relying on the Americans is honestly probably little better than relying on the Russians and probably on par with relying on the Chinese in terms of risk profile. Note we are actually for all intents and purposes at war with Russia.
The amount of leverage the Americans have over Europe is insane, and every capital should be trying to mitigate that risk asap.
Microsoft executives under oath said that they will not be able to honor those contracts if there is pressure from the US administration.
We should know this, but we keep forgetting: laws, contracts, courts etc always bow before political and military might.
In peacetime, we delude ourselves into thinking it ain't so.
The situation is now clear as day. What op stated is 100 percent correct.
The US will have successfully invaded an EU country by 2027.
They will, if it comes to this, immediately and successfully weaponize all three hyperscalers.
It is abundantly clear where things are going.
If any country, organization or company is not prepared for this by mid 2026, they are blind and deaf.
You are correct. In Europe, governments and businesses should treat the US as a hostile foreign country, and relying on anything from there is a massive risk.
The only thing is that weaponizing the hyperscalers would also be disastrous for the hyperscalers. They would be liable to lose their assets in Europe, access to European markets, etc and so on. Which would as a consequence cause a tangible harm to the US economy itself.
Not that in Europe we should rely on it for anything. Any business is wise to move away from any sort of dependency that is subject to US pressure. Governments in particular should consider it a matter of life and death.
We (Polish) have been raising an alarm about Russia since the first Chechen war, and it took an additional dozen+ years and a land invasion of a European country before countries in Western EU woke up.
Do you think they are going to be quicker reacting to danger from the other side?
I highly doubt it. The EU is like a huge steam ship. It takes a lot of effort to turn it. But once it gets going, good luck stopping it. This will have consequences for EU-US relations for the rest of this century.
In fact it is exactly what a Russian agent would do if he managed to become president of the US. Putin's wet dream, basically. Be hostile enough towards Russia to preserve appearances - seize a tanker or two - while undermining long-term US and EU interests (the interests of these two are naturally aligned very well; it takes much more than an idiot to drive a wedge between them).
The thing is that the EU is a complex structure. The interests of countries such as Poland, Italy, Germany and Ireland differ wildly, which is why things are so slow to maneuver, politically speaking.
I always considered the over-reliance on the US a weakness. It was comfortable because it postponed some difficult discussions (for example, in terms of defense and military spending it is completely bonkers for the EU to not act as a federal entity). Since this subject is thorny, it was alright to rely on the US for defense and just kick this can down the road.
The US becoming hostile at least forces the countries in the EU to face reality a little, and perhaps speed some things up (see for example the recent EU-Mercosur trade agreement).
The other factor is that both Russia and the US have people 'on the inside' in the EU governments. They bought them. They own them and they do what they are told.
> If your business has everything on GCP/AWS/Azure (which is very common) and the Americans choose to weaponise US tech against your country or business,
These companies have datacenters in Europe too. It is not wild to think that if push comes to shove and the US cuts off Europe, then Europeans can just take control over those European data centers and restore access to GCP/AWS/Azure in Europe, because these datacenters are on their soil and predominantly employ Europeans.
> It is not wild to think that if push comes to shove and US cut off Europe, then Europeans can just take control over those European data centers and restore access to GCP/AWS/Azure in Europe because these datacenters are on their soil and predominantly employing Europeans.
Good luck with that. Those systems are extremely interconnected. We should be (and are) building sovereign EU equivalents to not just cloud providers but also major services like Google/MS 365 and so on.
The EU needs to start with its own PC hardware factories first, and PC-compatible designs. Which is unlikely: at the first sight of trouble they will buy everything from the US, as all good 3rd World countries do.
There are plenty of banks owned and operated within the EU. One bank folded under US pressure, but when push comes to shove the EU can force banks in the EU to uphold EU rules and regulations.
That's not the case for digital infrastructure like Google Workspace, Google cloud, Office 365, AWS, etc.
> when push comes to shove the EU can force banks in the EU to uphold EU rules and regulations.
This made me realize that many people who are extremely critical of the power the EU has, have no idea how much that power is often protecting them.
This is not a dismissal of the fact that it's absolutely critical to stay vigilant about how that power is used. But it's quite clear that without that power, the US would've abused theirs way more within Europe.
When the US sanctioned Hong Kong’s Chief Executive in 2020, because of a law allowing extradition to China, no single bank was letting her open an account, including Chinese ones. She was receiving her salary fully in cash.
The EU compelling banks to do business despite US sanctions seems pretty unlikely even if relations continue to degrade.
>Losing a Microsoft email is an inconvenience in comparison.
Losing access to data is potentially worse than losing access to your bank account. I doubt Microsoft will let you grab a copy of all your emails after they block/ban you.
Brand name pharmaceuticals are sort of a different thing. Brand names must comply with the naming guidelines of the FDA, the European Medicines Agency, and Health Canada simultaneously. In practice, this makes it tricky to use actual words. So many companies adopt an 'empty vessel' naming approach. The empty vessels are nonsense words that (1) evoke an emotion (Wegovy is a good example), (2) can be trademarked, and (3) can survive brand pressure.
I find it very unlikely that it would be trained on that information or that Anthropic would put that in its context window, so it's very likely that it just made that answer up.
No, it did not make it up. I was curious, so I asked it to imitate a posh British accent imitating a South Brooklyn accent while having a head cold, and it explained that it didn't have fine-grained control over the audio output because it was using a TTS. I asked it how it knew that and it pointed me towards [1] and highlighted the following.
> As of May 29th, 2025, we have added ElevenLabs, which supports text to speech functionality in Claude for Work mobile apps.
Tracked down the original source [2] and looked for additional updates but couldn't find anything.
If the ads are just brought in as a stream of text from the same endpoint that's streaming you the response you're wanting, how can that be blocked in the browser anyway?
Another local LLM extension that reads the output and determines if part of it is too "ad-ey" so it can hide that part?
It will depend on how they implement the sponsored content. If there are regulations that require marking it as sponsored, that makes it easy to block. If not, then sure maybe via LLMs.
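To make the "easy to block if it's marked" case concrete, here is a rough sketch of a client-side filter over a streamed response. It assumes, purely hypothetically, that sponsored spans are wrapped in `[sponsored]...[/sponsored]` markers; no provider is known to use this format, and `strip_sponsored` is a made-up helper name. The only subtle part is that a marker can be split across two streamed chunks, so a short tail is held back until it is clear it isn't the start of one.

```python
from typing import Iterable, Iterator

# Hypothetical disclosure markers; an assumption for illustration only.
AD_START = "[sponsored]"
AD_END = "[/sponsored]"


def strip_sponsored(chunks: Iterable[str]) -> Iterator[str]:
    """Yield streamed text with [sponsored]...[/sponsored] spans removed."""
    buf = ""
    in_ad = False
    for chunk in chunks:
        buf += chunk
        while True:
            # Look for whichever marker ends the current state.
            marker = AD_END if in_ad else AD_START
            idx = buf.find(marker)
            if idx == -1:
                break
            if not in_ad and idx > 0:
                yield buf[:idx]  # normal text before the ad starts
            buf = buf[idx + len(marker):]
            in_ad = not in_ad
        # No full marker left: hold back a tail that could be a partial marker.
        keep = len(marker) - 1
        if len(buf) > keep:
            if not in_ad:
                yield buf[:-keep]
            buf = buf[-keep:]
    # Flush whatever is left, unless the stream ended inside an ad span.
    if not in_ad and buf:
        yield buf


stream = ["Here is the answer. [spon", "sored]Buy Widgets![/spons", "ored]Hope that helps."]
print("".join(strip_sponsored(stream)))
```

Without a mandated marker there is nothing for a filter like this to key on, which is where the "another LLM reads the output" idea comes in.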
These numbers aren't that crazy when contextualized with the capex spend. One hundred million is nothing compared to a six hundred billion dollar data center buildout.
Besides, people are actively being trained up. Some labs are just extending offers to people who score very highly on their conscription IQ tests.