w4's comments | Hacker News

Because of benchmarking, LLMs have also been pushed towards fluency in Python and in related frameworks like Django and Flask. For example, SWE-Bench Verified is nearly 50% Django framework PR tasks: https://epoch.ai/blog/what-skills-does-swe-bench-verified-ev...
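If you want to sanity-check that composition yourself, here is a minimal sketch using the Hugging Face datasets library. The dataset id (princeton-nlp/SWE-bench_Verified) and the "repo" field name are assumptions on my part about how it's published, so verify them first:

  # Rough check of SWE-bench Verified's repository composition.
  # Assumed (verify before relying on this): the dataset id and "repo" field.
  from collections import Counter

  from datasets import load_dataset

  ds = load_dataset("princeton-nlp/SWE-bench_Verified", split="test")
  repos = Counter(row["repo"] for row in ds)

  total = sum(repos.values())
  for repo, count in repos.most_common(10):
      print(f"{repo}: {count}/{total} tasks ({count / total:.0%})")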

It will be interesting to see how durable these biases are as labs work towards more capable small models that are less reliant on memorized information. My naive instinct is that the biases will become less salient over time, as context windows improve and models become increasingly capable of processing documentation as part of their code-writing loop. But absent instruction to the contrary, I suspect the models will favor these tools as a default for quite some time.


Isn't this more or less what every procedural programming language is? It's especially obvious with examples like Apple's Objective-C APIs ([object doSomethingAndReturnATypeWith:anotherObject]), Cobol (a IS GREATER THAN b), or SQL (SELECT field FROM table WHERE condition), but even assembly is a mnemonic English-ish abstraction for binary.

I'm intrigued by the idea, but my major concern would be that moving up to a new level of abstraction would even further obscure the program's logic and would make debugging especially difficult. There's no avoiding the fact that the code will need to be translated to procedural logic for the CPU to execute at some point. But that is not necessarily fatal to the project, and I am sure that assembly programmers felt the same way about Fortran and C, and Fortran and C programmers felt the same way about Java and Python, and so on.


It is readily understandable if you are fluent in the jargon surrounding state of the art LLMs and deep learning. It’s completely inscrutable if you aren’t. The article is also very high level and disconnected from specifics. You can skip to FAIR’s paper and code (linked at the article’s end) for specifics: https://github.com/facebookresearch/vjepa2

If I had to guess, it seems likely that there will be a serious cultural disconnect as 20-something deep learning researchers increasingly move into robotics, not unlike the cultural disconnect that happened in natural language processing in the 2010s and early 2020s. Probably lots of interesting developments, and also lots of youngsters excitedly reinventing things that were solved decades ago.


> if you are fluent in the jargon surrounding state of the art LLMs and deep learning
It is definitely not following that jargon. Maybe it follows tech-influencer blog post jargon, but I can definitively say it doesn't follow the jargon used in research, even though they are summarizing a research paper. Consequently they misinterpret things and use weird phrases like "actionable physics," which is redundant: a physics model is necessarily actionable, since it is required to be a counterfactual model. While I can understand rephrasing things for a more general audience, that's a completely different thing from "being fluent in SOTA work." It's literally the opposite...

Also, it definitely doesn't help that they remove all capitalization except in nouns.


I’ve really been enjoying the combination of CodeCompanion with Gemini 2.5 for chat, Copilot for completion, and Claude Code/OpenAI Codex for agentic workflows.

I had always wanted to get comfortable with Vim, but it never seemed worth the time commitment, especially with how much I’ve been using AI tools since 2021 when Copilot went into beta. But recently I became so frustrated by Cursor’s bugs and tab completion performance regressions that I disabled completions, and started checking out alternatives.

This particular combination of plugins has done a nice job of mostly replicating the Cursor functionality I used routinely. Some areas are more pleasant to use, some are a bit worse, but it’s nice overall. And I mostly get to use my own API keys and control the prompts and when things change.

I still need to try out Zed’s new features, but I’ve been enjoying daily driving this setup a lot.


This idea is reminiscent of the opening scene of Accelerando by Charlie Stross:

"Are you saying you taught yourself the language just so you could talk to me?"

"Da, was easy: Spawn billion-node neural network, and download Teletubbies and Sesame Street at maximum speed. Pardon excuse entropy overlay of bad grammar: Am afraid of digital fingerprints steganographically masked into my-our tutorials."

"Uh, I'm not sure I got that. Let me get this straight, you claim to be some kind of AI, working for KGB dot RU, and you're afraid of a copyright infringement lawsuit over your translator semiotics?"

"Am have been badly burned by viral end-user license agreements. Have no desire to experiment with patent shell companies held by Chechen infoterrorists. You are human, you must not worry cereal company repossess your small intestine because digest unlicensed food with it, right?”

- https://www.antipope.org/charlie/blog-static/fiction/acceler...

Also amusing to note that this excerpt, written in 2005, predicted the current LLM training methodology quite well.


More inspired by the GPL, I think, although the sketch above doesn't force the writer to put things into the public domain.

I'm imagining a separate declaration of: "Content I can sublicense from ShittyNewsLLM--which is everything made by their model--is now public-domain through me until further notice", without any need to identify specific items or rehost it myself.

I suppose the counterstrike would be for them to try to transform their own work and argue what they finally released contains some human spark that wasn't covered by the ToS, in which case there may need to be some "and any derivative work" kinda clause.

I wonder if some organization (similar to the Open Software Foundation) could get some lawyers and web-designers together to craft legally-sound site-design rules and terms-of-service, which anyone could use to protect their own blogs or web-forums.


Also amusing:

> patent shell companies held by Chechen infoterrorists

This perfectly captures what both patent trolls and the MAFIAA look like in my mind.


The cost to run the highest-performance o3 model is estimated to be somewhere between $2,000 and $3,400 per task.[1] Based on these estimates, o3 costs about 100x what it would cost to have a human perform the exact same task. Many people are therefore dismissing the near-term impact of these models because of these extremely high costs.

I think this is a mistake.

Even if very high costs make o3 uneconomic for businesses, it could be an epoch-defining development for nation states, assuming it is true that o3 can reason like an averagely intelligent person.

Consider the following questions that a state actor might ask itself: What is the cost to raise and educate an average person? Correspondingly, what is the cost to build and run a datacenter with a nuclear power plant attached to it? And finally, how many person-equivalent AIs could be run in parallel per datacenter?
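To make those questions concrete, here is a back-of-envelope sketch. Every figure in it is an illustrative placeholder rather than a sourced estimate, so swap in your own numbers:

  # Back-of-envelope comparison of human vs. o3-style "person-equivalents".
  # Every number below is an assumed placeholder, not sourced data.
  cost_per_task_usd = 3_000        # upper end of the per-task estimates above
  tasks_per_person_year = 2_000    # assumed: roughly one task per working hour
  ai_cost_per_person_year = cost_per_task_usd * tasks_per_person_year

  human_cost_per_year = 60_000     # assumed fully-loaded human cost per year
  print(f"AI person-year:    ${ai_cost_per_person_year:,}")   # $6,000,000
  print(f"Human person-year: ${human_cost_per_year:,}")       # $60,000
  print(f"Ratio: {ai_cost_per_person_year / human_cost_per_year:.0f}x")  # ~100x

The ratio line just reproduces the ~100x figure above; the point of writing it out is that the comparison turns on a handful of assumptions a state actor could change.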

There are many state actors, corporations, and even individual people who can afford to ask these questions. There are also many things that they'd like to do but can't because there just aren't enough people available to do them. o3 might change that despite its high cost.

So if it is true that we've now got something like human-equivalent intelligence on demand - and that's a really big if - then we may see its impacts much sooner than we would otherwise intuit, especially in areas where economics takes a back seat to other priorities like national security and state competitiveness.

[1] https://news.ycombinator.com/item?id=42473876


Your economic analysis is deeply flawed. If there were anything that valuable and that labor-intensive, it would already have driven up the cost of labor accordingly. The one property that could conceivably justify a substantially higher cost is secrecy. After all, you can't (legally) kill a human after your project ends to ensure total secrecy. But that takes us into thriller novel territory.


I don't think that's right. Free societies don't tolerate total mobilization by their governments outside of wartime, no matter how valuable the outcomes might be in the long term, in part because of the very economic impacts you describe. Human-level AI - even if it's very expensive - puts something that looks a lot like total mobilization within reach without the societal pushback. This is especially true when it comes to tasks that society as a whole may not sufficiently value, but that a state actor might value very much, and when paired with something like a co-located reactor and data center that does not impact the grid.

That said, this is all predicated on o3 or similar actually having achieved human level reasoning. That's yet to be fully proven. We'll see!


This is interesting to consider, but I think the flaw here is that you'd need a "total mobilization" level workforce to build this mega datacenter in the first place. If you put one human-hour into making B200s and cooling systems and power plants, you get less than one human-hour-equivalent of thinking back out.


No you don’t. The US government has already completed projects at this scale without total economic mobilization: https://en.wikipedia.org/wiki/Utah_Data_Center Presumably peer and near-peer states are similarly capable.

A private company, xAI, was able to build a datacenter on a similar scale in less than 6 months, with integrated power supply via large batteries: https://www.tomshardware.com/desktops/servers/first-in-depth...

Datacenter construction is a one-time cost. The intelligence the datacenter (might) provide is ongoing. It's not an equal one-to-one trade, and it's well within reach for many state and non-state actors if desired.
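As a rough sketch of why the one-time cost matters less than it looks (placeholder numbers again, purely illustrative):

  # Amortizing an assumed build cost over the intelligence it provides.
  # All numbers are illustrative placeholders, not estimates.
  capex_usd = 20e9                # assumed datacenter + power plant build cost
  useful_life_years = 10          # assumed facility/hardware lifetime
  person_equivalents = 10_000     # assumed concurrent person-equivalent AIs

  capex_per_person_year = capex_usd / (useful_life_years * person_equivalents)
  print(f"Amortized build cost per person-year: ${capex_per_person_year:,.0f}")
  # ~$200,000 per person-year: significant, but a fraction of the inference
  # cost sketched upthread, and it buys capacity that can be redirected at will.

Under assumptions like these, the construction cost amortizes down to a minor line item next to the ongoing inference spend.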

It’s potentially going to be a very interesting decade.


I disagree, because the job market is not a true free market. I mean, it mostly is, but there's a LOT of politics and shady stuff that employers do to purposely drive wages down. Even in the tech sector.

Your secrecy comment is really intriguing actually. And morbid lol.


How many 99.9th percentile mathematicians do nation states normally have access to?


This is a gorgeously and uniquely designed product. Very cool.


Thank you so much!


Open source means just that: that the source is open. The OSI and co. re-defining the term to suit their ideological preferences doesn’t really change that. SQLite is open source, even if it’s not Open Source.

Edit: FSF should have been OSI, I think. Fixed.


> The OSI and co. re-defining the term

I don't know where you got this idea, but it's not true. The OSI is simply defending the definition as it has been generally understood since Stallman and others began using it in the 1980s.

The only group of people "re-defining" what open source software means -- quite successfully, I suppose, and you are an example of that -- are those with a profit motive to use the term to gain traction during an initial phase where a proprietary model would not have benefited them.

I don't think I need to provide concrete examples of companies that begin with an open source licensing model, only to rug-pull their users as soon as they feel it might benefit them financially; these re-licensing discussions show up on HN quite often.


In the 1980s we had Shareware, Beerware, Postcardware, whateverWare, Public Domain, "send me a coffee", "I don't care" open source, and magazine and book listings under their own copyright licenses (free to type in, not to redistribute).

Most of us on 8- and 16-bit home computers didn't even know who "Stallman and others" were.

Additionally, GCC only took off after Sun became the first UNIX vendor to split UNIX into two SKUs, making the development tools their own separate product. Others quickly followed suit.

Also, regarding Ada adoption hurdles: when they made an Ada compiler, it was its own SKU, not included in the base UNIX SDK package.


I don't really understand what your point is, but shareware has never been "open source".

Nobody's arguing that public domain code, or MIT-licensed code, or whatever, is not open source; it's obviously open source because it's _more_ free than the GPL.

Sure, devs can call any "source available" project "open source" because it gets people interested, even though they have zero interest in using an open source development model or allowing others to make changes to the code. Those devs can also expect well-deserved flak from people who understand that "open source" is not marketing speak.


I don't understand why OSI didn't pick an actually trademarkable term and license its use to projects that meet its ideals of open-sourceness. OSI knows it has no right to redefine common language and police its usage, any more than a grammar pedant has the right to levy fines against those of us who split infinitives.

(To be fair to OSI, I've never seen any of their representatives do this. But the internet vigilante squad they've spawned feels quite empowered to let us know we've broken the rules.)


The 2024 presidential election was won by the candidate who spent about a third less than their opponent, and over the past decade we've seen many successful campaigns funded by small donations beat corporate-backed candidates. Funding isn't everything, and it's a cop-out to co-sign vigilantism on such a glib basis: https://www.opensecrets.org/2024-presidential-race

Rule of law is a precious thing, even if it’s imperfect, as all human systems will inevitably be. We shouldn’t be cavalier about discarding it. The alternatives are much worse.


Nobody is claiming that outspending your rivals is some kind of automatic electoral win, but nice job, uh, over there. (gestures at the remains of the strawman you tore apart)

Back on topic...

The relevant campaign spending figures here would be "minimum amounts of funding to run a viable campaign" and not "how much did the winner spend."

What I, unlike our recently deceased strawman, was pointing out is that without backing from corporations and the people who own them, your chances of winning elections to higher offices (Congress, presidency) are close to zero. And the odds of enough similarly-untainted, like-minded people getting elected concurrently (or at least running campaigns popular enough to pressure those in office) are even more astronomically low.

But....

Campaign funding issues only scratch the surface of what it would take to achieve any kind of real reform re: freeing the government from corporate influence.

Because even if you, say, pull off some kind of underdog miracle and get elected to Congress without completely selling out... you're not going to accomplish shit without political capital, somehow bucking the other 534 members of Congress who have sold out.


> was won by the candidate who spent about 1/3rd less than their opponent

I guess the truth of that statement depends on whether you consider the $36 billion that Musk has lost on Twitter to have been "spent" or not.


Is that the case if you include the fact that Fox News is effectively a wing of the Republican party? It seems to me that Republicans don't need to spend more since they have an entire outside propaganda apparatus.

It's also not just funding. It's funding, gerrymandering, the electoral college, lobbying, the Supreme Court, voter suppression, etc.


Those people just use Terminal.app or Windows Terminal. They're not installing alternative terminal emulators.

