Interestingly, it doesn't always condition the final output. When playing with DeepSeek, for example, it's common to see the CoT arrive at a correct answer that the final answer doesn't reflect, and even vice versa, where a chain of faulty reasoning somehow yields the right final answer.
It almost seems that the purpose of the CoT tokens in a transformer network is to act as a computational substrate of sorts. The exact choice of tokens may not be as important as it looks, but it's important that they are present.
That phenomenon and others are what made it obvious that CoT is not its "thinking". I think CoT is a process by which the LLM expands its processing boundary, in that it allows it to sample over a larger space of possibilities. So it kind of acts like a "trigger" of sorts that allows the model to explore in more ways than it could without CoT. The first time I saw this was when I witnessed the "wait" phenomenon: simply inducing the model to say "wait" in its response improved the accuracy of its results, as the model now double-checked its "work". Funny enough, it also sometimes led it to produce a wrong answer where it otherwise should have stuck to its guns, but overall that little "wait" had a net positive effect. That's when I knew CoT was not the same as human thinking: we don't care about trigger words or anything like that, and our thinking requires zero language (though it does benefit from language); it's a deeper process. That's why I was interested in latent processing models and my foray into that area.
But why would you want to violate the docs on something as fundamental as malloc? Why risk relying on implementation-specific quirks in the first place?
I'm curious too. I find it an occasionally useful feature, but how often I use it goes down as my ability to construct better find-replace/apply-action regex goes up.
> charging for it should help make sure it continues to work
Is there a particular reason you're extending the benefit of the doubt here? This seems like the classic playbook of making something free, waiting for people to depend on it, then charging for it, all in order to maximize revenue. Where does the idea that they're really doing this in order to deliver a more valuable service come from?
Yeah. This is a reaction to providers like Blacksmith, or self-hosted solutions like the k8s operator, being better at operating their very bad runner than they are, at cheaper prices, with better performance, more storage, and warm caches. The price cut is good; the anticompetitive bit where they charge you to use computers they don't provide isn't. My guess is that either we're all going to move to act or one of the SaaS startups will sue.
I appreciate being able to pay for a service I rely on. Using self-hosted runners, I previously paid nothing for Github Actions — now I do pay something for it. The price is extremely cheap and seems reasonable considering the benefits I receive. They've shown continued interest in investing in the product, and have a variety of things on their public roadmap that I'm looking forward to (including parallel steps) — https://github.com/orgs/github/projects/4247?pane=issue&item....
Charging "more than nothing" is certainly not what I would call maximizing revenue, and even if they were maximizing revenue, I would still make the same decision to purchase or abandon based on its value to me. Have you interacted with the economy before?
It's going to be fun watching HN, which is full of people who support this sort of thing (and even more extreme regulations to boot), deal with the ramifications of this forum's guidelines and moderation policies being de facto illegal.
It won't even be "turning into Reddit"; it's all going to turn into 4chan.
Consider that every time you start a session with Claude Code, it's effectively a new engineer. The system doesn't learn like a real person does, so for it to improve over time you need to manually record the insights that, for a normal human, would be integrated by the natural learning process.
Yes, that's exactly the problem. There are good reasons why any particular team doesn't onboard new engineers each day, going all the way back to Fred Brooks and "adding more people to a late project makes it later".
I found this article pretty interesting to think about. The ideas it discusses are adjacent to a lot of what I sometimes struggle with in communication.
Meta note: my description was accidentally a great example of what I mean
> Adjacent to (not exactly the same as, but the overlap could be nearly complete)
> a lot of (not necessarily all of, but also not explicitly excluding all of)
> what I sometimes (not necessarily always, but also not explicitly excluding always)
Considering this more, I think my purpose in this intentional ambiguity is slightly different than the purpose of "not revealing one's true position" as described in the article. Rather, the problem I'm trying to pre-empt is responses that latch onto parts of what I say that aren't perfectly precisely true, but also aren't the point of what I'm trying to communicate.
It's frustrating when I'm trying to communicate a very specific idea or message and the discourse that follows ends up not engaging with that idea, so I've come to make the specific idea clear, and keep any contextual information more ambiguous to encourage focusing on the more well-defined thing.
Unless, as a local looking for new spots to try, your first step is going to Google Maps and searching "restaurants". I'm certainly guilty of this sometimes.
And then lower down we have TensorFlow 0.6.0 release.
I was considering using this feature the other day to try to get a sense of what AI discourse was like circa 2019. It all blends together after a while. I ended up doing a Twitter search for "GPT-2" ending 2019-12-31, but that's a little more specific than I want.
The HN past feature is an excellent way of seeing snapshots of history, and I wish more sites had things like this. I guess I should Archive.org a little more money.
Nice. That was a fun rabbit hole. This is the earliest I could find. Interestingly, it contains a link to HN itself. I assume this migrated from a different version of a message board?
> YouTube: identifying copyrighted material can't be an automated process. Startup disagrees.
Also kind of interesting how little HN commenting styles have changed. Aside from the subject matter, it's barely noticeable that the comments are from 2007. I don't think the same would be true of many other places around the web.
Today's front page is not a clean 10-year extrapolation from this. That's where the AI is wrong. The future is weird and zigzags; it's not as linear as the Gemini-generated page.
Honest question - do you think that everyone else thinks this is even REMOTELY what the front page will look like in 10 years?
I comment because I really cannot figure out why you left your comment. Do you think the rest of the commenters think this has predicted the future? It might be one thing to point out specific trends you think will not play out, or unexpected trends you think may show up that are currently left out. But to just remark that the future will contain things we cannot currently predict seems so inherently obvious as to go without saying, so I just have to assume that wasn't the point of your post, and I've missed it entirely.
Sorry, I'm really not trying to be mean or anything - i'm just really confused.
Your confusion seems to stem from the assumption that making a statement is an implicit assertion that most people believe the opposite of that statement.
In reality, statements are often made rather for the purpose of emphasis or rhetoric.
To answer your question: I think that GP mostly wanted to share the insight that the future zigzags, which is kind of non-obvious and a fun thing to think about. People often like leaving comments about interesting thoughts or ideas, even if they are only tangentially related.
This is a problem with nearly all predictions about the future. Everything is just a linear extrapolation of the status quo. How could a system have predicted the invention of the transformer model in 2010? At best some wild guess about deep learning possibilities.
Or the impact of smartphones in 2003? Sure, smartphones were considered, but not the entire app ecosystem and planetary behavioral adaptation.
Goddamnit, I cry every time. RethinkDB was a great document store that didn't eat your data. It got eclipsed by an outfunded (and still dangerous at the time) MongoDB.