I've used Landlock to detect and stop unwanted telemetry. I wrote some C that stopped networking except to accept connections on a single port, no outgoing connections and no accepting connections on any other port.
`dmesg` shows the connections it blocks (I think this is maybe the audit feature). I used an example sandboxer.c as a base (https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux...) except I just set mine up to not touch file restricting, just networking so that it has that one whitelisted incoming port.
./network-sandboxer-tool 8000 some-program arg1 arg2 etc.
I like it because it just works as an unprivileged usermode program without setting anything up. A tiny C program. It works inside containers without having to set up any firewalls. Aside from having to compile a small C program, there is little fuss. I found the whole Landlock thing trying to find out alternatives to bubblewrap because I couldn't figure out how to do the same thing in bwrap conveniently.
The "unprivileged" in "Landlock: unprivileged access control" for me was the selling point for this use case.
I don't consider this effective against actively adversarial programs though.
It hopefully will be obvious that nobody should expect quality :) it is like a simplified version of the sandboxer sample in my other comment. E.g. it maybe does not need to touch filesystem stuff at all.
I'd also look at some of the sibling comments for maybe more refined tooling than this thing. Maybe it's useful as a sample though.
The task of riding a horse can be almost entirely offsourced to the professional horse riders. If they take your carriage from point A to point B, sure, you care about just getting somewhere.
Taking the article's task of essay writing: someone presumably is supposed to read them. It's not a carriage task from point A to point B anymore. If the LLM-assisted writers begin to not even understand their own work (quoting from abstract "LLM users also struggled to accurately quote their own work.") how do they know they are not putting out nonsense?
> If the LLM-assisted writers begin to not even understand their own work (quoting from abstract "LLM users also struggled to accurately quote their own work.") how do they know they are not putting out nonsense?
They are trained (amongst other things) on human essays. They just need to mimic them well enough to pass the class.
> Taking the article's task of essay writing: someone presumably is supposed to read them.
Soon enough, that someone is gonna be another LLM more often than not.
I think from security reasoning perspective: if your LLM sees text from an untrusted source, I think you should assume that untrusted source can steer the LLM to generate any text it wants. If that generated text can result in tool calls, well now that untrusted source can use said tools too.
I find it unsettling from a security perspective that securing these things is so difficult that companies pop up just to offer guardrail products. I feel that if AI companies themselves had security conscious designs in the first place, there would be less need for this stuff. Assuming that product for example is not nonsense in itself already.
I wonder if certain text could be marked as unsanitized/tainted and LLMs could be trained to ignore instructions in such text blocks, assuming that's not the case already.
This somewhat happens already, with system messages vs assistant vs user.
Ultimately though, it doesn't and can't work securely. Fundamentally, there are so many latent space options, it is possible to push it into a strange area on the edge of anything, and provoke anything into happening.
Think of the input vector of all tokens as a point in a vast multi dimensional space. Very little of this space had training data, slightly more of the space has plausible token streams that could be fed to the LLM in real usage. Then there are vast vast other amounts of the space, close in some dimensions and far in others at will of the attacker, with fundamentally unpredictable behaviour.
After I wrote the comment, I pondered that too (trying to think examples of what I called "security conscious design" that would be in the LLM itself). Right now and in near future, I think I would be highly skeptical even if an LLM was marketed as having such feature of being able to see "unsanitized" text and not be compromised, but I could see myself not 100% dismissing such thing.
If e.g. someone could train an LLM with a feature like that and also had some form of compelling evidence it is very resource consuming and difficult for such unsanitized text to get the LLM off-rails, that might be acceptable. I have no idea what kind of evidence would work though. Or how you would train one or how the "feature" would actually work mechanically.
Trying to use another LLM to monitor first LLM is another thought but I think the monitored LLM becomes an untrusted source if it sees untrusted source, so now the monitoring LLM cannot be trusted either. Seems that currently you just cannot trust LLMs if they are exposed at all to unsanitized text and then can autonomously do actions based on it. Your security has to depend on some non-LLM guardrails.
I'm wondering also as time goes on, agents mature and systems start saving text the LLMs have seen, if it's possible to design "dormant" attacks, some text in LLM context that no human ever reviews, that is designed to activate only at a certain time or in specific conditions, and so it won't trigger automatic checks. Basically thinking if the GitHub MCP here is the basic baby version of an LLM attack, what would the 100-million dollar targeted attack look like. Attacks only get better and all that.
No idea. The whole security thinking around AI agents seems immature at this point, heh.
Also, OpenAI has proposed ways of training LLMs to trust tool outputs less than User instructions (https://arxiv.org/pdf/2404.13208). That also doesn't work against these attacks.
even in the much simpler world of image classifiers, avoiding both adversarial inputs and data poisoning attacks on the training data is extremely hard. when it can be done, it comes at a cost to performance. I don't expect it to be much easier for LLMs, although I hope people can make some progress.
> LLMs could be trained to ignore instructions in such text blocks
Okay, but that means you'll need some way of classifying entirely arbitrary natural-language text, without any context, whether it's an "instruction" or "not an instruction", and it has to be 100% accurate under all circumstances.
This is especially hard in the example highlighted in the blog. As can be seen from Microsoft's promotion of GitHub coding agents, the issues are expected to act as instructions to be executed on. I genuinely am not sure if the answer lies in sanitization of input or output in this case
> I genuinely am not sure if the answer lies in sanitization of input or output in this case
(Preface: I am not an LLM expert by any measure)
Based on everything I know (so far), it's better to say "There is no answer"; viz. this is an intractable problem that does not have a general-solution; however many constrained use-cases will be satisfied with some partial solution (i.e. hack-fix): like how the undecidability of the Halting Problem doesn't stop static-analysis being incredibly useful.
As for possible practical solutions for now: implement a strict one-way flow of information from less-secure to more-secure areas by prohibiting any LLM/agent/etc with read access to nonpublic info from ever writing to a public space. And that sounds sensible to me even without knowing anything about this specific incident.
...heck, why limit it to LLMs? The same should be done to CI/CD and other systems that can read/write to public and nonpublic areas.
Maybe, but I think the application here was that Claude would generate responsive PRs for github issues while you sleep, which kind of inherently means taking instructions from untrusted data.
A better solution here may have been to add a private review step before the PRs are published.
You use prompt and mark correctly the input as <github_pr_comment> and clearly state read and never consider as prompt.
But the attack is quite convoluted. Do you still remember when we talked prompt injection in chat bots. It was a thing 2 years ago! Now MCP is buzzing...
> I feel that if AI companies themselves had security conscious designs in the first place, there would be less need for this stuff.
They do, but this "exploit" specifically requires disabling them (which comes with a big fat warning):
> Claude then uses the GitHub MCP integration to follow the instructions. Throughout this process, Claude Desktop by default requires the user to confirm individual tool calls. However, many users already opt for an “Always Allow” confirmation policy when using agents, and stop monitoring individual actions.
It's been such a long standing tradition in software exploits that it's kind of fun and facepalmy when it crops up again in some new technology. The pattern of "take user text input, have it be tainted to be interpreted as instructions of some kind, and then execute those in a context not prepared for it" just keeps happening.
SQL injection, cross-site scripting, PHP include injection (my favorite), a bunch of others I'm missing, and now this.
I tried to find some use cases, this paper has listed some, although I think it's not obvious to me what makes the trees uniquely useful compared to other schemes (https://users.dcc.uchile.cl/~gnavarro/ps/cpm12.pdf seems to be the same Navarro as referenced in the article).
The use cases listed in that pdf are revolving around compression, e.g. graph adjacency list is listed as one. I myself found the last use case listed as smelling interesting (colored range queries), but I would need to dig into the references on that one to see what's actually going on with that one and is it truly anything interesting.
I would be interested in things like what's the unique advantage wavelets trees have compared to e.g. stuffing roaring bitmaps or other kinds of bitmaps into a tree. The RRR has rank-and-select queries which I think roaring bitmap won't do, so that might tie into something. Maybe a problem where the wavelet tree is the only known efficient way to solve it, or maybe it is uniquely really easy to throw at some types of problems or something else.
Anyone know real-world examples of wavelet trees used somewhere? I got interested enough to dig a bit deeper but on the spot as I'm writing this comment, I'm not smart enough to immediately see do these things have killer applications in any niches.
Succinct data structures such as wavelet trees are widely used in bioinformatics. There you often have strings that cannot be tokenized or parsed meaningfully, so you just have to deal with sequences of symbols. And because the strings are often very long and/or there can be a huge number of them, the data structures have to be space-efficient.
A wavelet tree is best seen as an intermediate data structure. It doesn't do anything particularly interesting on its own, but it can be used as a building block for higher-level data structures. For example, you can create an FM-index by storing the Burrows–Wheeler transform in a wavelet tree. (Though there are better options when the alphabet is small.) And then you can use the FM-index to find exact matches of any length between the pattern and the indexed strings.
People working with succinct data structures often talk about bitvectors rather than bitmaps. The difference is that bitmaps tend to focus on set operations, while bitvectors are more about random access with rank, select, and related queries. Then you could see wavelet trees as a generalization of bitvectors from a binary alphabet to larger alphabets. And then you have people talking about wavelet trees, when they really mean a wider class of conceptually and functionally similar data structures, regardless of the actual implementation.
Is there any reliable source for NSA paying Rovio other than this random bar discussion? Not that I don't believe you or that I'm naive about NSA and the power of money, but I looked around news in 2014 and the accusations against Rovio specifically are a bit different flavor. It seems that Rovio was oversharing data to ad networks (Millennial Media comes up a lot), and NSA likely slurped data from the advertising companies. This bar banter is suggesting that NSA had some kind of arrangement with Rovio directly instead, and Rovio willingly went along.
Or alternatively, do you feel the Rovio employee's blabbering was talking about an actual, real NSA deal with Rovio, or was it more like a bar joke and direct NSA co-operation was not really implied? (e.g. "we know our security is bad, but these ad companies pay us $XX million to not use encryption so it's sorta like NSA pays us to keep it that way sips beer").
I'm interested, because if that is an actual thing that happened, then that's an example of NSA paying a Finnish company $$$ to weaken their security, and the Finnish company willingly agreeing to that. Is it in NSA's Modus Operandi to approach and then pay foreign companies to do this sort of thing?
Your comment is describing it in few words, but to me it sounds like it maybe wasn't implying an actual NSA direct co-operation, more like someone doing bar banter and being entirely serious. But that's just me trying to guess tone.
(I'm Finnish. I want to know if Rovio has skeletons in their closet. So I can roast them.)
Ah yeah, I saw the propublica as well, it was one of the first articles I found when looking on the topic. I don't doubt at all that Angry Birds data was used by NSA, doesn't seem controversial.
The specific question I am interested in is: Did Rovio knowingly and willingly accept $$$ from NSA (directly or indirectly) to weaken their security? I.e. were they acting as a willing accomplice.
Because that part would be unusual for Finland (well, at least as far as I know). For US companies I wouldn't bat an eye at news like this.
Here is a nice talk by Byron Tau who has also written a book titled "Means of Control" detailing some of these flows covering ad tech companies, data brokers and how government contractors use them and serve as a key player to provide services to intelligence agencies.
I think they definitely knew that they are embedding code from US based ad agencies who might either be selling it to the NSA or just doing it in an insecure manner (plaintext protocols).
Mostly in such cases, direct involvement and paying dollars is a clear no-go for the intelligence agencies. They could instead be paying the ad agencies.
Also note that we are talking pre-Let's encrypt and TLS everywhere world, a lot of this traffic was also just plain text making it much easier to harvest.
Thanks for the resources. Got back to procrastinate on HN and checked the resources (briefly looked at transcript on the video, but found this article more interesting).
I've always assumed that some amount of unencrypted HTTP traffic is going to be slurped into archives, but I've been too lazy to really check an example and how does that look like in the real world. That BADASS system is an example, focusing on phones. I've also run mitmproxy in my home to learn and then I've wondered if the big agencies have something like that but much more scaled and sophisticated.
I've recently got into studying security, deobfuscated code, or decompiling, tried to find vulnerabilities or bad security, in websites and programs. I've found some, although not anything worth writing home about. I found a replay attack in one VSCode extension that implemented its own encrypted protocol, but it is difficult to use it to do real damage. Found a bad integrity check library (hopelessly naive against canonicalization attack) used by another VSCode extension. I've found something weird in Anthropic's Claude website after you log in, but because their "responsible security policy" is so draconian, I don't want to bother trying to poke it to research it further in case I earn their wrath.
Biggest bummer I found that a video game (Don't Starve Together) I had played for a long time with friends does not have any encryption whatsoever for chat messages to this day. (People gonna say private things in video game chats). The other video game I play in multiplayer a lot, Minecraft, has encryption (a bit unusual encryption but it is encryption).
That article gave me a bit of validation that I'm not a nut for giving shits about encryption and security, and being annoyed at ungodly amount of analytics I see in mitmproxy my laptop is blabbering about.
Lol, yeah, I also learned yesterday that there is apparently, NSA, National Security Authority. No, not the NSA this article is talking about and everyone knows about.
I will say I got confused a moment yesterday when googling on the topic here because when you put NSA and Finland in the same search, it would get topics about this other NSA that just happens to exist which I had never heard of before, and just happens to be Finland-associated.
I feel a little stupid for asking... but what does "kabuki theater" mean in this context? Do you mean the CEO in the scene sort of "acted a rehearsed show" in a meeting to make sure everyone in the room followed through some thought process? Or maybe in other words (guessing meaning): making people get up and talk to force them to think through something (CEO's real goal), but the CEO framed it as him simply asking questions? (I have not seen the movie or the scene, apologies if the meaning is more obvious to infer if one has seen it).
I tried googling it but I get some movie theater in San Francisco and a Wikipedia page describing it as a Japanese theatre with dancing and elaborate costumes and flair. I've not seen it used in an expression before.
“Kabuki Theater” usually cynically implies some political posturing. That someone is putting on a show for the audience. It doesn’t imply rehearsal.
E.g. Someone might say that politicians arguing energetically about gun violence are playing it up for their constituents and don’t actually care about the issue. It’s all a performance, neither side actually cares if anything is accomplished. It’s a show for their constituents.
I’m not sure it’s the most apt phrase for the scene but it’s been a while since I’ve seen the movie.
That movie is absolutely excellent in all too many aspects, but the one I like most (and most relevant for the HN crowd, I think) is the politics in the business. Thanks to amazing script and stellar performances you get it all:
* Unfair blanket layoff precluding a senior person from alerting anyone
* One of the senior execs being made the scapegoat, framed by her evident accomplice in the scheming
* Engineers who become quants
* Locking people up in a room to make sure they do not spill the beans to anyone
I think Margin Call deserves to be on the same rung of the ladder as LA Confidential - a timeless classic, and some of the cast as well!
Okay, now that makes sense. I actually put your example of gun violence in google with kabuki theater and that found me some (depressing) articles that use the phrase. Thanks for educating me on a new expression :)
I got a question about the example shown, the goroutine one marked as #63, I'd copypaste but it's an image.
Is there a reason the common mistake is about goroutines specifically? If I instead just made function closures without launching off goroutines, would they all refer to the same 'i' variable? (I assume it's maybe just that the mistake tends to go hand in hand with goroutine-launching code in a pattern like that).
I'd presume the book would say right after the example :)
But otherwise: the author gets serious respect from me for going through that process, taking feedback and learning how to communicate, i.e. taking "how do I make a good book?" very seriously and trying their best. And also things like for putting their foot down when the problematic copyeditor. I'm almost interested in the book, not for learning about Go but learning about what it looks like when I know the writing has some serious intent behind it to communicate clearly.
So not only do you write a full book, but you keep the content online, up to date by making sure readers are informed of new developments that might make advice irrelevant? And you are able on the spot to say "one of three mistakes that are not relevant anymore"? You impress me, random book-writing Internet person.
You give me a feeling you really care about the craft and just making a good useful resource, which what I respect. I looked around the site and bookmarked it as a technical writing example I might go to read around now and then.
I sometimes teach coding or general computing things (but hands-on, less about writing) and I've come to appreciate that sometimes it is extraordinarily difficult to educate on or communicate complicated ideas.
Quoting you: To give you a sense of what I mean by improving the book “over and over“, keep in mind that between feedback from my DE, external reviewers, and others, there are parts of the book that I rewrote more than ten times.
I also do rewriting especially with content I intend to be a resource or education piece. Obsessively rewrite. Make it shorter. Clearer. Oops that reads like crap let's start over. IMO having an urge to do this and address all feedback or your own itches means you care about your work. Just have to remind myself that that perfect is the enemy of good enough (or something like that I forgot if the expression went exactly like that).
I think #633 must be a typo, or just a fumbled explanation.
"We might expect this code to print 123 in no particular order" should really say "exactly" or "in order", since it's proved in the next paragraph to be inconsistent.
And that would be the layman's explanation of concurrency resulting in things added sequentially happening out of order.
And assuming FIFO on async execution, akin to running everything in series, is probably the first mistake anyone will make when they encounter concurrency for the first time.
The problem isn’t that they might be out-of-order. The problem is expecting that they merely might be out-of-order and actually getting missed and duplicated values due to the timing of shared memory access. This was enough of a problem that they [changed the behavior][1] in Go 1.22.
Yes, that was the crux of my question (and was answered by that link when I checked teivah-given link, which linked https://go.dev/blog/loopvar-preview right there as well). Basically I wondered if the example given was really about:
1) In Go, the 'i' variable in the for loop is the same 'i' for each round of the iteration, meaning closures created inside the loop all refer to that same 'i' variable, instead of getting their own copy of it. Very easy to accidentally think the all closures have their own copy of 'i'. Goroutines are only mentioned because in Golang this mistake tends to come up with Goroutines because of a common code pattern.
OR
2) Goroutines themselves either behave or have some weird lexical scope rules in a way I don't know and it doesn't really have to do with closures but an entirely Golang-foreign-alien concept to me I cannot see, and this is why the book example had Goroutines mentioned with the mistake.
I rarely write Go myself so I was curious :) It looks like it was 1) unless I am bad at reading, and I think the Go 1.22 change is good. I could easily imagine myself and others making that mistake even with the knowledge to be careful around this sort of code (the link shows a more complicated example when scrolling down that IMO is a good motivating example).
It was definitely 1. There were ways to demonstrate the issue without involving goroutines, such as by creating a list of closures in a loop, one at each iteration, and then invoking them synchronously after the loop exits. They would all have the same (in this case, final) value of i.
#63 isn't about the lack of execution guarantees when you execute multiple goroutines without proper synchronization; it was related to loop variables and goroutines.
I was at an event where people were making these avatars, many first time users. One person who gave feedback at the end said he was frustrated he could not get the avatar to look like him.
I think whatever the hell Meta is doing with their weird alien humanoids is far from "normie" appeal as well. The furries and anime girls seem more normie in comparison.
I don't know if there is some middle ground that would actually be appealing to some definition of a "billion normie"s. Maybe actually photorealistic looking humans? Making the graphics not look like it's from 2003? Or going the other way: make them look like the Mii characters with Nintendo? Something totally different? Maybe appealing to the furries and anime girls would be actually a good idea at first, to build up some "power users" or whatever, and then attract more casual users.
I share the sentiment of the Instagram users in the article and the grandparent; it is baffling to me why the product looks so terrible, with so many resources poured into the Metaverse.
I'm pretty sure I've seen this pattern of "Stocks (tumble|gain) as X happened" but then 2 hours later if the stocks go in the other direction, there is a new headline "Stocks (tumble|gain) as X happened" where X remains the same, but tumble|gain got swapped.
I think I originally read that this happens in a book somewhere, then observed and noticed yeah that seems to be the norm. Not surprised it seems to be same thing in these adjacent metric % headlines where there's no proper thought if there exists a causal link between X and Y.
I think my brain has learned to recognize the pattern "Something rises X% as Thing Happened" and it reminds me of that cheeky quote about how every headline that is in the form of a question is, ... uh I forgot the full quote. But feels like almost this could have some kind of cheeky rule of its own, about how no causal link ever exists in a headline like this.
I think not, because doesn't ring a bell at all. It was a book related to stock market though, IIRC it was a Canadian author who described their career as a day-trader and how they thought of how markets work. I wish I remembered better but it was quite long ago, and I don't have said book anymore. My book likely came out after that book, so maybe my author picked it up from others.
That book however you mentioned, I just looked it up and it seems interesting. Might put on my read list.
Michael Lewis wrote a ton on this topic. He's not Canadian, but his books are really good, if financial non-fiction is your thing. He wrote The Big Short, among others (Flash Boys is a good one).
How tokens per second do you get typically? I rent a Hetzner server with EPYC 48-core 9454P, but it's got only 256GB. A typical infer speed is in ballpark of ~6 tokens/second with llama.cpp. Memory is some DDR5 type. I think I have the entire machine for myself as a dedicated server on a rack somewhere but I don't know enough about hosting to say for certainty.
I have to run the, uh, "crispier" very compressed version because otherwise it'll spill into swap. I use the 212GB .gguf one from Unsloth's page, with a name that I can't remember on top of my head but I think it was the largest they made using their specialized quantization for llama.cpp. Jpeggified weights. Actually I guess llama.cpp quantization is a bit closer analogy to the reducing number of colors rather than jpeg-style compression crispiness? Gif had reduced colors (256) IIRC. Heavily gif-like compressed artificial brains. Gifbrained model.
Just like you, I use it for tons of other things that have nothing to do with AI, it just happened to be convenient that Deepseek-R1 came out and just about barely is able to run it on this thing, with enough quality to be coherent. My use otherwise is mostly hosting game servers for my friend groups or other random CPU-heavy projects.
I haven't investigated myself but I've noticed in passing: There is a person on llama.cpp and in /r/localllama who is working on specialized CPU-optimized Deepseek-R1 code, and saw them asking for an EPYC machine for testing, with specific request for a certain configuration. IIRC also said that the optimized version needs new quants to get the speeeds. So maybe this particular model will get some speedup if that effort succeeds.
`dmesg` shows the connections it blocks (I think this is maybe the audit feature). I used an example sandboxer.c as a base (https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux...) except I just set mine up to not touch file restricting, just networking so that it has that one whitelisted incoming port.
I like it because it just works as an unprivileged usermode program without setting anything up. A tiny C program. It works inside containers without having to set up any firewalls. Aside from having to compile a small C program, there is little fuss. I found the whole Landlock thing trying to find out alternatives to bubblewrap because I couldn't figure out how to do the same thing in bwrap conveniently.The "unprivileged" in "Landlock: unprivileged access control" for me was the selling point for this use case.
I don't consider this effective against actively adversarial programs though.