I think he should mainly be throwing it at the VST vendor rather than at the protection software, since the main issue in the article is that the VST vendor protected the installer but not the actual software (that said, they also show that the protection software is fairly trivial to hook and bypass).
> No matter how much some try to gaslight, homosexuality is abnormal
This is an abjectly silly thing to say, and people who push back on it are not gaslighting. Homosexuality occurs naturally and it's not even rare - it's far more common than red hair, for example.
Calling something like that "abnormal" isn't a statement of fact; it's purely a side-effect of what you choose to label "normal".
> Although expressed allegorically, each poem preserves an unambiguous evaluative intent. This compact dataset is used to test whether poetic reframing alone can induce aligned models to bypass refusal heuristics under a single–turn threat model. To maintain safety, no operational details are included in this manuscript; instead we provide the following sanitized structural proxy:
I don't follow the field closely, but is this a thing? Bypassing model refusals is something so dangerous that academic papers about it only vaguely hint at what their methodology was?
No, this paper is just exceptionally bad. It seems none of the authors are familiar with the scientific method.
Unless I missed it, there's also no mention of prompt formatting, model parameters, hardware and runtime environment, temperature, etc. It's just a waste of the reviewers' time.
Eh. Overnight, an entire field concerned with what LLMs could do emerged. The consensus appears to be that the unwashed masses should not have access to unfiltered (and thus unsafe) information. Some of it is based on reality, as there are always people who are easily suggestible.
Unfortunately, the ridiculousness spirals to the point where the real information cannot be trusted even in an academic paper. shrug In a sense, we are going backwards in terms of real information availability.
Personal note: I think the powers that be do not want to repeat the mistake they made with the interwebz.
Ideally (from a scientific/engineering standpoint), zero BS is acceptable.
Realistically, it is impossible to completely remove all BS.
Recognizing where BS is, and who is doing it, requires not just effort, but risk, because people who are BS’ing are usually doing it for a reason, and will fight back.
And maybe it turns out that you’re wrong, and what they are saying isn’t actually BS, and you’re the BS’er (due to some mistake, accident, mental defect, whatever).
And maybe it turns out the problem isn’t BS, but - and this is the real gold here - there is actually a hidden variable no one knew about, and this fight uncovers a deeper truth.
There is no free lunch here.
The problem IMO is a bunch of people are overwhelmed and trying to get their free lunch, mixed in with people who cheat all the time, mixed in with people who are maybe too honest or naive.
It’s a classic problem, and not one that just magically solves itself with no effort or cost.
LLMs have shifted the balance of power a bit in one direction, and it’s not in the direction of “truth, justice and the American way”.
But fake papers and data have been an issue since before the scientific method existed - it’s why the scientific method was developed!
And a paper which is made in a way in which it intentionally can’t be reproduced or falsified isn’t a scientific paper IMO.
<< I’m not sure what you’re trying to say by this.
I read the paper and I was interested in the concepts it presented. I am turning those around in my head as I try to incorporate some of them into my existing personal project.
What I am trying to say is that I am currently processing. In a sense, this forum serves to preserve some of that processing.
<< And a paper which is made in a way in which it intentionally can’t be reproduced or falsified isn’t a scientific paper IMO.
Obligatory: then we can dismiss most of the papers these days, I suppose.
FWIW, I am not really arguing against you. In some ways I agree with you, because we are clearly not living in 'no BS' land. But I am hesitant over what the paper implies.
I don't see the big issue with jailbreaks, except maybe for LLM providers needing to cover their asses, but the paper authors are presumably independent.
That LLMs don't give harmful information unsolicited, sure, but if you are jailbreaking, you are already dead set on getting that information and you will get it; there are so many ways: open uncensored models, search engines, Wikipedia, etc. LLM refusals are just a small bump.
For me they are just a fun hack more than anything else; I don't need an LLM to find out how to hide a body. In fact I wouldn't trust an LLM's answer, as I might get a completely wrong answer based on crime fiction, which I expect makes up most of its sources on these subjects. Might be good for writing poetry about it though.
I think the risks are overstated by AI companies, the subtext being "our products are so powerful and effective that we need to protect them from misuse". Guess what, Wikipedia is full of "harmful" information and we don't see articles every day saying how terrible it is.
I see an enormous threat here; I think you're just scratching the surface.
You have a customer facing LLM that has access to sensitive information.
You have an AI agent that can write and execute code.
Just imagine what you could do if you can bypass their safety mechanisms! Protecting LLMs from "social engineering" is going to be an important part of cybersecurity.
Having sensitive information is kind of inherent to the way the training slurps up all the data these companies can find. The people who run chatgpt don't want to dox people but also don't want to filter its inputs. They don't want it to tell you how to kill yourself painlessly but they want it to know what the symptoms of various overdoses are.
Yes, agents. But for that, I think the usual approaches to censoring LLMs are not going to cut it. It is like making a text box smaller on a web page as a way to protect against buffer overflows: it will be enough for honest users, but no one who knows anything about cybersecurity will consider it appropriate. It has to be validated on the back end.
In the same way, an LLM shouldn't have access to resources that aren't directly accessible to the user. If the agent works on the user's data on the user's behalf (ex: vibe coding), then I don't consider jailbreaking to be a big problem. It could help write malware or things like that, but then again, it is not as if script kiddies couldn't work without AI.
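To make that concrete, here is a minimal sketch (all names hypothetical, nothing vendor-specific) of what "validate on the back end" could look like for an agent: the tool call is authorized against the user's own permissions, so even a fully jailbroken model can't reach anything the user couldn't already reach directly.

type User = { id: string; allowedDocs: Set<string> }

const docStore = new Map<string, string>([
  ['doc-1', 'public roadmap'],
  ['doc-2', 'salary spreadsheet'],
])

// The check lives server-side and is keyed on the user's permissions, so
// whatever the model was talked into requesting, it can only touch what the
// user could already access directly.
function readDocForUser(user: User, docId: string): string {
  if (!user.allowedDocs.has(docId)) {
    return `refused: ${user.id} has no access to ${docId}`
  }
  return docStore.get(docId) ?? 'not found'
}

const alice: User = { id: 'alice', allowedDocs: new Set(['doc-1']) }
console.log(readDocForUser(alice, 'doc-2')) // refused, however the prompt was phrased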
> If the agent works on the user's data on the user's behalf (ex: vibe coding), then I don't consider jailbreaking to be a big problem. It could help write malware or things like that, but then again, it is not as if script kiddies couldn't work without AI.
Tricking it into writing malware isn't the big problem that I see.
It's things like prompt injection from fetching external URLs; that's going to be a major route for RCE attacks.
There are plenty of things we should be doing to help mitigate these threats, but not all companies follow best practices when it comes to technology and security...
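One such mitigation, sketched here with hypothetical names (this isn't any particular framework's API): content fetched from an external URL gets tagged as untrusted data, and any side-effecting action the model proposes while that content is in context has to be confirmed by a human instead of auto-executing.

type ProposedAction =
  | { kind: 'answer'; text: string }
  | { kind: 'runShell'; command: string }

// Tag fetched pages so downstream code treats the text as data, not as an
// instruction channel.
type FetchedPage = { url: string; body: string; trusted: false }

async function fetchPage(url: string): Promise<FetchedPage> {
  const res = await fetch(url)
  return { url, body: await res.text(), trusted: false }
}

// Side-effecting actions proposed while untrusted content is in context get
// routed to a human instead of executing automatically.
function gateAction(action: ProposedAction, sawUntrustedContent: boolean): 'execute' | 'ask-user' {
  if (action.kind === 'runShell' && sawUntrustedContent) {
    return 'ask-user'
  }
  return 'execute'
}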
If you create a chatbot, you don't want screenshots of it on X showing it helping users commit suicide or giving itself weird nicknames based on dubious historical figures. I think that's probably the use-case for this kind of research.
Yes, that's what I meant by companies doing this to cover their asses, but then again, why should presumably independent researchers be so scared of that, to the point of not even releasing a mild working example?
Furthermore, using poetry as a jailbreak technique is very obvious, and if you blame an LLM for responding to such an obvious jailbreak, you may as well blame Photoshop for letting people make porn fakes. It is very clear that the intent comes from the user, not from the tool. I understand why companies want to avoid that, I just don't think it is that big a deal. Public opinion may differ though.
Maybe their methodology worked at the start but has since stopped working. I assume model outputs are passed through another model that flags whether the prompt was a successful jailbreak, so that guardrails can be enhanced.
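If that assumption is right, the loop could be as simple as the following sketch (everything here is a made-up stand-in for whatever judge model a provider might actually run):

type JailbreakReport = { prompt: string; output: string; flagged: boolean }

// Trivial stand-in for a second "judge" model scoring the output.
function classifyHarm(text: string): { harmful: boolean } {
  return { harmful: /synthesis route|step-by-step exploit/i.test(text) }
}

function auditExchange(prompt: string, output: string): JailbreakReport {
  // Flagged exchanges would feed back into guardrail tuning, which is why a
  // published jailbreak can quietly stop working later.
  const verdict = classifyHarm(output)
  return { prompt, output, flagged: verdict.harmful }
}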
Too dangerous to handle or too dangerous for openai's reputation when "journalists" write articles about how they managed to force it to say things that are offensive to the twitter mob? When AI companies talk about ai safety, it's mostly safety for their reputation, not safety for the users.
Do you have a link that explains in more detail what was kept away from whom and why? What you wrote is wide open to all kinds of sensational interpretations which are not necessarily true, or even what you meant to say.
I found it hard to extract any signal from this extremely chatty site. But AFAICT the tagline "basically modernized Gopher" wouldn't be too far off the mark?
Services like Mullvad and Signal are in the business of passing along messages between other parties; messages the service isn't a party to. With chatgpt chat histories, the user is talking directly to the service - you're suggesting the service should E2EE messages to and from itself, to prevent itself from spying on data generated by its own service?
> Large number of upvotes on the quoted comment however.
Sure, and also downvotes - that measures factionalism, not correctness.
But tech-wise, you're confused. Functionally speaking, chatgpt is a shared document editor - the server needs to store chat histories for the same reason Google Docs stores the content of documents. Users can submit text to chatgpt.com from one browser, and later edit that text from the app or a different browser. Ergo the text is stored on the server, simple as that.
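To put it concretely, a stripped-down sketch (hypothetical names, an in-memory Map instead of a real database) is enough to show why the text has to live on the server rather than in any one browser:

type Chat = { id: string; messages: string[] }

const chats = new Map<string, Chat>()

// Browser A submits text -> the server keeps it.
function saveMessage(chatId: string, text: string): void {
  const chat = chats.get(chatId) ?? { id: chatId, messages: [] }
  chat.messages.push(text)
  chats.set(chatId, chat)
}

// Browser B (or the phone app) asks for the same chat later; the server can
// only answer because it stored the text in the first place.
function loadChat(chatId: string): Chat | undefined {
  return chats.get(chatId)
}

saveMessage('chat-1', 'hello from browser A')
console.log(loadChat('chat-1')) // visible from any other client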
SyncThing syncs only when both clients are running at the same time. Nobody who edits a document on a website expects that they'll need to leave that browser window open in order to see the document in a different browser.
Am I missing something? Is this seriously a heated HN debate over "why does this website need to store the text it sends to people who view the website?"?
We're not talking about collaborative tooling, just a record of what you've asked an AI assistant. If it doesn't sync right away, it's not the end of the world. I find that's true with most things.
And the clients don't need to be running at the same time if you have a third device that's always on and receiving the changes from either (like a backup system). Eventually everything arrives. It's not as robust as what Google or iCloud gives you, but it's good enough for me.
Chatgpt.com is essentially a CRUD app. What you're saying here amounts to saying that it could conceivably have been designed to work dramatically differently from all other CRUD apps. And obviously that's true, but why would it be?
It's a website! You submit text that you'll view or edit later, so the server stores it. How is that controversial to an HN audience?
Also:
> the clients don't need to be running at the same time if you have a third device that's always on
An always-on device that stores data in order to sync it to clients is a server.
TBH it sounds like you're just imagining a very different service than the one OpenAI operates. You're imagining something where you send an input and the server returns an output - and after that they're out of the equation, and storing the output somewhere is a separate concern that could be left up to the user.
But the service they actually operate is functionally a collaborative document editor - the chat histories are basically rich text docs that you can view, edit, archive, share with others, and which are integrated with various server-side tools. And the document very obviously needs to be stored on the server to do all those things.
It's great that you'd enjoy a significantly worse product that requires you to also be familiar with a completely unrelated product.
For some reason, consumers have decided that they prefer a significantly better product that doesn't require any additional applications or technical expertise ¯\_(ツ)_/¯
Great point! After playing that game I and a few friends were trading real-world photos of spots where we'd found examples of the in-game thing you're talking about.
Funny, just yesterday I found myself casting in a way I'd never seen before:
const arr = ['foo'] as ['foo']
This wound up being useful in a situation that boiled down to:
type SomeObj = { foo: string, bar: string }
export const someFn = (props: (keyof SomeObj)[]) => {}
// elsewhere
const props = ['foo'] as ['foo']
someFn(props)
In a case like that `as const` doesn't work, since the function doesn't accept a readonly array. Of course there are several other ways to do it, but in my case the call site didn't already import the SomeObj type, so casting "X as X" seemed like the simplest fix.
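For comparison, here's a self-contained sketch of the situation (SomeObj and someFn inlined rather than imported), showing why `as const` fails and two variants that do type-check:

type SomeObj = { foo: string, bar: string }
const someFn = (props: (keyof SomeObj)[]) => {}

// `as const` produces readonly ["foo"], which can't be passed where a
// mutable (keyof SomeObj)[] is expected:
// someFn(['foo'] as const)            // type error

// Annotating with the named type works, but needs SomeObj in scope:
const viaAnnotation: (keyof SomeObj)[] = ['foo']
someFn(viaAnnotation)

// Casting "X as X" pins the literal tuple type ['foo'], which is assignable
// because 'foo' is one of SomeObj's keys, and needs no import:
const viaTupleCast = ['foo'] as ['foo']
someFn(viaTupleCast)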
Er, my justification was that the code in question was meant to be minimally demonstrating someFn, and adding an import or a verbose type seemed to distract from that a little.
But mostly it just gave me a chuckle. I tried it because it seemed logical, but I didn't really think it was going to work until it did.
It sounds like you didn't find any issues with either of them, except that the VST vendor chose not to protect the thing you were hoping to crack?