
It's a system design oversight. Permanence has been sought for a long time. I'm not claiming I've got all the solutions, but it's yet another example of how the web is incomplete.

If you read the contemporaneous history from the early 1990s, when the concrete of the web was still wet, it becomes obvious that the fundamentals are worth revisiting.

For instance, DNS could include archival records, or the URL schema could have an optional versioning parameter. Static snapshots could be built into webservers, and archival services would be as standard as AWS or a CDN; something all respectable organizations would have, like HTTPS.
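To hand-wave at the DNS idea: here's a rough Python sketch that looks for an archive pointer published in a domain's TXT records. The "archive=" convention is entirely made up for illustration (it's not an existing standard), and it leans on the third-party dnspython package:

    # Sketch of the "archival record in DNS" idea, assuming a hypothetical
    # convention where a domain advertises its archive endpoint in a TXT
    # record, e.g. "archive=https://snapshots.example.org/example.com/".
    # The key name is invented for illustration, not an existing standard.
    import dns.resolver  # pip install dnspython

    def find_archive_endpoint(domain: str) -> str | None:
        """Return the archive URL advertised by `domain`, if any."""
        try:
            answers = dns.resolver.resolve(domain, "TXT")
        except (dns.resolver.NXDOMAIN, dns.resolver.NoAnswer):
            return None
        for rdata in answers:
            # TXT record data arrives as a tuple of byte strings; join and decode.
            text = b"".join(rdata.strings).decode("utf-8", errors="replace")
            if text.startswith("archive="):
                return text[len("archive="):]
        return None

    if __name__ == "__main__":
        print(find_archive_endpoint("example.com"))

A browser or crawler that understood a convention like this could fall back to the advertised archive whenever the live site 404s.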

These only sound nutty because it's not 1993 anymore and we've presumed these topics are closed.

We shouldn't presume that. There are lots of problems that can be fixed with the right effort and intentions, especially because we no longer live in an era where having 1 GB of storage means you have an impressively expensive array of disks.

Many things that were unreasonable then are practically free now.



I am constantly pondering what an intuitive and "cool" web without broken URLs could look like, and whether mechanisms for it are or are not embedded in the original standards.

The farthest I got is that we should probably see two addresses where we currently see one in our URL bar: a Locator and an Identifier, and the whole web-related technology stack should revolve around this distinction with immutability in mind.

- On the server side, Locators should always respond with locations of Identifiers or other Locators. So, redirects. Caching headers make sense here, denoting e.g. "don't even ask for the next five minutes."

- Content served under an Identifier should be immutable, so an HTTP 200 response always contains the same data. Caching headers make no sense here, since the response will always be the same (or an error).

In practice, navigating to https://news.ycombinator.com/ (a locator) should always result in something like an HTTP 302 to https://news.ycombinator.com/?<timestamp>, or any other identifier denoting the unique state of the response page. Any dynamic page engine should first provide a mechanism for producing identifiers for any resulting response.
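A toy sketch of what I mean, using only Python's standard library. The in-memory snapshot store and the handler names are made up for illustration; a real engine would persist identifiers rather than mint one per request:

    # Locator/identifier split: the bare path answers with a short-lived 302
    # to "/?<timestamp>", and identifier responses are immutable and
    # cacheable forever.
    import time
    from http.server import BaseHTTPRequestHandler, HTTPServer
    from urllib.parse import urlparse

    PAGES = {}  # identifier (timestamp string) -> frozen page bytes

    def make_snapshot() -> str:
        """Freeze the current page state under a new identifier."""
        ident = str(int(time.time()))
        PAGES[ident] = f"<html><body>State at {ident}</body></html>".encode()
        return ident

    class LocatorIdentifierHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            query = urlparse(self.path).query
            if not query:
                # Locator: redirect to the identifier of the current state.
                self.send_response(302)
                self.send_header("Location", f"/?{make_snapshot()}")
                self.send_header("Cache-Control", "max-age=300")  # "don't ask again for 5 minutes"
                self.end_headers()
            elif query in PAGES:
                # Identifier: always the same bytes, cacheable forever.
                body = PAGES[query]
                self.send_response(200)
                self.send_header("Content-Type", "text/html; charset=utf-8")
                self.send_header("Cache-Control", "public, max-age=31536000, immutable")
                self.end_headers()
                self.wfile.write(body)
            else:
                self.send_error(404, "Unknown identifier")

    if __name__ == "__main__":
        HTTPServer(("localhost", 8000), LocatorIdentifierHandler).serve_forever()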

I feel there are some fundamental problems in this utopian concept (especially around non-anonymous access), but I would nevertheless like to know if it could be viable in at least some subset of the current/past web.


PURLs have been a thing for a while. And rel="canonical", while a frighteningly recent invention relatively speaking, does exist.


I think this is being handled the right way now by the Internet Archive. It's sort of like a library that keeps old copies of newspapers around on microfiche, or a museum that has samples of bugs collected in Indonesia 100 years ago. They have a dedicated mission to preserve, around which supportive people can organize effort and money.

I don’t think this can be solved by decentralized protocols. A lot of folks just won’t put in the effort. Quite a few companies already actively delete old content; there’s no way they are going to opt into web server software that prevents that.


That's just a function of expectations.

Expectations are set, not interrogated. Let me give you an example.

Companies and organizations with domains are expected to also be running mail on that domain.

Why? I can sit around and make up a bunch of reasons, but none of them are given when that mail service is being set up; it's done out of expectation, just like how someone might pay $295,000.00 for the .com they want but wouldn't pay even $2.95 for the .me or .us.

Are the .com keys closer together? Easier to type? Supported by more browsers? No.

They're mostly arbitrary social norms that get institutionalized.

They can go away. Having an FTP service or a fax line, for instance, used to be one of them. Those weren't thrown in the trash for cost-cutting reasons; the norms changed.

The question is where do we want these norms to go and what are we doing to encourage it?

Here's how this could materialize - say there's an optional archival fee when registering a domain. Search engines could then prioritize domains that pay this fee, under the logic that by paying it, the website owners are standing behind what they publish.

These types of schemes are pretty easy to come up with - the point is that the solutions are plentiful; it's all a matter of focus, effort, and intentions.



