notachatbot1234 · on Aug 22, 2024

> The dataset is released under the Creative Commons BY-SA

How can this be legal? All imagery is taken from (usually) non-free movie trailers.

mewse-hn · on Aug 22, 2024

I had the same question, it seems silly to build a collection of copyrighted content and apply your own copyright to it.

I guess the argument is the same one all the AI people are relying on: I built this collection of fair use material and I am applying my copyright to the product of my work. I wouldn't want to argue that one in court.

b112 · on Aug 22, 2024

Yet such court actions are going to happen again and again, until VC backed AI firms either all go bankrupt, or win.

RIMR · on Aug 22, 2024

This dataset is already at risk of existing outside of fair use, but trying to apply your own copyright is pretty much asking to get sued.

cardiffspaceman · on Aug 23, 2024

IANAL: As far as I know, making the compilation earns you a copyright. But for a reader to make a copy, they need licenses from you and from the copyright holders of all the images. So in this case maybe the release notice that you quoted means that there are still all the image copyrights to obtain licenses for.

dinobones · on Aug 22, 2024

IANAL, but https://en.wikipedia.org/wiki/Fair_use is a thing. Not sure if it applies in this case though.

LogicalRisk · on Aug 22, 2024

Fair use is a justification for why copyright restrictions may not apply in a given scenario, not a license to apply new legal restrictions to work you do not own.

carom · on Aug 23, 2024

Datasets and curations are copyrightable. I would think of it as a right to use the curation, not a right to use the actors' likeness.

notachatbot1234 · on Aug 19, 2024

A webpage that uses numeric identifiers for external references that are found only when scrolling to the very bottom of the page and show their URLs as plain text. Now that is a train wreck.

Hyperlinks are the cornerstone of the web. Don't be afraid of using them!

benrutter · on Aug 19, 2024

I think it's an intentional aesthetic choice[0].

Hyperlinks would be convenient, but something about the raw text / ascii art vibe makes me happy every time I read a blog post from j3s even if it doesn't have the conveniences of the modern web.

[0] https://j3s.sh/about.html

adrianN · on Aug 19, 2024

Links are a convenience of the ancient web.

benrutter · on Aug 19, 2024

Haha, ok that's fair enough! I guess what I meant was something like "conveniences that are on most other contemporary websites"

Xenoamorphous · on Aug 19, 2024

Hyperlinks are not modern web though.

SSLy · on Aug 19, 2024

They in fact predate the modern or even ancient WWW.

hinkley · on Aug 19, 2024

What?

atoav · on Aug 19, 2024

You mean footnotes? As they have been used for centuries in print?

The difference between them and a simple hyperlink is that they can and often will provide some additional context, that is out of the scope of the original text. Ideally on a website meant for computer screens you wouldn't have them on the end, but in the margins, next to the information, but for short stuff it is okay to put them at the end of the chapter – bonus points if the reference numbers can be clicked and take you to the footnote, extra bonus points if there is an arrow taking you up again.

But this is scientific literature style writing, not everything needs footnotes.
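A minimal sketch of the clickable footnotes with a return arrow described above. The function name `render_footnotes`, the `ref-N` / `note-N` id scheme, and the `[N]` marker syntax are assumptions chosen for illustration, not anything the blog in question actually uses:

```python
import html
import re


def render_footnotes(text: str, notes: list[str]) -> str:
    """Turn [n] markers into links to footnotes, each with a back arrow.

    Hypothetical helper: assumes each marker appears once, so the
    generated ids stay unique within the page.
    """

    def link_marker(match: re.Match) -> str:
        n = int(match.group(1))
        if n >= len(notes):
            return match.group(0)  # leave unknown markers untouched
        # The id on <sup> is the target of the footnote's back arrow.
        return f'<sup id="ref-{n}"><a href="#note-{n}">[{n}]</a></sup>'

    body = re.sub(r"\[(\d+)\]", link_marker, html.escape(text))
    items = "\n".join(
        # &#8593; is an upwards arrow linking back to the reference.
        f'<li id="note-{n}">{html.escape(note)} <a href="#ref-{n}">&#8593;</a></li>'
        for n, note in enumerate(notes)
    )
    return f'<p>{body}</p>\n<ol start="0">\n{items}\n</ol>'


page = render_footnotes(
    "I think it's an intentional aesthetic choice[0].",
    ["https://j3s.sh/about.html"],
)
print(page)
```

Both directions are ordinary fragment links, so this works without JavaScript.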

Someone · on Aug 19, 2024

> The difference between them and a simple hyperlink is that they can and often will provide some additional context,

  <a href="Foo" title="go to Foo">Foo</a>

will give you additional context on hover (on systems that support that)

hk__2 · on Aug 19, 2024

> will give you additional context on hover (on systems that support that)

"hover" has no meaning on touch-based interfaces.

c22 · on Aug 19, 2024

On my touch-based device a long-press seems to work the same as a hover.

majewsky · on Aug 19, 2024

You should see the title text if you long-press on the link, no?

hk__2 · on Aug 19, 2024

On iOS it opens the link in a pop-in.

tempfile · on Aug 19, 2024

That sounds even better?

atoav · on Aug 19, 2024

Good to use I guess, but you can't rely on that with the likes of smartphones and tablets.

efilife · on Aug 19, 2024

> The difference between them and a simple hyperlink is that they can and often will provide some additional context

It's possible to 'link' to an HTML tag, so the page jumps to the bottom, where the additional context is, much like Wikipedia does

atoav · on Aug 19, 2024

Yeah, if you read my post again, you will find that I mentioned this already.

efilife · on Aug 19, 2024

Right, sorry. My only defense is that I just woke up half-dead

atoav · on Aug 20, 2024

It happens to the best of us : )

creesch · on Aug 19, 2024

Also using a monospaced font for both the written text and command line output is certainly a choice. I get that it is often an aesthetic choice, but given that a blog post is written with the idea to be read, I don't think it is a particularly good one. Although the last time I made a remark about that on HN it became clear to me that a lot of people don't see the issue. Even if there are decades' worth (at this point) of research that makes it clear that a sans serif font (or even a serif font on modern displays) works better for readability. ¯\_(ツ)_/¯

It is clear that the author is very explicitly going for the aesthetics of a terminal, given that all formatting of the text is ASCII based down to the line length being hard coded as if we are dealing with a hard limit of columns.

Personally, I'd prefer something more like this: https://www.creesch.com/dump/img/img_66c3127604542.png.

badcppdev · on Aug 19, 2024

Agreed. And having the link at the bottom as https://archive.is/XYABC without information about the link is also a questionable choice.

gary_0 · on Aug 19, 2024

I did like the little CSS animation, though. The fish bounces without JavaScript!

notachatbot1234 · on Aug 13, 2024

What does that cost?

dllu · on Aug 14, 2024

About $2000 I think, I bought it in 2017.

notachatbot1234 · on Aug 12, 2024

> Shameless promotion: rad is an auto D.J. that actually does follow your preferences https://rad.fm/

Site tried to detect my location. That's creepy and invasive. Tab closed instantly.

pjmq · on Aug 12, 2024

Rad.FM maker here. Thanks for at least trying it. A) the web app is still in alpha. B) Rad needs your location to make what it says to the listener relevant to where you are and your current time. Also, news & weather need your location.

Thanks for the feedback though, I'll delay the location request to later in the flow + add a popover which explains why it's needed when we get to beta.

notachatbot1234 · on July 24, 2024

Watching videos on phones, which "natively" have a vertical orientation, is pretty popular. I expect the majority of videos are watched this way.

notachatbot1234 · on July 24, 2024

The subject is in a vertical orientation, so it is perfect and desirable that the original video has all its resolution dedicated to capturing the phenomenon in the best quality possible. A horizontal video would mean that there are fewer pixels on the subject matter.
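As a rough worked example of that pixel budget (the 1080x1920 and 1920x1080 frame sizes and the 9:16 subject shape are assumptions picked for illustration, not figures from the video in question):

```python
def subject_pixels(frame_w: int, frame_h: int, subject_aspect: float) -> float:
    """Pixels covered by the largest subject of the given width/height
    aspect ratio that fits entirely inside the frame."""
    frame_aspect = frame_w / frame_h
    if subject_aspect <= frame_aspect:
        # Subject is relatively taller than the frame: height is the limit.
        height = frame_h
        width = frame_h * subject_aspect
    else:
        # Subject is relatively wider than the frame: width is the limit.
        width = frame_w
        height = frame_w / subject_aspect
    return width * height


TALL_SUBJECT = 9 / 16  # assumed subject shape: 9 wide by 16 tall

portrait = subject_pixels(1080, 1920, TALL_SUBJECT)   # vertical video
landscape = subject_pixels(1920, 1080, TALL_SUBJECT)  # horizontal video
ratio = landscape / portrait

print(f"portrait:  {portrait:,.0f} px on the subject")
print(f"landscape: {landscape:,.0f} px on the subject")
print(f"landscape keeps {ratio:.1%} of the portrait pixels")
```

Under these assumptions the horizontal frame spends a bit under a third of the pixels on the subject that the vertical frame does; the rest of the frame is background on either side.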

notachatbot1234 · on July 17, 2024

This ticks 9/10 boxes on my detector for typical LLM generated SEO content spam. :\

setalp · on July 17, 2024

OP here.

Honestly, I got similar feedback when I got this reviewed internally. At this point I am not sure how to write so that it doesn't seem LLM generated.

Would be helpful if you could share why you thought this was LLM generated. The suggestions I have gotten so far have been to remove bullet points and sections - which I feel breaks readability.

mberlove · on July 17, 2024

I don't think it's so bad, but if I had to guess, it's from the division / breakdown of sections and lists, which reads a lot like the formulaic approach you get from an LLM (which is not necessarily bad, just common in the output). E.g. "Docker and Docker Compose can simplify the process of installing and managing services. They allow you to:" etc etc. This may sound like an LLM covering all its bases rather than a human explaining subject matter.

That's just my take, again I don't think it's that bad. The article would be a useful breakdown for beginners.

(Also, I'm sure you know, LLM content sounds that way because the LLM was trained on content just like this, so it's not really surprising that a guide generated by an LLM would sound like the kind of guide that was used to train an LLM...)

wonger_ · on July 17, 2024

Not parent commenter, but I've been trying to verbalize why it feels LLM-like.

- h2 titles feel as basic as possible, just "what self-hosting, who self-hosting, why self-hosting, ..."

- SEO spam often overuses keywords; on this page, it feels like "self-hosting" is used a bit too often, even if it's well-intentioned

- the text ends in a classic LLM warning "remember to be careful"

- predictable sentence patterns

Some of these things are good for readability. I guess this article feels a bit too plain? I think tech company blog posts add a unique style and voice these days, because otherwise they'll blend in with the average SEO/LLM content.

Also editing nits:

    > self hosing
    > Self Hosting 
    > atleast

Good self-hosting tips, though. Thanks for sharing.

setalp · on July 17, 2024

Thanks. This is really helpful.

The overuse of "Self Hosting" is fair. Better H2 titles would have made it less frequent. Will be more thoughtful about this the next time.

The unique style and voice is what I am struggling with. Have always been instructed to write in a plain tone and simple English so that it's easier to read through.

MzHN · on July 17, 2024

I tried reading the article with the GP's comment in mind. For most of the sections it didn't feel like there was anything that would flag it as LLM generated for me.

But when I got to "How to Start Self-Hosting?", which is the section I was most interested in, I got a strong sense of déjà vu.

Reading this section felt exactly like I feel when I hit a bad prompt on ChatGPT. I feel I'm being given a huge dump of keywords but nothing that lets me make any progress. Reading it I felt the same frustration I do with ChatGPT as I have to prompt it again with "Can you elaborate on bullet point 6" to get anything useful out of it.

With ChatGPT the reason is usually a prompt that was either too broad/open-ended or a difficult topic for ChatGPT to answer. And it has a tight limit on how long the answer can be, which is understandable. For an article though it feels a bit jarring and there is no immediate way to ask for details.

I think the rest of the article is fine really. Sure the word of caution is exactly what LLMs do but unlike LLMs, which usually state the obvious, it has a lot of useful information.

_xivi · on July 17, 2024

Normally I'd not pay too much attention to these comments but the assessment here is spot on. I'd say LLM articles in general are:

  1- Always longer than necessary with a lot of fluff

  2- Favor lists and hierarchy

Because they're trained on mostly SEO spam and buzzfeed-style articles

I asked Gemini to "write an article about self hosting" and the output structure and content are eerily similar

Here is a side by side comparison: https://i.postimg.cc/kXXpWgnZ/why-it-look-like-LLM-generated...

notachatbot1234 · on July 17, 2024

Are those really the same?

- "commodity exports" -> "raw materials"?

- "hard currency" -> "foreign currency"?

corimaith · on July 17, 2024

This does illustrate a problem when talking about complex topics or mechanisms: the need for specificity. Using short, simple sentences comes at the risk of making things seem overly vague and hand-wavy, or worse, misrepresenting the concept.

In continental philosophy or mathematical papers this gets all too apparent, as a lot of arguments hinge on very fine differences and nuances that need to be specified, or else people get the wrong idea.

notachatbot1234 · on July 15, 2024

Is this criticism still relevant? 2011 is 13 years (as in: a teenager's life) ago.

aitchnyu · on July 15, 2024

FWIW most discussion about inproc is around 2010, per $SEARCHENGINE. Guess that architecture didn't catch on.

notachatbot1234 · on July 12, 2024

> The prompt comprehension is incredible! #auraflow

> "a cat that is half orange tabby and half black, split down the middle. Holding a martini glass with a ball of yarn in it. He has a monocle on his left eye, and a blue top hat, art nouveau style "

Plus an image that somewhat resembles that prompt. The cat has a human-like hand with a chopped off thumb and 6 fingers in total, differently colored eyes, a branch in front of its face, and the ball of yarn is somehow floating in mid-air.

viraptor · on July 12, 2024

These are somewhat valid issues. But given the currently available open models, this is a massive improvement. The human-like hand and changing the styles on the sides of the head aren't even bad - those are valid artistic choices you'd see on similar illustrations - they're just badly executed here.

Kiro · on July 12, 2024

Somewhat resembles? Come on.