Hacker News | notachatbot1234's comments

> The dataset is released under the Creative Commons BY-SA

How can this be legal? All imagery is taken from (usually) non-free movie trailers.


I had the same question, it seems silly to build a collection of copyrighted content and apply your own copyright to it.

I guess the argument is the same one all the AI people are relying on: I built this collection of fair use material and I am applying my copyright to the product of my work. I wouldn't want to argue that one in court.


Yet such court actions are going to happen again and again, until VC backed AI firms either all go bankrupt, or win.


This dataset is already at risk of existing outside of fair use, but trying to apply your own copyright is pretty much asking to get sued.


IANAL: As far as I know, making the compilation earns you a copyright. But for a reader to make a copy, they need licenses from you and from the copyright holders of all the images. So in this case maybe the release notice that you quoted means that there are still all the image copyrights to obtain licenses for.


IANAL, but https://en.wikipedia.org/wiki/Fair_use is a thing. Not sure if it applies in this case though.


Fair use is a justification for why copyright restrictions may not apply in a given scenario, not a license to apply new legal restrictions to work you do not own.


Datasets and curations are copyrightable. I would think of it as a right to use the curation, not a right to use the actors' likeness.


A webpage that uses numeric identifiers for external references that are found only when scrolling to the very bottom of the page and show their URLs as plain text. Now that is a train wreck.

Hyperlinks are the cornerstone of the web. Don't be afraid of using them!


I think it's an intentional aesthetic choice[0].

Hyperlinks would be convenient, but something about the raw text / ascii art vibe makes me happy every time I read a blog post from j3s even if it doesn't have the conveniences of the modern web.

[0] https://j3s.sh/about.html


Links are a convenience of the ancient web.


Haha, ok that's fair enough! I guess what I meant was something like "conveniences that are on most other contemporary websites"


Hyperlinks are not modern web though.


They in fact predate the modern or even ancient WWW.


What?


You mean footnotes? As they have been used for centuries in print?

The difference between them and a simple hyperlink is that they can and often will provide some additional context that is out of the scope of the original text. Ideally, on a website meant for computer screens, you wouldn't have them at the end but in the margins, next to the information. For short stuff it is okay to put them at the end of the chapter – bonus points if the reference numbers can be clicked and take you to the footnote, extra bonus points if there is an arrow taking you back up again.

But this is scientific literature style writing, not everything needs footnotes.
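
The clickable reference numbers and the arrow back up described above can be sketched in a few lines. This is only an illustration, not how any particular blog does it: the function name `link_footnotes`, the `ref-N` / `note-N` id scheme and the Python 3.9+ type hints are all assumptions made up for the example, and it assumes each `[N]` marker appears once in the body.

```python
import html
import re


def link_footnotes(body: str, notes: dict[int, str]) -> str:
    """Render body text plus footnotes as HTML with links in both directions.

    Hypothetical helper: the "ref-N" and "note-N" ids are an assumed
    naming scheme, not a standard.
    """

    def to_link(match: re.Match) -> str:
        n = int(match.group(1))
        if n not in notes:
            # Leave markers without a matching footnote untouched.
            return match.group(0)
        # The marker links down to the footnote and is itself a jump target.
        return f'<a id="ref-{n}" href="#note-{n}">[{n}]</a>'

    # Escape the text first, then replace the [N] markers with links.
    lines = ["<p>" + re.sub(r"\[(\d+)\]", to_link, html.escape(body)) + "</p>"]
    for n in sorted(notes):
        # Each footnote ends with an up arrow linking back to its marker.
        lines.append(
            f'<p id="note-{n}">[{n}] {html.escape(notes[n])} '
            f'<a href="#ref-{n}">&uarr;</a></p>'
        )
    return "\n".join(lines)


if __name__ == "__main__":
    print(link_footnotes("An intentional choice[0].", {0: "https://j3s.sh/about.html"}))
```

Both directions rely on plain fragment links (`href="#note-0"`), so nothing here needs JavaScript.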


> The difference between them and a simple hyperlink is that they can and often will provide some additional context,

  <a href="Foo" title="go to Foo">Foo</a>
will give you additional context on hover (on systems that support that)


> will give you additional context on hover (on systems that support that)

"hover" has no meaning on touch-based interfaces.


On my touch-based device a long-press seems to work the same as a hover.


You should see the title text if you long-press on the link, no?


On iOS it opens the link in a pop-in.


That sounds even better?


Good to use I guess, but you can't rely on that with the likes of smartphones and tablets.


> The difference between them and a simple hyperlink is that they can and often will provide some additional context

It's possible to 'link' to an HTML tag, so the page jumps to the bottom, where the additional context is, much like Wikipedia does


Yeah, if you read my post again, you will find that I mentioned this already.


Right, sorry. My only defense is that I just woke up half-dead


It happens to the best of us : )


Also using a monospaced font for both the written text and command line output is certainly a choice. I get that it is often an aesthetic choice, but given that a blog post is written to be read, I don't think it is a particularly good one. Although the last time I made a remark about that on HN it became clear to me that a lot of people don't see the issue. Even though there are decades' worth (at this point) of research making it clear that a sans serif font (or even a serif font on modern displays) works better for readability. ¯\_(ツ)_/¯

It is clear that the author is very explicitly going for the aesthetics of a terminal, given that all formatting of the text is ASCII based down to the line length being hard coded as if we are dealing with a hard limit of columns.

Personally, I'd prefer something more like this: https://www.creesch.com/dump/img/img_66c3127604542.png.


Agreed. And having the link at the bottom as https://archive.is/XYABC without information about the link is also a questionable choice.


I did like the little CSS animation, though. The fish bounces without JavaScript!


What does that cost?


About $2000 I think, I bought it in 2017.


> Shameless promotion: rad is an auto D.J. that actually does follow your preferences https://rad.fm/

Site tried to detect my location. That's creepy and invasive. Tab closed instantly.


Rad.FM maker here. Thanks for at least trying it. A) the web app is still in alpha. B) Rad needs your location to make what it says to the listener relevant to where you are and your current time. Also, news & weather need your location.

Thanks for the feedback though, I'll delay the location request to later in the flow + add a popover which explains why it's needed when we get to beta.


Watching videos on phones, which "natively" have a vertical orientation, is pretty popular. I expect the majority of videos are watched this way.


The subject is in a vertical orientation, so it is perfect and desirable that the original video has all its resolution dedicated to capturing the phenomenon in the best quality possible. A horizontal video would mean that there are fewer pixels on the subject matter.


This ticks 9/10 boxes on my detector for typical LLM generated SEO content spam. :\


OP here.

Honestly, I got similar feedback when I got this reviewed internally. At this point I am not sure how to write so that it doesn't seem LLM generated.

Would be helpful if you could share why you thought this was LLM generated. The suggestions I have gotten so far have been to remove bullet points and sections - which I feel breaks readability.


I don't think it's so bad, but if I had to guess, it's from the division / breakdown of sections and lists, which reads a lot like the formulaic approach you get from an LLM (which is not necessarily bad, just common in the output). E.g. "Docker and Docker Compose can simplify the process of installing and managing services. They allow you to:" etc etc. This may sound like an LLM covering all its bases rather than a human explaining subject matter.

That's just my take, again I don't think it's that bad. The article would be a useful breakdown for beginners.

(Also, I'm sure you know, LLM content sounds that way because the LLM was trained on content just like this, so it's not really surprising that a guide generated by an LLM would sound like the kind of guide that was used to train an LLM...)


Not parent commenter, but I've been trying to verbalize why it feels LLM-like.

- h2 titles feel as basic as possible, just "what self-hosting, who self-hosting, why self-hosting, ..."

- SEO spam often overuses keywords; on this page, it feels like "self-hosting" is used a bit too often, even if it's well-intentioned

- the text ends in a classic LLM warning "remember to be careful"

- predictable sentence patterns

Some of these things are good for readability. I guess this article feels a bit too plain? I think tech company blog posts add a unique style and voice these days, because otherwise they'll blend in with the average SEO/LLM content.

Also editing nits:

    > self hosing
    > Self Hosting 
    > atleast
Good self-hosting tips, though. Thanks for sharing.


Thanks. This is really helpful.

The overuse of "Self Hosting" is fair. Better H2 titles would have made it less frequent. Will be more thoughtful about this the next time.

The unique style and voice is what I am struggling with. Have always been instructed to write in a plain tone and simple English so that it's easier to read through.


I tried reading the article with the GP's comment in mind. For most of the sections it didn't feel like there was anything that would flag it as LLM generated for me.

But when I got to "How to Start Self-Hosting?", which is the section I was most interested in, I got a strong sense of déjà vu.

Reading this section felt exactly like I feel when I hit a bad prompt on ChatGPT. I feel I'm being given a huge dump of keywords but nothing that lets me make any progress. Reading it I felt the same frustration I do with ChatGPT as I have to prompt it again with "Can you elaborate on bullet point 6" to get anything useful out of it.

With ChatGPT the reason is usually a prompt that was either too broad/open-ended or a difficult topic for ChatGPT to answer. And it has a tight limit on how long the answer can be, which is understandable. For an article though it feels a bit jarring and there is no immediate way to ask for details.

I think the rest of the article is fine really. Sure the word of caution is exactly what LLMs do but unlike LLMs, which usually state the obvious, it has a lot of useful information.


Normally I'd not pay too much attention to these comments but the assessment here is spot on. I'd say LLM articles in general are:

  1- Always longer than necessary with a lot of fluff

  2- Favor lists and hierarchy

Because they're trained on mostly SEO spam and BuzzFeed-style articles

I asked Gemini to "write an article about self hosting" and the output structure and content is eerily similar

Here is a side by side comparison: https://i.postimg.cc/kXXpWgnZ/why-it-look-like-LLM-generated...


Are those really the same?

- "commodity exports" -> "raw materials"?

- "hard currency" -> "foreign currency"?


This does illustrate a problem when talking about complex topics or mechanisms: the need for specificity. Using short, simple sentences comes at the risk of making things seem overly vague and hand-wavy, or worse, misrepresenting the concept.

In continental philosophy or mathematical papers this gets all too apparent, as a lot of arguments hinge on very fine differences and nuances that need to be specified, or else people get the wrong idea.


Is this criticism still relevant? 2011 is 13 years (as in: a teenager's life) ago.


FWIW most discussion about inproc is around 2010, per $SEARCHENGINE. Guess that architecture didn't catch on.


> The prompt comprehension is incredible! #auraflow

> "a cat that is half orange tabby and half black, split down the middle. Holding a martini glass with a ball of yarn in it. He has a monocle on his left eye, and a blue top hat, art nouveau style "

Plus an image that somewhat resembles that prompt. The cat has a human-like hand with a chopped off thumb and 6 fingers in total, differently colored eyes, a branch in front of its face, the ball of yarn is somehow floating in mid-air.


These are somewhat valid issues. But given the currently available open models, this is a massive improvement. The human-like hand and the changing styles on the sides of the head aren't even bad - those are valid artistic choices you'd see on similar illustrations - they're just badly executed here.


Somewhat resembles? Come on.

