Just a warning, the license [1] specifically blocks EU use:
> 3. Conditions for License Grant. You represent and warrant that You will not, access, download, install, run, deploy, integrate, modify, or otherwise use the Model, directly or indirectly, within the European Union.
Even so, why would the licensor put that in and enforce it through a license? It's on the licensee to check the laws and regulations of the jurisdiction they themselves operate in.
The EU AI Act is supposed to affect all AI "providers", which includes any "natural or legal person, public authority, agency or other body that develops an AI system or a general-purpose AI model or that has an AI system or a general-purpose AI model developed and places it on the market or puts the AI system into service under its own name or trademark, whether for payment or free of charge" [0].
This would plausibly include anyone developing an LLM, even if they aren't selling access to it or building applications based on it. There are several exemptions, and the Act ostensibly avoids creating burdens for most general-purpose LLMs, but the point is that Huawei wants to avoid any worry by not "plac[ing] it on the market" in the first place.
I don’t agree. Tools like DeepL were better than Google Translate long before chatbots became a thing, and they still are. The French-made Mistral AI is pretty decent as well.
FWIW, I refactored 500+ JUnit 4 tests to JUnit 5 with a locally running Mistral 8B on an M3 MBP. It worked flawlessly, though I can't vouch for other use cases.
And who thinks, even for a second, that a European (in this case) will not download, install, and try to run this just because the LICENSE says you can't?
FYI, this is not intended to be offensive to Europeans, I am European myself. That is not the point. The point is, who gives a damn about the LICENSE in reality, on their PERSONAL computer? Serious question.
The licence is not there for enforcement from their side. It's a legal protection for Huawei. Essentially "We told you it's not for the EU. If you get sued don't try to put it on us."
Also any company of a serious size will have lawyers interested in licences of everything you're running.
They might license it to companies in the US, but don't want to have to deal with the changes and bureaucracy needed to support individuals.
The statement's purpose is to say the equivalent of "if you're a European and do run it, it's on you, this is not a product we release or support for the European market, don't expect support, liability, etc".
I'm really torn on the whole thing. I consider myself a patriotic American and would never do anything to undermine the security of my country or its allies (using the same definition of national security that the serious sworn oaths use, "all enemies foreign and domestic", which makes NSA backdoors that compromise American devices squarely a "domestic enemy").
But loyalties don't change facts, and China is where serious hackers are rising on merit, doing a lot with limited resources, giving zero fucks about empty slick talk.
If we wanted to hobble the PRC's technical rise we should have subsidized wasteful NVIDIA use and had Altman/YC be in charge: they'd still be gladhanding about how to pump their portfolio companies' sticker price and avoid "systemic shocks" to the stock market anchored on NVDA.
What? I do not find anything confusing. You live in a Marvel world if you think a LICENSE is going to stop people from using a product. But like you said, it is not intended to be for enforcement purposes, but Huawei is trying to save its own ass.
So what is your answer? Mostly companies only? That is a fair answer, but you are the one who said this:
> You'll be both breaking their licence and potentially your local European data laws.
Again, who cares, dude? Companies might, but individuals probably don't give a rat's ass. So why leave that comment?
And just for the record, if you quote someone, quote them verbatim, otherwise it is not a quote.
For those who don't remember, this was a real thing in the late 80s and 90s in relation to cryptography.
There were serious laws limiting the export of "modern" cryptography software from the USA.
Some of us had to face up to the serious challenge of connecting to an FTP server and downloading PGP and risking violating US law to download a software package.
A few years later we had to decide "Do you want the secure Netscape, or the insecure Netscape?".
A lot of companies and research institutes in the EU would like to be able to use a locally hosted LLM for their employees so they don't have to worry what data they give away.
Also it is not rational for any individual to buy the hardware for running a serious LLM and then let it idle 99.9% of the day.
I am not up to date with the models, but I have heard good stories about a couple of open source models. You should ask Simon Willison. I hope he will be summoned (@simonw).
> A lot of companies and research institutes in the EU would like to be able to use a locally hosted LLM for their employees so they don't have to worry what data they give away.
They will certainly not violate EU laws and also probably not the licence.
It's plausible deniability. Someone at Huawei presumably thinks there's a chance that exporting this to Europe might be a legal problem at some point in the future. So they added a restriction, enough for plausible deniability.
It's not exactly "plausible deniability" in the common sense of the term.
It's not supposed to make them appear as plausibly denying that some European can download and use this.
Its role is to signal that if someone does, it's on them, not Huawei, and they won't get any support or be able to make liability claims, etc., as they could if it were a product intended for their use.
GDPR is not the issue here; the new AI Act is. Since this is an open-weight release it is not bound by the training data disclosure rules, but it probably didn't go through the evaluation that is required above a certain number of FLOPs. That's why many recent big player model releases had a staggered release in the EU.
Picture your PC as a cheery little planet in the EU’s cosmic backwater, sipping a digital Pan-Galactic Gargle Blaster. You download Pangu Pro MoE, hit “run,” and expect to chat with an AI wiser than Deep Thought. Instead, you’ve hailed a Vogon Demolition Fleet. Your machine starts moaning like Marvin with a hangover, your screen spews gibberish that could pass for Vogon poetry, and your poor rig might implode faster than Earth making way for a hyperspace bypass.
The fallout? This AI’s sneakier than a two-headed president—it could snitch to its creators quicker than you can say “Don’t Panic.” If they spot your EU coordinates, you’re in for a galactic stink-eye, with your setup potentially bricked or your data hitchhiking to a dodgy server at the edge of the galaxy. Worse, if the code’s got a nasty streak, your PC could end up a smoking crater, reciting bad poetry in binary.
To translate for those not familiar with the writings of Douglas Adams:
nord is suggesting it's possible that the physical computer running this model could be used as a "hub" for potential spyware, or be overloaded with workloads that are not related to the actual task of running the model (and instead may be some form of malware performing other computational tasks). It could potentially perform data exfiltration, or act discriminatorily based on your perceived location (such as if you're located within the EU). At worst, data loss or firmware corruption/infection may be of concern in case of license violation.
I'm not sure I would outright disagree that this is possible, but with some caveats. I would think the reason the license forbids usage within the EU is the EU AI Act (here is a resource to read through it: https://artificialintelligenceact.eu/ai-act-explorer/).
How will the "open weights" know that the PC is running within the EU?
Again, you are not talking about software that actually runs on your PC, but about the file that the software reads and loads into memory for its own use.
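To make that distinction concrete, here is a minimal sketch of what "weights are a file the software reads" means. The byte layout and the names `save_weights` and `load_weights` are hypothetical, made up for illustration; this is not the format Pangu or any real model uses.

```python
import struct

def save_weights(values):
    # Hypothetical toy format: a little-endian uint32 count,
    # followed by that many little-endian float32 values.
    return struct.pack("<I", len(values)) + struct.pack(f"<{len(values)}f", *values)

def load_weights(blob):
    # The loader only interprets bytes as numbers. Nothing read
    # from the file is ever executed as code.
    (count,) = struct.unpack_from("<I", blob, 0)
    return list(struct.unpack_from(f"<{count}f", blob, 4))

blob = save_weights([0.5, -1.25, 2.0])
weights = load_weights(blob)
print(weights)
```

The caveat is that this only holds for tensor-only formats. Some serialization formats (Python's pickle, for example) can execute code on load, and none of this says anything about the inference code that ships alongside the weights.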
No it's actually worse. Approximately three seconds after you install the model in offline mode on your computer, a small detector van will come and park outside your door with an antenna on the roof, and relay your position to a Chinese ICBM for immediate targeting.
Sorry, sounds like total bullshit. The weights aren't going to do anything. And if you are worried about the code, with current deployment practices of curl | sudo bash there is much lower-hanging fruit out there. That's not even mentioning the possibility of running the model on a PC without internet access (no matter how good the new Chinese AI is, it's still not good enough yet to convince you to let it out of the box).
Don't give it MCP access then (and I struggle to understand why anyone would give a stochastic model such access even if it is trained on very American NSA-certified hardware approved by Sam Altman himself).
The same thing breaking any license does. If you do it in your basement, nothing by definition. If you incorporate it in a service or distribute it as part of a project, well then you're on the hook. (and that is what license holders tend to care about)
I called him out in another thread. It makes absolutely no sense. He is talking against himself, judging by his comments.
To answer your question, he modified my comment (see the parentheses):
"> The point is, who gives a damn about (doing an illegal thing) in reality, on their (private property where nobody is likely to see that)?"
So... at best what he said is purely theoretical. He admitted it himself: "nobody is likely to see that". I am not sure I agree with it, but then again, in reality, no one gives a fuck, at least not in Europe.
There are likely multiple potential issues here, but one specific example: Processing and storage of PII without consent/authorisation is not allowed, regardless of whether you do it yourself or for others. And you can't guarantee that this model does not contain private information hoovered up by accident.
There is not a single AI model that fully complies with GDPR. How can you inform everyone, even those not named by actual name but otherwise identifiable, that their data is being processed and give them the ability to object when the data they train on isn’t public?
Literally the same for all other open weights, this is just legal ass covering where most others don’t even do that.
It isn't acknowledging. It is just legalese to wash their hands of whatever EU restrictions and requirements may otherwise be applicable here.
I would argue it is possible to run a business and be sustainable on open source, it's just harder and is not so compatible with the growth that many want.
I don't have an issue with this kind of license being used where open source does not suit, but I don't think we should change/widen the definition of "open source" to suit the sustainability needs of those that open source isn't compatible with, at the expense of the freedoms and open rights it provides.
The problem is that if you're not already differentiably the best at hosting your service right when you launch, someone else that's better at hosting can just do it and take all your business.
And hosting while keeping your prices down is not just a whole different skill set; anyone who's already big will have pricing deals with AWS, so they will beat you even if you host in exactly the same way.
It's probably less differentiable in the case of something like Gumroad, which is less likely to have big scaling problems, but for things like a distributed database, you run a serious risk of someone who is paying AWS half of what you are per compute hour just deploying the Helm chart and undercutting you completely.
I've been sustaining myself for a couple of years now on my open source project (BookStack). Still going in a positive direction.
Other than that, some that come to mind: Proxmox, OPNsense, Snipe-IT, GitLab, Canonical, CodeWeavers/Wine, Plausible, Home Assistant/Open Home Foundation/Nabu Casa, FreeBSD Foundation, Laravel, Blender, Godot.
Within that list there is a whole mix of business plans: some offer hardware, some are open core, some offer related paid services, some offer hosting, some offer support, etc.
Additionally, this submission's title was changed from "Gumroad Did Not Become Open Source Today" to "Gumroad’s license wouldn’t meet the widely regarded definition of open source"
I'm not sure, maybe they don't want to take a hard stance on the issue either way (to indicate how open source is defined on HN). Dang has been receptive to input in the past, though, and has updated titles that (I believed) misrepresented "open source".
This post also seemed to be thrown off the front page for some reason.
It's customary on HN to avoid a repetition of a topic that's already being actively discussed. The original post is still on the front page and the licensing issue is being heavily discussed there. I've linked to your post from that thread.
Sure, but the title change is strange. I'd even say changing the meaning makes it outrage bait in the first place, because now HN has taken a stance, where before it was just the article author's opinion.
The title we changed it to is a verbatim sentence from the post (which is what we always try to do when we change a title), and thus is also/still the article author's opinion.
But with this title, the discussion can be drawn towards the question of whether the license would or wouldn’t meet the widely regarded definition of open source, rather than whether it meets each commenter’s own definition of open source, which would be a much less gratifying discussion.
NB: A subtitle is one option HN prefers, but there are other options, including lifting a line from the article itself. "Always" is too strong a qualifier.
(I've frequently suggested title changes to HN's mod team, I'm pretty familiar with this.)
The license used [1] would mean this very much wouldn't be widely considered open source, since the license sets limits on use and does not seem to provide open modification nor distribution.
I don't think it's even source-available? The repo has docs, a bunch of Lua scripts (for what software?), a small PHP module and a compiled "geo-ip firewall" binary. Most of the features mentioned on the GitHub page appear to only be in the paid version of the software, and this limited "free" version is delivered as a mystery-meat Docker image pulled from Huawei Cloud.
At best this is an advertisement that lies about being open source.
This is partly open sourced, not fully. All the rules are open sourced. We pull the Docker image from Huawei Cloud because downloading from its mirrors is faster.
Your readme states "MIT License - See LICENSE file for details" but there is no such license file. I've been seeing this a lot lately; did you use an LLM to generate this part of the readme? If so, was MIT a conscious choice of yours?
I looked at the page for my software you provide (at https://octabyte.io/fully-managed-open-source-services/appli...) and noticed the "BookStack Website" button leads to a kind-of iframe version of the site which also breaks links (and maybe other functionality) on the website. This is kind of sketchy IMO, since you're not leading users to the actual project site, but to some kind of modified/framed version. Please don't do this; just lead people to the site, if that's what the button says, instead of a modified and potentially broken version of it.
I also see a lot of applications that are not generally considered open source listed on this service, including those with licensing considerations in regards to providing a hosted (or assisted hosting) service, like HashiCorp Vault for example. There are also some where, unless you're manually patching the underlying projects, you'd technically be using their code which is under a non-open/enterprise-agreement license (like those I've included on my blogpost here: https://danb.me/blog/poisoned-source/)
Thanks for sharing my project! I started building BookStack just over 9 years ago to suit a need at work, and have been improving & maintaining it since. I left full time employment three years ago and have been focusing on BookStack since, with my living costs now covered via project donations, sponsorships & support services, and the growth of these continues as shown in my blogpost here: https://www.bookstackapp.com/blog/9-years-of-bookstack/#fina....
The platform has been designed for ease-of-use, with mixed-technical-skill workplace use in mind. The design and content structure are (purposefully) quite opinionated, though, so it does not suit all use-cases, but for many it works quite well.
Technically, it's built as quite a simple PHP/Laravel/MySQL stack with custom JavaScript sprinkled in where needed. The default WYSIWYG editor is TinyMCE based, although due to TinyMCE license changes I'm currently building a new editor based on a fork of Lexical.
The software is platform-agnostic; I've run it on Debian & Ubuntu, RHEL & Fedora, Arch, OpenBSD and Windows systems. I stay out of system specific packaging methods though to avoid the extra maintenance burden, although community offers do form to enable this in some cases (like with BookStack in Arch's AUR).
Just an advisory so folks don't get caught out, as far as I can tell the offering advertised as the "open-source version" defaults to [1] (and heavily makes use of) the Elastic license. More context here [2].
[1] https://isitreallyfoss.com/projects/dokploy/
[2] https://github.com/Dokploy/dokploy/discussions/3