More

astromaniak · on Aug 18, 2024

The article misses multi-modal thing. Which is the future. Sure they can be considered a separate things, like today. But that's probably not the best approach. Support from framework may include partial training, easy components swap, intermediate data caching, dynamic architecture, automatic work balance and scaling.

astromaniak · on Aug 16, 2024

For some reasons LLMs get a lot of attention. But.. while simplicity is great it has limits. To make model reason you have to put it in a loop with fallbacks. It has to try possibilities and fallback from false branches. Which can be done on a level higher. This can be either algorithm, another model, or another thread in the same model. To some degree it can be done by prompting in the same thread. Like asking LLMs to first print high level algorithm and then do it step by step.

layer8 · on Aug 16, 2024

Iteration is important, but I don’t think that it can substantively compensate for the abstraction limitations outlined in the GP comment.

Salgat · on Aug 17, 2024

LLMs already do this. Their many wide layers allow for this, and as a final fallback, their output adjusts based on every token they generate (it's not all decided at once). All your statement really means is a vague "well it should do it more!" which yeah, is the goal of each iteration of GPT etc.

llm_trw · on Aug 16, 2024

LLMs get a lot of attention because they were the first architecture that could scale to a trillion parameters while still improving with every added parameter.

refulgentis · on Aug 16, 2024

> To make model reason you have to put it in a loop with fallbacks

Source? TFA, i.e. the thing we're commenting on, tried to, and seems to, show the opposite

astromaniak · on Aug 17, 2024

When the task, or part of it, is np complete there is no way around. Model has to try all options till it find working one. In a loop. And this can be multi-step with partial fallback. That's how humans are thinking. They can see only to some depth. They may first determine promising directions. Select one, go dipper. Fallback if it doesn't work. Pattern matching mentioned is simplest one step solution. LLMs are doing it with no problems.

cma · on Aug 17, 2024

The comment may just be pattern matching on the topic, not reasoning about TFA.

astromaniak · on Aug 10, 2024

This is good for datacenters, but.. NVidia stopped doing anything for consumers market.

astromaniak · on Aug 4, 2024

-> pc: to have wireless tablet, touchpad, joystick, etc.

pc ->: to have wireless robot control, printer, other devices which have drivers, but aren't easily programmable.

astromaniak · on Aug 4, 2024

FLUX image generator just came out of Black Forest Labs. They are working on video. So, you are right, this will become another battlefield soon.

astromaniak · on Aug 4, 2024

It's Copilot here, which is MS+OAI. But it's good that we have a healthy competition.

astromaniak · on Aug 4, 2024

It all depends on how far back we want to go to find the 'true' owners. Is second generation living there enough. How about third not living there? There are parallels here. Jewish are willing to look as far as two millennials back. Most americans don't get even close to Columbus days. Anyway, the 'facts on the ground' are the facts.

astromaniak · on Aug 4, 2024

So far no reasons to think so. The article lacks details. But for complex life to emerge in small pocket that oasis would have to exist for many millions of years. No proof of that. Then organism is not a bunch of single cellars in one place. This is another question here. Next there are intermediate forms of life which bundle together only for some time. This thing looks like single organism while it isn't.

astromaniak · on Aug 1, 2024

> it is not open source

It would be nice here if you give some examples of what you call open source model. Please ;) Because the impression is that these things do not exist, it's just a dream which does not deserve such a nice term..

Hizonner · on Aug 1, 2024

As far as I know, none have been released. And it doesn't even really make sense, because, as I said, the models aren't copyrightable to begin with and therefore aren't licensable either.

However, plenty of open source software exists. The fact that open source models don't exist doesn't excuse attempts to falsely claim the prestige of the phrase "open source".

kube-system · on Aug 2, 2024

> the models aren't copyrightable to begin with

What criteria for copyright protection are they missing?

astromaniak · on Aug 3, 2024

> As far as I know, none have been released.

I can tell you a secret. What you call 'open source' models are impossible. Because massive randomness is a part of training process. They are not reproducible. Having everything you cannot even tell if the given model was trained on the given dataset. Copyright is a different thing.

And a bad news, what's coming is even worst. Those will be the whole things with self awareness and personal experience. They can be copied, but not reproduced. More over, it's hard or almost impossible to detect if something undeclared was planted in their 'minds'.

All together means 'open source' model in strict interpretation is a myth, great idea which happen to be not. Like Turing test.

> However, plenty of open source software exists.

Attempt to switch topic detected.

PS: as for that massive downvote, I even wasn't rude, don't care. This account will be abandoned soon regardless, like all before and after.

jillesvangurp · on Aug 2, 2024

> models aren't copyrightable to begin with

You are wrong about that. It's a file with numbers. Which makes it a database or dataset and very much protected by copyright. That's why licenses are needed. For the phone book, things like open street maps, and indeed AI models.

> The fact that open source models don't exist

The fact that many people (myself included) routinely download and use models distributed under OSI approved licenses (Apache V2, MIT, etc.) makes that statement verifiably wrong. And yes, I do check the license of stuff that I use as I work with companies that care about such matters.

> As far as I know ...

Now you know better.

JimDabell · on Aug 2, 2024

> You are wrong about that. It's a file with numbers. Which makes it a database or dataset and very much protected by copyright. That's why licenses are needed. For the phone book, things like open street maps, and indeed AI models.

This is only true in jurisdictions that follow the sweat of the brow doctrine, where effort alone without creativity is considered enough for copyright. In other places, such as the USA, collections of facts are not copyrightable and a minimal amount of creativity is required for something to qualify as copyrightable. The phone book is an example that is often used, actually, to demonstrate the difference.

https://en.wikipedia.org/wiki/Sweat_of_the_brow

Hizonner · on Aug 2, 2024

> Which makes it a database or dataset and very much protected by copyright.

Not every collection of numbers is a database, and a database is not the same thing as a dataset.

Databases have limited copyright-like protection in some places. Under TRIPS, that extends to only databases that are "creative by virtue of the selection or arrangement of their contents" or something along those lines. In the US they talk specifically about curation.

ML models do not meet either requirement by any reasonable interpretation.

> The fact that many people (myself included) routinely download and use models distributed under OSI approved licenses (Apache V2, MIT, etc.) makes that statement verifiably wrong.

The "source code" of an ML model is most reasonably interpreted as including all of the training data, which are never, ever available.

Now you know better.

[On edit: By the way, the people creating these works had better hope they're outside copyright, because if not, each one of them is a derivative work of (at least some large and almost impossible to identify subset of) its training data, so they need licenses from all the copyright holders of that training material, which few of them have or can get.]

kube-system · on Aug 2, 2024

If we stop unnecessarily anthropomorphizing software, I think it is plainly obvious these are derivative works. You take the training material, run it through a piece of software, and it produces an output based on that input. Just because the black box in the middle is big and fancy doesn't mean that somehow the output isn't a result of the input.

However, transformativeness is a factor in whether or not there is a fair-use exception for the derivative work. And these models are highly transformative, so this is a strong argument for their fair-use.

Hizonner · on Aug 2, 2024

Maybe, but...

"Fair use" is pretty much entirely a US concept, and similar concepts in other countries aren't isomorphic to it.

The model does have a radically different form from its inputs. So you could easily imagine that being "transformative enough" for US fair use. A lot of the other fair use elements look pretty easy to apply, too. Although there's still the question of whether all the intermediate copies you made to create the model were fair use...

In fact, I'll even concede that a court could find that a model wasn't a derivative work of its inputs to begin with, and not even have to get to the fair use question. The argument would be that the model doesn't actually reproduce any of the creative elements of any particular training input.

I do think a finding like that would be a much bigger stretch than a finding that the model was copyrightable. I could easily see a world where the model was found derivative but was not found copyrightable. And it's actually not clear to me at all that the model has to be copyrightable to infringe the copyright in something else, so that's another mess.

Somewhat related, even if the model itself isn't infringing, it's definitely possible to have most models create outputs that are very similar to (some specific examples in) their training data... in ways that obviously aren't transformative. Outputs that might compete with the original training data and otherwise fail to be fair use. So even if the model is in the clear, users might still have to watch out.

simonw · on Aug 2, 2024

I'm personally comfortable calling a model "open source" if the license is compatible with the https://opensource.org/ definition.

The Llama models aren't. Some of the Mistral models are (the Apache 2 ones). Microsoft Phi-3 is - it's MIT.

dagaci · on Aug 2, 2024

Open source must include source material so that another can reproduce that the model. I would expect that to be a minimum.

simonw · on Aug 2, 2024

I agree, but that can't happen with the vast majority of these models because they're trained on unlicensed data so they can't slap an open source license on the training data and distribute it.

I've decided to draw my personal line at Open Source Initiative compliance for the license they release the model itself under.

I respect the opinion that it's not truly open source unless they release the training data as well, but I've decided not to make that part of my own personal litmus test here.

My reasoning is that knowing something is "open source" helps me decide what I legally can or cannot do with it when building my own software. Not having access to the training data downs affect my legal rights, it just affects my ability to recompile myself. And I don't have millions of dollars of GPUs so that isn't so important to me, personally.

Hizonner · on Aug 2, 2024

> that can't happen with the vast majority of these models because they're trained on unlicensed data

Tough beans? There's lots of actual software that can't be open source because it embeds stuff with incompatible restrictions, but nobody tries to redefine "open source" because of that.

... and, on a vaguely similar-flavored note, you'd better hope that the models you're using end up found to be noninfringing or fair use or something with respect to those "unlicensed data", because otherwise you're in a world of hurt. It's actually a lot easier to argue that the models aren't copyrightable than it is to argue that they're not derivative of the input.

> I've decided to draw my personal line at Open Source Initiative compliance for the license they release the model itself under.

You're allowed to draw your personal line about what you'll use anywhere you want, but that doesn't mean that you should try to redefine "open source" or support anybody who does.

astromaniak · on July 28, 2024

> Why are we blaming people who need medical care

We don't blame them, but without any control we have the highest prices and not so good overall results. Add to that Trump with his industry 'self regulation' idea and you know what's coming...