>once they figure out how to control potentially harmful generations
Is it just me, or does anyone else think that this is an impossible and futile task? I don't have a solid grasp on what kind of censorship is possible with this technology, but the goal seems to be on par with making sure nobody says anything mean online. People are extremely creative and are going to find the prompts that generate the "harmful" images.
Reminds me of a toy doll I heard about that had a speech generator you could program to say sentences, but with "harmful" words removed, keeping only wholesome ones.
I immediately came up with "Call the football team, I'm wet" and "Daddy, let's play hide the sausage" as example workarounds.
It's entirely pointless. Humans are vastly superior in their ability to subvert and corrupt. Even if you were able to catch regular "harmful" images, humans would create new categories of imagery which people would experience as "harmful", employing allusions, illusions, proxies, irony, etc. It's endless.
Furthermore, the possibility that we create an AI that can outsmart humans in terms of filtering inappropriate content is even scarier. Do you really want a world with an AI censor of superhuman intelligence gatekeeping the means of content creation?
If you squint and view the modern corporation as a proxy for "an AI censor of superhuman intelligence gatekeeping the means of content creation" - then that's been happening for a long while now.
Automatic review of content, NSFW filters, spam filters, etc. have been bog-standard since the earliest days of the internet.
I don't think anyone likes it. Some fight it and create their own spaces that allow certain types of content. Most people accept it, though, and move on with their lives.
I'm down with calling a corporation intelligent (as long as you don't call it a person). But automatic content review is regularly bypassed, they can't even keep very obvious spam off YouTube comments, such as comments copied from real users, posted with usernames like ClickMyChannelForXXX.
So if the corporation is an intelligent collective, then it's regularly outsmarted by other intelligent collectives determined to bypass it.
We can look back further at the Hays Code. That's just religion, plain and simple. The feeling of "we're sliding into a decadence which will lead to the downfall of our civilization" is a meme propagating this very sentiment. It's not as simple as just the government, but that does co-occur.
Isn’t that basically what OpenAI and Google tried to do? And it lasted all of 3 months.
The problem with tech is that once it’s known to be possible, if you choose to monetize it by making it public, as OpenAI and Google were planning to do, then it’s only a matter of time before another smart team figures out how you’re doing it.
You can do the Manhattan Project in secret, and in 500 years someone else might not realize it’s possible. But the second you test that concept, the signs that you did it are detectable everywhere, and the dots of what you did will connect in someone’s brain somewhere.
In England you discover that English actually has two different existences… the ordinary one and then the “dirty” one. Almost any word has, or can be made to have, a “harmful” meaning…
"It's entirely pointless. Humans are vastly superior in their ability to subvert and corrupt. Even if you were able to catch regular "harmful" images humans would create a new categories of imagery which people would experience as "harmful", employ allusions, illusions, proxies, irony etc. It's endless."
This is employing a fallacy: that people have infinite amounts of energy and motivation to devote to being hateful. I have been in countless online communities, in video games and elsewhere, and when the chat in them doesn't allow you to say toxic, hateful stuff... guess what, a whole lot less of that shit is said. Are there people who get around it by swapping in look-alike characters that don't trigger the censor, or by using slang, or by misspelling? Of course. But the fact is, I think if you talked to someone who runs communities like this, they would laugh in your face if you said a degree of censorship of hate speech wasn't fundamentally beneficial.
A big aspect has to do with the fact that if everybody agrees to be part of a community, part of that agreement is a social contract not to use hate speech. If someone flouts that by bypassing the filter, then in the obvious flouting of the established social contract (it is obvious they had to purposely misspell the word), these people alienate themselves, underlining the fact that 99% of the community finds their behavior pathetic and unacceptable.
I (and, I would assume, the OP) agree that saying "entirely pointless" may be a bit hyperbolic.
However, the point stands: humans will find a way to exploit and corrupt any technology. This is unquestionably true.
Bertrand Russell famously makes exactly this point as well, albeit specifically about the violent application of technology in war: until all war is illegal, every technological development will be used for war.
Your point, however, is also true: in certain spaces for certain audiences (communities), participants make it more difficult to exploit these things in ways they don't want, and easier to exploit them in ways they do.
Ergo, technology is and remains neutral (as it has no will of its own), and the people using and implementing technology are very much not neutral, imbuing the tool with the will of the user.
The real question you should be asking is: how powerful can a free tool or piece of knowledge get before people start saying that only a certain class of "clerics" can use it, or before most communities agree that NO community should have it?
Notice, on that last point, how not-hard we're trying to get rid of nuclear weapons.
I don't think swearing in a video game is comparable to art.
If I swear at a video game and it comes out as ** I might think "OK, maybe I'm being a bit of an asshole, there could be kids here, and it's a community with rules, so I'd rather not say that".
If a tool to make art doesn't let me generate a nude because some American prude decided that I shouldn't, though... my reaction is going to be to fight the restriction in whatever way I'm able.
Importantly, we're posting on a forum where this exact idea holds. HN doesn't stop all hate speech, or flaming, and what have you... but the moderation system stops enough that people generally don't bother.
It seems pretty well-agreed that the HN moderation works because of dedicated human moderators and community guidelines etc.
I think spaces that effectively moderate AI art content will be successful (or not) based on these same factors.
It won't depend on some brittle technology for predicting if something is harmful or NSFW. (Which, incidentally, people will use to optimize/find NSFW content specifically, as they already do with Stable Diffusion).
But this is a forum of interaction between people. These models can and should do things privately. It's the difference between arguing for censorship in HN or Microsoft Word.
Sure, it would be a fool's errand to filter out "harmful" speech using traditional algorithms. But neural networks and beyond seem like exactly the kind of technology that can respond to fuzzy concepts rather than just sets of words. Sure, it will be a long hunt, but if it can learn to paint and recognize a myriad of visual concepts, it ought to be able to learn what we consider to be harmful.
One of the insurmountable problems, I think, is the fact that different people (and different cultures) consider different things 'harmful', and to varying degrees of harm, and what is considered harmful changes over time. What is harmful is also often context-dependent.
Complicating matters more is the fact that something being censored can be considered harmful as well. Religious messages would be a good example of this: Religion A thinks that Religion B is harmful, and vice versa. I doubt any 'neural network' can resolve that problem without the decision itself being harmful to some subset of people.
While I love the developments in machine learning/neural networks/etc. right now, I think it's a bit early to put that much faith in them (to the point where we think they can solve a problem like "ban all the harmful things").
>There's way too much moralizing from people who have no idea what's going on
>All the filter actually is is an object recognizer trained on genital images, and it can be turned off
I'm not sure if you misread something, but neither I nor the person I was replying to was talking about this specific implementation; we were speaking in a more general sense.
I'm pretty sure you are the one who missed the point of the parent post and mine.
It's not that simple. The model was not trained to recognize "harmful" actions such as blowjobs (although "bombing" and other atrocities of course are there).
The model was trained on eight specific body parts. If it doesn't see those, it doesn't fire. That's 100% of the job.
I see that you've managed to name things that you think aren't in the model. That's nice. That's not related to what this company did, though.
You seem to be confusing how you think a system like this might work with what this company clearly explained as what they did. This isn't hypothetical. You can just go to their webpage and look.
The NSFW filter on Stable Diffusion is simply an image body part recognizer run against the generated image. It has nothing to do with the prompt text at all.
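You can see this from code. A minimal sketch using the diffusers library (attribute names vary a bit across versions, so treat the details as assumptions); the point is that the checker only ever sees the decoded image:

    import torch
    from diffusers import StableDiffusionPipeline

    # The default pipeline bundles an image-based safety checker.
    pipe = StableDiffusionPipeline.from_pretrained(
        "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
    ).to("cuda")

    out = pipe("a photo of an astronaut riding a horse")
    # The checker ran on the finished image, never on the prompt string.
    print(out.nsfw_content_detected)  # e.g. [False]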
The company filtered LAION-5B based on undisclosed criteria. So what you are saying is actually irrelevant, as we do not know which pictures were included or not.
It is obvious to anyone who bothers to try (have you?) that a filter was placed here at the training level. Rare activities such as "kitesurfing" produce flawless, accurate pictures, whereas anything sexual or remotely lewd ("peeing") doesn't. This is a conscious decision by whoever produced this model.
Well, it ought to be possible to train it for a number of scenarios and then, at generation time, tell it to generate based on certain cultural sensibilities. It's not going to be perfect, but probably good enough?
Isn't this part of the AI alignment problem? To be able to understand what kinds of output are unacceptable for a certain audience? To be polite?
> Well, it ought to be possible to train it for a number of scenarios and then, at generation time, tell it to generate based on certain cultural sensibilities. It's not going to be perfect, but probably good enough?
Do we want the AI to generate based on Polanski's sensibilities, even if he's the only audience member? I suspect for most people the answer is no.
I find it very immoral too; it's like Islamists trying to prevent pictures of the prophet from being drawn. Not that I want to offend Muslims or make "harmful" content, but this notion that a specific type of content creation needs to be imposed is very, very problematic. Americans freak out about nudity all the time, something that is not considered harmful in many other places. The fear of images and text, and the mission to restrain them, is pathetic.
Anyway, it won't be possible to contain it. Better to spend the effort on how to deal with bad actors instead of trying to restrain the use of content creation tools.
Yeah, it's taking the impulse to control everything from our own mind and putting it into an artificial one. Seems to me a lot of our suffering is borne of that impulse.
OpenAI's filters are a total joke. I tried to upload The Creation of Adam (from the Sistine Chapel): blocked for adult content. "Continued violations may restrict your account". Yeah, it has naughty bits in it, but it's probably in the top ten most recognizable pieces of art ever made. I tried to generate an image of "yarn bombing": blocked for violence. They have the most advanced AI in the world and they can't solve the Scunthorpe problem?
They're not content filters as much as Doing Something filters. They're there to convince people that they're doing something, and of course if it wasn't zealous and regularly tut-tutted people for desiring a rubber duck, you wouldn't know they were doing something.
The reason why this is such a game changer is that it is not controlled on some central server. It's like saying paper and pencils can be revoked from people if somebody doesn't like what you do with them. It's an amazing new technology; let people use it.
Regardless of the practicality: why do they think it’s their role to be the morality police?
If there’s anything we’ve learned from history, it’s that we’ve always been morally wrong in some way, very often in our most strongly held beliefs. This AI in a different time would be strictly guided to produce pro-(Catholic Church/eugenics/slavery/racist/nationalist) content.
And the corporate creators are freaking out about the profanity. Microsoft's Tay wouldn't be remembered so fondly if Bill hadn't immediately pulled the plug when channers made her say the n-word.
> Regardless of the practicality: why do they think it’s their role to be the morality police?
It’s not just morality: there have reportedly already been multiple subreddits for non-consensual porn mimicking real people, and for underage porn. The legality of that is a minefield, but it doesn’t end there. If that’s what they become known for, it affects funding, hiring, people deciding whether to use their software, etc., and the more prominent that is, the more likely they’ll be hauled before legislators to talk about problems. Even simple things like legal demands to remove celebrities from the training sets could be pretty time-consuming.
Stable diffusion does run a filter on the output in its default configuration. Any image it deems 'unsafe' gets replaced with a picture of Rick Astley.
The thing about that is that it is open source, so you can trivially disable that filter if you like.
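For example, with the diffusers packaging of the model, the usual trick is to swap the checker for a no-op. This is a sketch; the (images, clip_input) signature is what recent versions pass, so check yours. In the reference scripts, the equivalent is commenting out the check_safety call:

    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")

    # Replace the bundled checker with a no-op that flags nothing as NSFW.
    def no_op_safety_checker(images, clip_input):
        return images, [False] * len(images)

    pipe.safety_checker = no_op_safety_checker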
Reminds me of a joke: three guys get locked up for a long time. Out of boredom they start telling jokes to each other, but as the supply is finite, they end up retelling them all the time. Eventually they number them, then just shout out e.g. "27", and they all laugh.
Then a new inmate joins, doesn't know what's going on but figures that if you say a number, people laugh. So he goes "14!". But nothing happens. The others tell him "you didn't tell the joke right".
How is the poor AI meant to know that jokes 6, 13 and 38 are sexist?
I was once a guest at a tech think tank, early 2000s; the people were all in their 60s at the time.
They spent years grappling with online worlds because of the idea that people might/could represent themselves as a different gender. They wanted the technology to exist and had dreamed about it for decades; they just got caught up on that.
That was comical because it was out of touch even for that time period.
It's interesting how people squirrel and spiral over useless things for some time.
Even in the 90's they had to fight hordes and hordes of Californian nutjobs (Dianne Feinstein et al.) who wanted to ban violent video games. These people would certainly be cancelled in today's world; they wouldn't stand a chance. Because, how dare you allow violence in video games to ...children!?
Our civilization depends on letting wackos do their thing, as far as it is within the limits of the law. Let them be offensive as fuck. These are the people that herald and propel society forward with their heterodox thinking. Without them, society is going to decay fast; it already is.
Yeah, definitely, when they started a studio in Dallas. I don't remember which congresspersons took a similar stance to Dianne's. During the 90's, progressives played a larger role, though. There was also the Mortal Kombat fiasco:
> During the U.S. Congressional hearing on video game violence, Democratic Party Senator Herb Kohl, working with Senator Joe Lieberman, attempted to illustrate why government regulation of video games was needed by showing clips from 1992's Mortal Kombat and Night Trap (another game featuring digitized actors).
> During the 90’s, progressives played a larger role though.
Could be true, maybe, but today conservatives have willingly taken over that seat, and the NRA is heavily involved and actively blaming video games after each mass shooting to deflect from the debate on gun rights. https://www.usgamer.net/articles/the-nras-long-incoherent-hi...
In terms of trying to moderate swearing and sexuality in games and music and movies, the religious right has long been and still is the group most vocally opposed to such free expression... if we’re talking about where to address censorship today.
Why does this matter? Regardless of the party, my original message stands. It is an irrelevant detail. Not sure what's causing defensiveness every time I bring up or criticize progressives. My bad, I only remembered Dianne Feinstein's name from the book, jeez.
Oh I thought you were suggesting we should stop censoring legal but offensive behavior? The issue of exactly who’s doing the censoring seems absolutely and completely relevant to the subject of censorship, no? If it’s irrelevant, then I don’t understand the point of your top comment. Why do we need to allow offensive wackos to do their thing, what offensive things are we talking about, and who needs to allow them?
Perhaps a more important discussion, if you do care about censorship, is to define more thoughtfully what you mean about “within the limits of the law”. In the US, the law, up to and including the constitution, makes clear that offensive behavior is anywhere from not protected free speech up to criminal activity. Politicians are debating what the limits of the law should be, and sometimes they blow hot air, and sometimes they write bills. Either way, the results of Congressional bills are establishing the limits of the law, and so define the acceptable legal bounds of offensive media & speech. Here’s one of the bi-partisan congressional sessions on games (it included Feinstein, among many others, but she didn’t testify). https://www.govinfo.gov/content/pkg/CHRG-109shrg28337/html/C...
In response to people jumping in to defend the progressives of the 90's: the amount of defensiveness that's invoked here on HN for stating the facts is quite alarming.
I really should have left out Dianne Feinstein and "California nutjobs" in the original post. This is what happens every single time you mistakenly poke HN when it comes to political one-sidedness.
The original Doom had "Italian cannibal film" levels of gore, heavily pixelated of course (not as if they had a choice in 1992), but such that you could see they were scans. Plus, of course, a lot of over-the-top satanic cliches to tick off the fundamentalists. But nothing remotely sexual; that's a bridge too far in the US.
Dianne Feinstein never attempted to control video games or Doom. She just said once, on April 3, 2013, that she was worried about the impact, and Fox News has been screaming her name ever since. She's never introduced any law about this at all.
Only one California politician has ever attempted to do much of anything to video games: Republican Joe Baca, who tried a dozen times and is mostly famous for his 2009 attempt to get a warning sentence on boxes. Calling that censorship is pearl clutching.
The only genuine attempts to do something an adult would consider censorship of video games were Jack Thompson's (a now-disbarred Republican) and that brief 2018 thing with Trump.
Democrats have never attempted to censor video games. All three major attempts were Republican.
It's important to get the details right if you are going to build an intuition of who's actually doing this
I’m sorry but none of what you said is true. At this point, the facts are indisputable. Check out my other replies that point to the congressional hearings.
I don't see the point. Idiots are fooled by far less convincing images.
Humanity has had the ability to lie with pictures since the invention of photography. The field of special effects can be described as lying about things that don't matter.
Without using Stable Diffusion, I can still photoshop an image or deepfake a video. Stable Diffusion isn't really changing what's possible here, and arguably is less advanced than what's possible with Deepfakes or even the facial filters available on social networks.
Like with all deceptive imagery: one just needs to use their noggin.
* Also I might add: the article is actually out of date on some aspects, because this technology is evolving so rapidly. Literally every day there is a new and interesting way that people are applying the tech.
It's no different than Google images, which is also voluntarily polite by default.
In both tools you can get naughty images, but you have to tell the tool that's okay.
This is not about censorship or moralizing.
It is just having the tool know when it's allowed to do that stuff. It's a key basic product feature if you're actually using the thing for content and not just having fun making pictures
Everyone acting like there's some kind of free speech issue should go into their account and turn the filter off, then try to calm down
It makes sense if the intent is to protect Midjourney from being blamed for misuse. If they saw the potential misuse yet chose to do nothing about it, they'd be blamed. Lack of perfect solution is not an excuse for not offering any protection.
I literally spent the whole first 3 hours figuring out ways to generate porn. They don't allow words like sex, cock, etc., so you use prompts like intercourse and phallus. At one point I thought they were screening for particular names, so you'd say things like "the brother of mako in the legend of korra" instead. It's just an endless game of cat and mouse, not worth putting effort into. Got bored; now I'm playing with the dev API. People have been showing how to integrate this into Photoshop and GIMP, and it's pretty cool.
The goal is to have a checkbox which keeps the system from generating naughty images in casual use.
This has absolutely nothing to do with censorship. It's a nonsense concept and it's not clear what you think censorship actually is.
If you set the system to make tall rectangles, are you censoring squares?
It's absolutely exhausting how people on HN attempt to cast any form of telling a tool what you want it to make as somehow morally governing something.
It's just telling the machine what to make
Not everything is a desperate ethical dilemma
Sometimes you just want the things you create to be straightforwardly usable
You understand that the filter is voluntary, and that the initial delay requirement (long gone) was about Discord adult image rules, right?
You're not just reflexively framing this as censorship where there was none, trusting HN to overreact when that word is abused, right?
Agreed. I'm also not sure how this is practically supposed to work if they really publish the entire model. Right now, all they do is design a specific license, right? Or are there certain safeguards built into the model itself?
That being said, I'd still think publishing the model (vs. keeping it as a closed-source API) is a good move. Otherwise, we'd move forward into a world where one of the most significant technological advancements must be gatekept forever, which I'd frankly find even more dystopian.
Well, it depends: are you talking about significantly mitigating harmful uses of Stable Diffusion or completely stopping them? The latter, of course, isn't going to happen, but there are plenty of practical things that can be done to mitigate.
If we can't even do this, how are we ever going to align AGI? I see these efforts as part of a nascent effort at alignment research (along with the more proximate reason, which is avoiding bad PR from model misuse).
Yeah, the best they can do is filters on top of the output. These models are complex enough that with some reverse engineering you can find "secret" languages to instruct them that would get around input filtering.
Devil's advocating: given they have trained it so well to generate images in spite of all expectations, is it really so hard to imagine that they could also train it to understand what images not to generate? It already had to learn not to generate things that don't make sense to humans. How does this not just amount to "moar training"? The hardest part is that the training data it would need is a gigantic store of objectionable (and illegal) content... probably not something many groups are eager to build and host.
The thing is that people can make harmful art themselves. Photoshopping people's faces onto nudes and depicting graphic violence have been a thing since digital photography, if not painting in general. I mean, look at all the gross stuff which is online and was online way before these neural networks.
The issue with these neural networks isn't the content they create; it's that they can create massive amounts of content, very easily. You can now do things like: write a Facebook crawler which photoshops people's photos onto nudes and sends those to their friends; send out mass phishing emails to old people with pictures of their grand-kids bloody or in hostage situations; send out so many deepfakes of an important person that nobody can tell whether any of their speeches is legitimate or not. You can also create content even if you have no graphic design skills, and create content impulsively, leading to more gross stuff online.
Spam, misinformation, phishing, and triggering language are already major issues. These models could make it 10x worse.
Where today it takes some far-from-Jesus deviant artist a whole day to draw a picture of Harry Potter making out with Draco Malfoy, with the power of AI, billions of such images will flood the Internet. There's just no way for a young person to resist that amount of gay energy. It's the apocalypse foretold by John the Revelator.
> It's the apocalypse foretold by John the Revelator.
I literally read a chapter of Inhibitor Phase where there's a ship called "John the Revelator" less than an hour ago. I haven't otherwise seen that phrase written down for years.
Spooky (and cue links to the Baader-Meinhof Wikipedia article).
> Spam, misinformation, phishing, and triggering language are already major issues. These models could make it 10x worse.
Or 10x better. The barriers to entry for doing this kind of thing right now aren't high enough to make it not happen; they are only high enough to make it sufficiently hard to pull off that people can feel comfortable assuming most of the content they see is legitimate. In a world where nothing is necessarily legitimate, I'd expect you'd see a massive shift in people's expectations.
After generating 5000 images with these tools, I believe the killer app will be the one that gives the artist the most control. I want a view and a scene, and to be able to manipulate both in real time.
Like,
View: 50mm film, wide-angle
Scene: rectangular room with window -> show preview
Scene: add table -> show preview
Scene: move table left -> show preview
Scene: add mug on table -> show preview
View: center on mug
Right now, there’s little control and it’s a lot of random guessing, “Hmm what happens if I add these two terms?”
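Purely hypothetical, but the kind of interface I mean might look like this; every class and method below is invented for illustration, with a stub where a diffusion backend would re-render the scene:

    from dataclasses import dataclass, field
    from typing import List, Optional, Tuple

    # Hypothetical front-end: it only records edit operations; a real system
    # would hand this scene graph to a diffusion backend on each preview().
    @dataclass
    class Scene:
        description: str
        ops: List[Tuple] = field(default_factory=list)

        def add(self, obj: str, on: Optional[str] = None):
            self.ops.append(("add", obj, on))

        def move(self, obj: str, dx: float = 0.0):
            self.ops.append(("move", obj, dx))

        def preview(self):
            # Stub: a real implementation would re-render the image here.
            print(f"render '{self.description}' with ops: {self.ops}")

    scene = Scene("rectangular room with window, 50mm film, wide-angle")
    scene.add("table")
    scene.preview()
    scene.move("table", dx=-1.0)
    scene.add("mug", on="table")
    scene.preview()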
Have you seen the img2img results? You draw kind of a crappy Microsoft Paint style image, give it some text for how you want it to actually look, and it does the transformation.
Natural language alone is one of the worst ways to control image generation. The model knows how to generate anything, but its own "language" is nothing like yours. It's like writing in Finnish, twisting it in such a way that it would yield coherent Chinese poems after Google Translate. You will end up inserting various garbage into your input and still not getting the result you like. img2img gives much better results because you can express your intent with higher-order tools than textual input alone.
What would be best is to properly integrate models like that into some painting software like Krita. Imagine a brush that only affects freckles, blue teapots, fingers, or sharp corners. (or any other thing in a prompt) Or a brush that learns your personal style and transfers it onto a rough sketch you make, speeding up the process. Many possibilities.
I think they are already making an img2img plugin for Photoshop. Watch the demo; it's kind of impressive. [0] It's just a rudimentary prototype of what's possible with a properly trained model, but it already looks like a drop-in replacement for photobashing (as an example).
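For anyone who wants to try the img2img workflow directly, here's a minimal sketch using the diffusers img2img pipeline (the init-image keyword has been renamed across versions, and the file names are placeholders):

    import torch
    from PIL import Image
    from diffusers import StableDiffusionImg2ImgPipeline

    pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
        "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
    ).to("cuda")

    # Start from a crude MSPaint-style sketch and describe the target look.
    sketch = Image.open("rough_sketch.png").convert("RGB").resize((512, 512))
    result = pipe(
        prompt="a wooden table with a blue ceramic mug, soft window light",
        image=sketch,       # steers composition ("init_image" in older versions)
        strength=0.6,       # 0 = keep the sketch as-is, 1 = ignore it entirely
        guidance_scale=7.5,
    ).images[0]
    result.save("refined.png")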
It's all about generation time. If generation was faster, the UI could preemptively show you a lot of variations based on suggested keywords. And also you could click things and get immediate results.
Currently it takes my mid-range PC (2070 Super) 10 seconds per image, which is too slow. You would need to get generation time below 1 second to be really productive. I guess you can already achieve that with something like triple 3090s?
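One knob that helps in the meantime, assuming a diffusers-style pipeline already loaded as pipe: latency scales roughly linearly with the number of denoising steps, so you can preview cheaply and only pay full price for keepers (a sketch; defaults vary by version):

    prompt = "a cozy reading nook, warm light, detailed illustration"

    # ~50 steps is a typical default; ~20 is often enough for a rough preview.
    draft = pipe(prompt, num_inference_steps=20).images[0]   # fast triage
    final = pipe(prompt, num_inference_steps=50).images[0]   # full quality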
I think the ideal UX will be the ability to markup images with little comments and have it adapt accordingly. The prompt interface is bad. One of the biggest reasons being that you have virtually no control on the spatial aspect of your additions. Being able to say "add an elephant here and remove this lamp" will be big. Being able to do so with a doodle of an elephant to suggest posing will be even better.
Reminds me of the holodeck scene where Picard (edit: Geordi) reconstructs a table with what I, at the time, thought was a pretty vague set of specifications.
Turns out Star Trek predicted 2020's-style AI behaviour rather well. Considering nuclear war is then due in 2026, that's disconcerting.
An odd one, that. After all the lore (geddit?) about Data and his brother being unique and special for their unrivalled artificial intelligence, it turned out all you have to do to exceed that is just vaguely ask a standard-issue ship computer to do so.
I think the size of the Enterprise and its fusion reactor give it quite an unfair advantage. Was Data really supposed to be smarter than the Enterprise, especially when it can read Data's mind state in order to fulfill the prompt?
I suppose the EMH is (or at least was, pre-mobile emitter from the future) a thin client for the Voyager computer.
Still seems odd that apparently only Data, Moriarty, and the Doctor have demonstrated that the Federation actually can make pretty general AI with the tools it already has on starships (and conveniently always on the ship with all those film crews on it making the Historical Records).
Surely under the crust of some demon-class planet there's a bank with millions of times that power being used for... something.
There's probably a rule against making AI that you're allowed to break in the delta quadrant though.
There’s no direct canon confirmation, but it seems quite plausible that it was, in fact, the Bynars who provided the technological leaps necessary for the Enterprise computer to generate Moriarty and other proto-sentient characters. Riker and Picard both comment on the realism and perception of Minuet, created by the Bynars on the holodeck after their upgrades.
And there is a direct canon line from Moriarty through to the EMH and later sentient holograms via Lt. Barclay.
I make tools for artists and am afraid to incorporate AI generation, because I am pretty sure everyone would then discount work created with my tool, assuming all of it was AI generated, and then no artists would want to use it.
What I am actually leaning towards is a tool for users to "enhance" art with AI, but only if the artist allows it.
All this harmful-this and unsafe-that... I don't get it. What is the goal here? Are they trying to make it resistant to any "vulgar" inputs (I put "vulgar" in quotes because I can 100% see how they may consider strongly political statements "vulgar" too)? Or to prevent it from producing pornography? Or CP? The last one should be fairly straightforward (kid + nudity = get it out), but the first two are very broad limitations. I understand wanting to make it not super racist, but with pornography, I'm not sure it has a point. Even more so given that erotic art is very common in the real world, and I'm pretty sure this model (and others like it) can't distinguish between erotic art and pornography (which you can't blame it for, given that the legal standard seems to be "I recognize it when I see it"). That means it can only ever produce SFW imagery scrubbed of any "risque" factors or themes, because the authors didn't want to deal with any kind of erotic art due to this impossibility of separating erotic art from pornography.
If the point of all of these models is to get to something resembling an artist, then why intentionally kneecap it from the start and prevent it from producing art?
> All these harmful this and unsafe that... I don't get it. What is the goal here?
I'm not sure, but they mention biases. So I imagine one thing they want to avoid is that you ask for a drawing of a "criminal" and 90% of the images are of people of color. It should be possible to minimize these biases if you review the dataset, at least for certain keywords.
I have been playing around with it using ROCm and a 6900 XT; it makes a good alternative to DALL-E. They have different strengths: DALL-E seems better at lighting instructions and cityscapes, but Stable Diffusion is better at sketches.
Also, you can fine tune it on whatever you want which is awesome.
One interesting effect I have noticed in myself, though, is that after staring at DALL-E or Stable Diffusion generated images for a long time and then viewing "real" media, I get the same sense of wrongness, that the output is not quite right, for a while. It's like my brain has been tweaking its processing to prefer the AI art as the ground truth!
That's funny; for me DALL-E 2 is in practice miles ahead on pencil sketches, but Stable Diffusion is cool because the parameters can be customized, which helps with many phrases. Also, you can just leave it running and producing images for an hour.
Also, there's no content filtering, but I don't recommend playing around with that if you're sensitive. The lifeless husks and various mixes of body parts I got with fairly benign phrases could very well be used for a horror movie.
It might be that I haven't yet found the right phrase for Stable Diffusion pencil sketches, though; for DALL-E 2 it's just "<describe what you want>, artstation, pencil sketch, 4k" to generate consistently great pictures.
4chan is having a field day with AI generated porn of celebrities (often with ridiculous prompts), selecting the most unsettling results. One of Billie Eilish looks like some kind of orphaned shoggoth/succubus hybrid that just made its first attempt at luring in someone for a meal: "You like human females, yes?" Cataract eyes, aggressive-lobotomy mouth; it forgot to pay attention to shoulders and didn't know spines existed or what they were for. Or a second attempt, this time at Bjork, suggesting some kind of lost hominid which consumed only melons in a predator-rich environment.
The first links are most recent. You can see the progress I've been making as I learn to do better prompt engineering and iterate on existing images by using img2img. The future is here...
So, can someone explain the license to me? I read it and it seems very reasonable. It excludes most of the bad use cases, and doesn't restrict interesting but controversial use cases too much. However,
> To generate or disseminate verifiably false information and/or content with the purpose of harming others;
How do they define that? So I can generate and disseminate false information without the purpose of harming others, just for fun? And what if I believe I am not harming others but helping them? Can I generate fakes to further my political cause, if I'm convinced it is a "good" cause? And what about Popper's paradox? If I prevent people from harming others, I am still harming the would-be harmers. I feel they are opening a can of worms here.
Also, bad actors will just ignore the license. There is a piece of code that censors obscene generations; you could just comment it out. I feel the license and that filter are not going to stop anybody, but are mostly there for good publicity and so they can wash their hands in innocence...
I’ve generated over 1000 images in the last 48 hours. It’s better and faster than using DALL-E; I can literally just leave a prompt churning away in the background for the same cost as playing a high-end videogame and check on the results when I want.
Honestly, if I were a commercial concept artist or illustrator without a signature style, I’d be really worried. We’re truly going to see the power of this tech as a tool now that it’s not gatekept.
> Honestly, if I were a commercial concept artist or illustrator without a signature style, I’d be really worried.
The prices people pay for any kind of picture art are about to take a nosedive. Stock art websites are going to be hit hard, any kind of graphics artist, any kind of commissioned artist. I wonder if (human) models will be taking a pay cut as a result.
I heard Midjourney is adding extra prompts to anything you submit, which give it the signature style it has. Pretty sure you could get the same style out of SD if you knew what to add.
While it's a huge win that it's open source, I find the results consistently inferior to Midjourney (and DALL-E).
I tried to generate some artistic results with a variety of prompts, and Midjourney always won hands down.
But of course, since it's open source, many community tweaks and Colab notebooks/forks will probably put it on par with DALL-E in time. But I have trouble imagining Stable Diffusion competing against Midjourney anytime soon: the difference is night and day.
If the beta they introduced briefly for a few days was SD, I have to add that it always, 100% of the time, produced much inferior results compared to the MJ "v3".
It was so bad that if they'd replaced v3 with it (good thing they didn't), I'd probably have stopped using MJ and cancelled my subscription.
The Stable Diffusion model will replace the v3 model soon, as it's actually superior in coherence and details (I don't know how you can say otherwise), but the v3 model will still be available, just as v2 is.
"Thousands of people," you're the only person I've seen on the Internet screaming about how horrible the beta was. You can go check out the Midjourney subreddit and you'll see that people really like it.
In fact, the beta has since returned. The team is just fine-tuning the pre- and post-processing pipeline, and then the new model will be ready for use.
Well, I spend hours in the actual Discord, and many people complain there. The SD-based beta is simply not artistic enough for most folks; it turns MJ into something more like DALL-E. It's not surprising that people generally post the good results on the sub, not the failed attempts.
I was not talking about good results, but about people's reactions. The artistic vibes that the v3 model offers are nothing but clever pre- and post-processing; when the SD-based model comes out, it will give you similar vibes.
The row of three pics near the top of the article is from a new model they have in the works that uses SD under the hood. It was available briefly as a test run. MJ’s additional magic over SD gets even better results.
> But global paradigm shifts aren’t pleasurable for everyone. As I explained in my latest article on AI art, “How Today's AI Art Debate Will Shape the Creative Landscape of the 21st Century,” we’re getting into a situation—now accelerated with the open-source nature of the model—that’s extremely complex. Artists and other creative professionals are raising concerns and not without reason. Many will lose their jobs, unable to compete with the new apps. Companies like OpenAI, Midjourney, and Stability.ai, although superpowered by the work of many creative workers, haven’t retributed them in any way. And AI users are standing on their shoulders, but without asking for permission first.
> As I argued there, AI art models like Stable Diffusion pertain to a new category of tools and should be understood with new frameworks of thought adapted to the new realities we’re living in. We can’t simply make analogies or parallelisms with other epochs and expect to be able to explain or predict what it’s going to happen accurately. Some things will be similar and others won’t. We have to treat this impending future as uncharted territory.
I wonder if we'll also talk about "conversations", "complex situations" and "the need to treat this as uncharted territory" when some Copilot/GPT3 successor a few years down the line spits out entire production-ready software stacks off the prompt "like Facebook only better" - using our own code as training data.
This prompt is unspecific to the point of unusability. Even if this works some day, the spec used will be a lot more detailed, in higher-level pseudocode style.
True, but as you can see with image generators, they can happily work off extremely underspecified prompts; they will just use their own priors to fill the gaps.
There will absolutely be prompt engineering and I agree that actual, serious prompts will be much more specific than that.
I don't think the prompts will necessarily be pseudocode-style. Depending on what trainsets are available, I could imagine we'll have some high-level description of desired features in addition to lots of specifiers which narrow down the specific languages, design patterns, tools etc which should be used in the resulting codebase.
You can already use similar prompts with Copilot today by disguising them as comments.
What you're describing is essentially a DSL. We can ruminate on the exact level of detail that will be needed, but in my opinion it will by necessity be a lot higher than what most commenters in these threads are imagining. OpenAPI specs are already quite complex and verbose, and it's not exactly clear to me that we can do a lot better when describing APIs, whether web ones or not.
I'm just a run of the mill software engineer (mostly webdev).
I never cared about ML or data science.
I'd been playing with DALL-E the past few weeks after getting beta access, but it's too limited/meh, and I soon ran out of credits.
Then DreamStudio (SD SaaS) launched to the public, and I was blown away.
Then I tried to run txt2img on my Mac, which I did, but it's too cumbersome/slow.
Then I found out about Replicate, which also exposes an API to interact with and run the models.
I've been having fun with it since then, building some scripts with Playwright and doing generative art with Stable Diffusion. I'm no artist, but it's so much fun, and the results are so visually pleasing, that I can't help but pursue the urge to explore this.
I will be starting an anon account on Twitter and trying to sell some of my art as NFTs; we'll see where it gets me.
Just ordered a card to get around the NSFW filters (they're nonsense and flag some random stuff).
If you want to try it, the easiest way is DreamStudio or replicate.com.
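The Replicate route is only a few lines with their Python client. This is a sketch: the version hash is a placeholder you'd copy from the model page, and REPLICATE_API_TOKEN must be set in your environment:

    import replicate  # pip install replicate

    # The version hash below is a placeholder; copy the current one from
    # https://replicate.com/stability-ai/stable-diffusion
    output = replicate.run(
        "stability-ai/stable-diffusion:<version-hash>",
        input={"prompt": "an astronaut lounging in a tropical resort, vaporwave"},
    )
    print(output)  # typically a list of generated image URLs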
If a computer model could produce the world's best porn, would that be a good or bad thing? Many harmful effects of porn would be amplified, but it would reduce the exploitation of real people in the industry. A moral question society will soon face I think.
I think the bad things would be targeted use of realistic images: for example, imagine the horrible things some people experience in school multiplied by “leaked” photos, or someone’s abusive ex distributing “proof” of their infidelity / unsuitability to have custody, etc. There’s a theory that over time people would stop believing everything they see but there’s still plenty of time for millions of tragedies before that happens, if it ever does. Forensics is going to be a growth industry.
Yeah we need Trigger Warnings on any ML repo that uses Python. The trauma that comes from dealing with Python dependency management is really hurtful and non-inclusive.
Which I don't think is the end of the world. I can draw horrid things on my iPad in Procreate or even just a pencil on paper. What's new here is ease of access and hyper realism. This is more a problem for fake news than just generating bad/shock images which were already possible.
It isn't going to impress the person in the street until it actually follows your instructions. I tried several times to express "a tall three-legged stool" but even with the "CFG" (how much the image will be like your prompt) at max, it gave me stools with four or ultimately, two legs. Also tried "a four-legged spider" (don't ask) and got first an eight-legged spider, and next, a spider with eight legs, but four of them were blurred. Sure, dumb, pedestrian requests, no imagination, but a five-year-old would quickly get impatient with its inability to follow simple directions.
You could sketch a three legged stool in MSPaint and use img2img.
I think looking at text prompts as an essential part of the technology is very limiting. You could be using a mix of text, storyboards, images of objects to place inside it, sketches of the desired layout, etc… in fact why not replace your game's renderer with it?
I am gonna ask the weird question, but why are we trying to prevent generating CP content?
1. Sharing it publicly is still considered highly illegal.
2. The biggest problem with everything CP-related is that to produce this kind of content, real children were used, tortured, and often killed.
Should we not allow the model to generate CP content in order to reduce the number of real children being hurt/killed?
Plus, this of course would not prevent the authorities from tracking who is sharing this kind of content.
It'll be interesting to see what happens when a copyright troll ( https://doctorow.medium.com/a-bug-in-early-creative-commons-... ) realizes that they can acquire the rights to models distributed under these vague-as-fog moral panic licenses, or distribute their own and have people actually use them, and start extracting rents.
These licenses will do little to nothing to stop abuse: The abusers will already conceal their identities because their actions are immoral or even illegal (fraud, harassment, etc). But they create a whole host of new liabilities for the users because the definitions are exceedingly subjective.
It's tremendously important to make these tools actually open. But open with a lurking liability bomb stops short of the goal. While stability.ai may never turn into a troll or sell their rights to one, that isn't necessarily true for the next model that comes around.
That raises the question: what is the economic licensing for Stable Diffusion et al.? Can I download it and set it up, and then charge people money to run it?
I am curious about the nature of the output being rasterised bitmaps. I would have expected that it would be easier for a model to generate output based on primitives that it learned as geometric shapes with spatial relationships (what an "arm" looks like as a shape, etc.). I would like to know whether the model has a layer that represents these and then effectively "renders" them as rasterised images, or whether it is really computing at the level of pixels. So far I have not seen anything other than rasterised pixels.
I guess it matters because most of these images are unusable for further purposes: they can't really be edited and touched up easily to fix all the flaws or do the final adaptation. Are there any options that generate the images in something like vector art, which would facilitate the downstream finishing process, rather than fully rasterised bitmaps?
Is it possible to generate AI art but also provide a list of citations (maybe weighted) as metadata to help figure which original images most contributed to the generated image?
I feel this would help a lot with giving artists more credit with the AI art outputs.
Very exciting to have a brand new technology that you can see quickly advancing and branching out into all kinds of things every week. I wonder how long this current run is going to last and where it ends up.
I'm very impressed. The first sentence I tried was something like "a half submerged archimedes screw ship plowing through sea ice", and it had some pretty good ideas of what that might look like.
My only gripe is the usual one in the AI field: very sloppy nomenclature.
Reading about diffusion models I first expected a novel parametrized family of functions, otherwise known as an "architecture".
Instead it seems more like a training method, so a nomenclature of "diffusion training" would seem more apt.
The model runs on just about anything; it's just a question of how fast. My Intel i7 CPU can use OpenVINO to generate a standard image in about 25 minutes. The M1 can do it in ~45 seconds. For comparison, modern GPUs take about 10 seconds.
One oddity for me (and I haven't played with a lot of AI art, so maybe this is normal): every time I try to describe a person, it generates like four to seven different faces.
Ha! Generative art is an image search engine with fancy interpolation. Would it be tractable to find a list of nearest training examples? Then you could cite the stolen art. Imagine that as a Twitter bot.
I think these AI art tools are great for finally enabling the masses to unleash their creativity without having to have true art skills. It’s like the equivalent of “no-code” style platforms, except because this is art, we can be a lot more forgiving if the results aren’t perfect. No need for “artists” to have a monopoly on artwork, we’re all artists.