
Here is my perspective on these kinds of images. This kind of 'picture' usually comes from speed-painters who incorporate techniques like photobashing: integrating 3D models and real-life photos into their composition, or painting over a 3D render entirely.

It was already a genre that heavily incorporated computer-assisted methods. There is a lot of doom and gloom going around, but honestly, the modern process of creating 'concept art' was already extremely commodified and efficient. These weren't exactly your idealized vision of some artisan craftsman laboring for weeks over a picture; artists churned this stuff out in a few hours (if that).



I think you're not grasping the magnitude of the change. Creating even an average-quality speed-painting requires a tremendous amount of expertise in drawing, painting techniques, composition, lighting, and perspective. It takes years of training.

These models let anyone achieve similar results in minutes, without any prior learning. It is not even lowering the bar, it is literally dropping the bar to the ground.

Besides, Stable Diffusion is able to generate not only painterly scenery but also photographic images that are almost indistinguishable from actual pictures (certainly helped by the fact that images have a heavy digital look in our era).


I agree. People aren't grasping the magnitude because they're thinking about jobs. Jobs are a silly way to measure this. Jobs are temporary. Nobody worries about the mechanical stocking frame making socks anymore.

This is more like the literacy/printing press transition.

Used to be, people had to learn to memorize a lifetime of stories and lore. Now nobody learns to make a memory palace or form a mnemonic couplet. Why would you bother? You just write things down.

Today, people learn to draw. In a generation, why would you bother?

There will still be specialist jobs for people generating images, but instead of learning to make them up, the specialists will be very good at picking them, suggesting them, consuming them.

Humans will be the managers and the editors, not the creators.

The same thing will happen to other arts. First (and very easily) to music. Eventually, perhaps, to writing and whole movies...

The only thing stopping that is that the models can't maintain a reality between frames. They can't make an arc. It's all dreamlike.

If we find a way to nail object persistence it will be a singularity-level event. The moment you can say "make another version of this movie, but I want Edgar to be more sarcastic and Lisa should break up with him in the second act" we will close the feedback loop.

It's a lot bigger than "lost jobs".


I mean it sounds pretty cool to be able to fork a film and create different iterations and mashups. Maybe if you create a cool enough scene the director will merge your PR back in.


> It's a lot bigger than "lost jobs".

I agree. It is more than just "lost jobs" for illustrators doing artist's impressions, courtroom sketch artists, etc. It is a complete dystopia, and it doesn't help artists at all but displaces them. At least actual paintings will be more valuable than the abundance of this machine-generated digital rubbish.

So given that technologists have so-called 'democratized' and cheapened digital art, I really can't wait until we get an open-source version of Copilot that can create full programs, apps, and full-stack websites with no code, so that we see very cheap Copilot AI shops in lower-cost parts of the world generating software that effectively eliminates the need for a senior full-stack engineer.

An easy, cheap business solution for the majority of engineering managers on a tight budget who know they need to offshore tech jobs: no skill required, since the work is offloaded to cheap Copilot prompters.

We would have no problem with that and would be happy with that dystopia. Wouldn't we?


People can already use websites to create simple websites, but that hasn't really displaced web developers, because our needs keep changing. It has definitely helped people bring their businesses to market much more easily, though: you don't need a whole lot of IT knowledge anymore to start an online clothing business. That definitely means jobs have been displaced, but in reality they tend to move rather than simply disappear.

Being an artist isn't really about being able to draw well; it's about doing a lot more than that in harmony. So I believe these tools will simply get incorporated, new kinds of artists will appear, and older artists will adapt.

My only worry with this, and it's not something I see pointed out much, is that because these models produce art from previous art they've seen, we might find it difficult to come up with genuinely novel styles. But then again, that might be precisely a new avenue for human artistic expression.


Not disagreeing, just wanted to point out that this is already happening in some niches. It used to be that you had to hire someone to make you a webpage, and they had to use PHP or whatever. Then came WordPress and themes, and you could have your page made by some youngster for peanuts.

But I think society will find a way. Who knows, maybe we'll all work less and enjoy life more? One can hope.


Speed painting also usually involves practicing certain scenes. Anyone can use this method to create any new scene they can imagine, and with some patience the result looks quite good. Some people seem overly pessimistic, but to me it looks like we're on the cusp of something truly disruptive in the arts space. And it's not NFTs. Remember that a year ago this would have sounded mostly like sci-fi unless you were following cutting-edge research.

In the realm of "real" art I'm actually very excited, since I believe there are hundreds of very imaginative and patient people who just can't paint well but will be able to create new art with tools like this. It can also synthesize new and alien things.


> Anyone can use this method to create any new scene they can imagine, and with some patience the result looks quite good. Some people seem overly pessimistic, but to me it looks like we're on the cusp of something truly disruptive in the arts space.

A race to the bottom and the cheapening of 'art' in general for the sake of replacing artists is a shame to see and nothing to celebrate. I was against the gatekeeping of both GPT-3 and DALL-E by Open 'faux' AI. But now it seems that every time an open-source alternative is released into the wild, the uses become even more dystopian, especially with deepfakes, fake-news propaganda, and catfishing with generated hyperrealistic faces.

> And it's not NFTs. Remember that a year ago this would have sounded mostly like sci-fi unless you were following cutting-edge research.

Stable Diffusion is the reason JPEG NFTs will always be worthless; models like it will drive JPEG NFT prices down to a floor of zero. Just as NFT proponents cheered, believing they would help artists, here we are seeing DALL-E 2 and Stable Diffusion fans insisting these will help artists. No, they will not.

> In the realm of "real" art I'm actually very excited, since I believe there are hundreds of very imaginative and patient people who just can't paint well but will be able to create new art with tools like this. It can also synthesize new and alien things.

This isn't the 'democratization of digital art'; it is the complete devaluation and displacement of digital artists, and it now makes 'real' paintings much more valuable.

A dystopian creation.


> I was against the gatekeeping of both GPT-3 and DALL-E by Open 'faux' AI. But now it seems that every time an open-source alternative is released into the wild, the uses become even more dystopian;

So are you still against gatekeeping? Are you in favor of releasing AI advances into the wild?


I am still against OpenAI's gatekeeping, and I gave AI itself a chance to be used more for good and to be significantly less dystopian.

Even with the release of GPT-3, there seems to be very little good coming out of such a system, despite it being generally underwhelming at generating convincing sentences. With DALL-E 2, however, the technology got much better, and for the worse, at digital images, to the point where even gatekeeping it would just spur an open-source competitor to supersede DALL-E 2 anyway.

But it was the release of Stable Diffusion that did it for me, when most of the people hyping it here just want to aid the race to the bottom while at the same time insisting that it will help artists, when (like NFTs) it won't.

So looking at both DALL-E and Stable Diffusion, each is yet another contribution that advances the dystopian AI industry, which will just be used for fake news, surveillance, and catfishing. The worst part is that they haven't built any detectors for this.


> So looking at both DALL-E and Stable Diffusion, each is yet another contribution that advances the dystopian AI industry

Given that there are more powerful models that have already been developed, should they be gatekept or released?


Rather than ignoring the conditions I mentioned, read what I said again:

> I am still against OpenAI's gatekeeping, and I gave AI itself a chance to be used more for good and to be significantly less dystopian.

> The worst part is that they haven't built any detectors for this.

So it is neither. If a given AI project has no detector, or no clear indicator that an output was generated by an AI, then the whole project should be scrapped, cancelled, or postponed until it has one. It is that simple. And no, DALL-E 2's tiny watermark doesn't count.

'AI researchers' know the dystopian scam they are creating, and they know they need detectors and analyzers to significantly reduce the risk of malicious use. So it doesn't matter whether more powerful models exist; the conditions are still the same.


> it is literally dropping the bar to the ground

I think you're directionally correct, but overstating the case in a few ways.

One, as a not particularly visual person, even this example involves some skills of composition and perspective. If you asked me to do something practical, like creating an illustration to go at the top of a blog post, I would not do nearly as well as somebody with art skills, and I would take a lot longer.

Two, this is the beginning. In the same way that digital artists took tools I could use and got really good at them, I expect the same will happen here. What will a good artist be able to do with a solid workflow and a few years of picking up tricks? Given the opaqueness and quirkiness of models, I expect a person who puts in the time, especially one with a strong command of art styles, composition, and the practical uses of visuals, will be able to run rings around me.

Three, people are quite accepting of AI images right now, but they're novel and exciting and decontextualized from how we normally use images. That's a playing field that advantages the novice. But what happens once these images are no longer fun and novel, but boring and overdone? As we learn to discern novice-grade work from what real artists can do with AI assistance, I think our bar as image consumers will rise.


Spending a lifetime learning img2img and taking weeks to create a single artwork will always beat someone without experience who creates an artwork in an hour. There will be only a handful of people who put in the time to become true masters of img2img. Everything always comes down to how much actual time someone is willing to put in. No matter how advanced the tools become, there is always a learning curve to mastery, which only a few people are willing or able to climb.


But it didn't drop the bar to the ground; it raised a new bar. People without computer literacy and/or basic programming skills won't be able to pull this off. Even using Photoshop (which I believe does or will integrate this new technology) is not easy, or even possible, for some people who can actually draw. Plus, how many regular people have access to a machine with 12GB of VRAM?


The method shown in this demo was already simple enough to teach someone to do in an afternoon. But we're only a week into the release of SD and we haven't gotten to all the sure-to-come GUIs that will pack the model into an idiot-proof application.

Think about it: give the user a few basic, MS-Paint-level pencil tools, colors, and shape makers. Ask for a description; the application can even push you in the right direction for putting together good, detailed prompts, giving you a list of art styles, artists, filtering methods, etc., all with reference images so you don't need to memorize names. You can zoom into sections of the image to work on them independently (like the birds in the article), then blend them back into the greater image. Drag and drop image files onto the project and iterate on them.

Implementing the glue to simplify the "tough" parts of this process is honestly pretty trivial.
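To make that concrete, here's a minimal sketch of such a wrapper, assuming the Hugging Face diffusers and gradio Python packages and a CUDA GPU; the model ID is the public SD v1.4 checkpoint, and the style list is made up purely for illustration:

    import gradio as gr
    import torch
    from diffusers import StableDiffusionPipeline

    # Load the public Stable Diffusion v1.4 checkpoint in half precision.
    pipe = StableDiffusionPipeline.from_pretrained(
        "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
    ).to("cuda")

    # Canned style keywords so the user never has to memorize artist or style names.
    STYLES = {
        "Oil painting": "oil painting, detailed brushwork",
        "Watercolor": "watercolor, soft washes",
        "Pencil sketch": "pencil sketch, black and white",
    }

    def generate(description, style):
        # Append the chosen style's keywords behind the user's plain description.
        prompt = f"{description}, {STYLES[style]}"
        return pipe(prompt).images[0]

    gr.Interface(
        fn=generate,
        inputs=[gr.Textbox(label="Describe your image"),
                gr.Dropdown(list(STYLES), value="Oil painting", label="Style")],
        outputs=gr.Image(label="Result"),
    ).launch()

A real app would add the sketch pad and img2img pieces on top, but the point stands: the glue is a few dozen lines.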


The parts that are inaccessible right now seem incredibly easy to overcome.

Using a CLI-based tool is inaccessible for most people... but building a GUI around this would be very easy. I'm too lazy to google it, but I would bet someone already has a GUI, or is working on one.

12GB of VRAM may not be accessible on most computers, but there's nothing innovative about offloading that task to an EC2 instance. It just requires an opportunistic developer to tie the pieces together.
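As a sketch of that split, here's a hypothetical thin client that posts a prompt to a GPU instance running the model behind a small HTTP endpoint; the URL and JSON shape are invented for illustration:

    import base64
    import requests

    def generate_remote(prompt: str) -> bytes:
        # The GPU box is assumed to expose a simple /generate endpoint
        # that returns the finished image as base64.
        resp = requests.post(
            "https://gpu-box.example.com/generate",
            json={"prompt": prompt, "steps": 50},
            timeout=120,
        )
        resp.raise_for_status()
        return base64.b64decode(resp.json()["image_base64"])

    with open("out.png", "wb") as f:
        f.write(generate_remote("an astronaut riding a horse, watercolor"))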

I would be monumentally surprised if Figma/Canva/InVision/Adobe are not already working on this.


If you're on Windows and have a GPU, there's a GUI you can install: https://www.reddit.com/r/StableDiffusion/comments/x1hp4u/my_...

There's a web UI with a Docker container if you're on Linux with a GPU: https://github.com/AbdBarho/stable-diffusion-webui and https://github.com/AbdBarho/stable-diffusion-webui-docker.

If you don't have a GPU, there's a Colab UI (Google hosted GPU). https://github.com/pinilpypinilpy/sd-webui-colab-simplified


> Using a CLI-based tool is inaccessible for most people...

CLI-based tools are perfectly accessible to most people.

They just can't be arsed to learn them, unless they need to. And most of the time, they don't, because good-enough alternatives exist.

If a CLI-based tool is the only way that an average person can get their work done, that's what they'll use.


> Plus, how many regular people have access to a machine with 12GB of VRAM?

Probably not many in general, but the RTX 3060 has 12GB of RAM and costs around $350. And I saw an RTX 2060 12GB for $250 the other day. That's a pretty reasonable entry fee, IMO.


Much less than what Photoshop cost back in the day.


Few have those high-end cards, but they don't need them anymore. Hugging Face says it needs 12GB, but the source was forked with some smart mods that chunk the loading onto the GPU.

It'll comfortably run on 6GB now. GTX 1600-series cards need to run in full-precision mode to produce output. The HLKY fork has an improved Gradio GUI and integrates Real-ESRGAN and GFPGAN for those with beefier cards.

Someone else also figured out how to load and run it all on a CPU, so pretty much anyone can in theory run the model now.
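For a concrete picture of those options, here's a minimal sketch using the Hugging Face diffusers package (the exact flags in the HLKY fork differ, so treat this as illustrative): fp16 plus attention slicing for small GPUs, or a slow full-precision run on the CPU.

    import torch
    from diffusers import StableDiffusionPipeline

    model_id = "CompVis/stable-diffusion-v1-4"

    if torch.cuda.is_available():
        # Half precision + attention slicing brings VRAM use down to roughly 6GB.
        pipe = StableDiffusionPipeline.from_pretrained(
            model_id, torch_dtype=torch.float16
        ).to("cuda")
        pipe.enable_attention_slicing()
    else:
        # CPU fallback: full precision, far slower, but runs on ordinary hardware.
        pipe = StableDiffusionPipeline.from_pretrained(model_id).to("cpu")

    image = pipe("a lighthouse on a cliff at sunset, oil painting").images[0]
    image.save("out.png")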

There is an elaborate Colab notebook linked in the HLKY repo that seems to get more point-and-click friendly every time I look at it. I think it even launches the Gradio web UI, so you can use the Colab instance remotely.


You can run the model on your CPU quite easily, and a lot of people have access to 16GB machines. It's much slower, for sure (10 minutes for 50 passes on my old gaming PC), but it's still much faster than drawing things of the same quality by hand.


It dropped the bar to the other side of the planet. There are so many computer-literate people who can pull this off. You could pick 5 people off the street and 3 of them could follow these instructions. Versus the old way, where you could pull 400 people off the street and probably wouldn't get this result unless you got really lucky.


Pick 5 people at random and you get one who doesn't know what a mouse is, 2 or 3 who can turn on the computer, and maybe one who can get to the CLI. Out of 400 people you will find more natural artists than people who could install this, even if they had the equipment.


Let me update your heuristics on this. Computer mice are practically obsolete. People use cell phones, not computers. No one needs a CLI to run Stable Diffusion, because a mobile web interface was released on day one. 6.6 billion people have a smartphone, which is about 83% of the world's population (infants included). That is about the same number of people who are literate.

4 out of 5 people globally would be able to submit a stable diffusion prompt and view a result. Most would have no idea what the hell was going on or even why it was interesting.


> Most would have no idea what the hell was going on or even why it was interesting.

This is the funniest part to me, because so many people already think this is how digital art worked to begin with.


There are already web sites that will run the underlying models, requiring no installation.

The neat new applications that have taken over this site for the last couple of days sometimes require CLI steps to install because they are in active development, and it can be easier to experiment with something local. I'm sure they'll either be moved online or wrapped in nice installers over the next couple of weeks.


They can use it from their phone or tablet.


> People without computer literacy and/or basic programming skills won't be able to pull this off.

Yet. This is a huge leap forward; getting more basic prompts to generate things will be a much smaller leap, IMO.


It just got faster. What's the drama about?


I am talentless and untrained. Now, with a combo of prompts and img2img, I can create awesome results on any topic and in any style that I have the rights to use. That's a 0-to-1 moment. It didn't get faster for me; it went from impossible to possible.
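For anyone curious what that prompt + img2img combo looks like in practice, here's a minimal sketch assuming a recent version of the Hugging Face diffusers package and a CUDA GPU; the input filename is hypothetical.

    import torch
    from PIL import Image
    from diffusers import StableDiffusionImg2ImgPipeline

    pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
        "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
    ).to("cuda")

    # Start from a rough drawing and let the model repaint it.
    init = Image.open("my_rough_sketch.png").convert("RGB").resize((512, 512))

    result = pipe(
        prompt="a castle on a cliff at dusk, dramatic lighting, concept art",
        image=init,
        strength=0.7,        # how far the model may stray from the sketch
        guidance_scale=7.5,  # how closely it follows the prompt
    ).images[0]
    result.save("castle.png")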


"Any style" seems like an enormous stretch. There definitely seems to be some styles that AI favors, ones which I've seen described as "clutter the frame so you don't notice the flaws". It struggles with simpler styles. I have yet to see a flat black & white image generated by an AI that looked even passable.


How about this one? https://static.simonwillison.net/static/2022/dall-e-pencil-s...

From https://simonwillison.net/2022/Jun/23/dall-e/

Have you tried DALL-E or Stable Diffusion yet? I bet you could generate a black and white image that met your standards for being impressive, if you spent a few minutes on it.

You can try Stable Diffusion free here: https://beta.dreamstudio.ai/


Nah, that's not it. I mean flat #000000/#ffffff. Google "stencil image" and you'll see what I mean.

AI really doesn't handle styles with restrictions like that well. I tried the stable diffusion website with variations on "black and white silhouette stencil image of a cat". It kept wanting to give the cat colored eyes, or it used shading, or the cat didn't have a coherent anatomy, along with the typical AI art "duds" that aren't really anything at all.

To be fair, I did get a couple of passable results when I replaced 'cat' with 'dog'. They were simple, but didn't have any obvious errors.

To be fair in the other direction, replacing 'cat' with 'abacus' gave me an (admittedly pretty) grid of numbers and some chainmail, and 'helicopter' suggested a novel design where two helicopter bodies would be stacked vertically, connected by a vertical shaft through the rotor, and which turned into a palm tree trunk above the top unit. Once you get out of the sample data, it starts to fall apart.

I feel like other people here are willing to forgive more errors than I am. They see an incoherent splotch in an image and assume more development can iron out all the problems; I see an unavoidable artifact of the fact that these systems don't have a real understanding of what they're making.


Is this not an acceptable result to you? I did this on my first try, and to my eyes it's the same thing I see when I google "stencil image." I'm thinking you just have not tried these prompts enough.

https://ibb.co/k5hMWZr

Edit:

I gave this another shot to see if I could get a more complex stencil. This was my very first try again, so truly not cherry-picked. The prompt was: "Stencil image of a tiger face. Clip art. Vector art." This looks like an infinite stencil-making machine to me.

https://ibb.co/mNdsms8


That first one just sucks. I don't have anything else to say about it.

The second one is representative of the upper end of what I was getting. It's almost passable, but it doesn't hold up past about five seconds. The left and right halves don't look like they belong to the same animal, the blank space in the middle of the face is huge and detracts from any sense of structure, and the whole mouth area is just odd.

It's seriously impressive for AI, but it's not end-of-artists type stuff. I can google "tiger black and white stencil" and get a bunch of tiger faces, and every one of them is noticeably better. People imagine there are plenty of art jobs where discrepancies like this don't matter, but there really aren't.


Yes, that tiger is shit. Here are 35 tiger stencils I generated over a few minutes with Stable Diffusion on my PC with a humble RTX 2070S GPU: https://i.imgur.com/y9oCZIz.jpg

I could run pretty much any of these through Adobe Illustrator's auto-trace and end up with an amazing vector image.

I could also leave it generating these for an hour and I'd have over 1000 results to choose from.
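The "leave it running and cherry-pick" workflow is roughly just a loop over seeds; a minimal sketch of that idea, assuming the Hugging Face diffusers package rather than my exact setup, could look like this:

    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
    ).to("cuda")

    prompt = "stencil image of a tiger face, clip art, vector art, black and white"

    # Vary the seed, save everything, and cherry-pick the keepers afterwards.
    for seed in range(100):
        generator = torch.Generator("cuda").manual_seed(seed)
        image = pipe(prompt, generator=generator).images[0]
        image.save(f"tiger_stencil_{seed:03d}.png")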


I do feel like you just moved the goalposts on me, from "AI can't produce this style at all" to "it can't produce this style on the level of an experienced human artist." I don't think anyone is claiming it beats a good human artist. That does not make it useless for stencils.

Again, those were my first tries, and I know nothing about stencils beyond what 2 seconds on Google Images could give me. They are certainly better than I could produce if you gave me Adobe Illustrator and a weekend. And the images are mine to use as I please, unlike what I could rip off Google Images.

Also, I thought the cat was cute, but there's really no accounting for taste. Here's a silly and swirly cat that might be more your thing? This one was cherry-picked out of 10, since you have high standards ;)

https://ibb.co/BytvVVF


I considered qualifying "any" but decided it required no qualification. I don't know how many examples are needed in the training data for a model to be able to reproduce a given artist, but given how many obscure artists I have seen DALL-E and Stable Diffusion recognize, it must not require that many. And it's still possible to fine-tune a model with additional training if a new artist comes along or you want a bit more capability with a rare style.

So yeah ANY style, I’m pretty sure of it.


I think it will be interesting to see how this comment has aged in five years.


There's nothing a modern computer does that a computer from 30 years ago couldn't, other than doing it "faster". But that opens up everything.


Technically true, but I think this is VASTLY understating what has become possible with your average PC over the past 30 years.

Today I can get quick, effortless renders from Blender with a zillion available assets on the internet on my laptop. I can drop that directly into something like Clip Studio and paint right over it.

In the 80s you needed an extraordinarily expensive workstation like the Quantel Paintbox to even do primitive Photoshop type stuff. If you wanted a 3D render you needed a whole farm of servers.


That seems like an overreach. Thirty years ago, object recognition basically didn’t work, except in extremely simple cases. Something like semantic segmentation would have been way out of reach. Computers couldn’t play Go effectively against even modestly skilled human amateurs.


I meant it in the very technical sense: you could take an object-recognition algorithm and compile it to run on an 80386, and it would run fine, although slow to the point of not being practical. Computers brought us more speed (and memory) that enables new classes of uses, but there's not a single intrinsic operation a modern computer does which an old one can't replicate.

So quantity is indeed a quality.



