This is legitimately game-changing for a feature in my SaaS where customers can generate event flyers. Up until now I had Nano Banana generate just a decorative border and had the actual text rendered via Pillow, controlled by an LLM. The result worked, but didn’t look good.
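For context, the Pillow half was roughly this kind of thing (filenames, font, and the layout structure here are illustrative, not my actual code; the LLM's job was to decide the text, positions, and sizes):

    from PIL import Image, ImageDraw, ImageFont

    # Load the decorative border Nano Banana generated
    flyer = Image.open("border.png").convert("RGBA")
    draw = ImageDraw.Draw(flyer)

    # The LLM returns layout decisions as structured data: text, position, size
    layout = [
        {"text": "Summer Jazz Night", "xy": (120, 80), "size": 64},
        {"text": "Sat June 14, 7pm - Riverside Park", "xy": (120, 180), "size": 28},
    ]

    for item in layout:
        font = ImageFont.truetype("DejaVuSans-Bold.ttf", item["size"])
        draw.text(item["xy"], item["text"], font=font, fill="black")

    flyer.save("flyer.png")

It works, but the text always looks pasted on rather than designed into the flyer, which is why native text rendering matters so much here.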
That said, I wonder if text is only good in small chunks (less than a sentence) or if it can properly render full sentences.
Even generating a standard piano with 7 full octaves that are consistent is pretty hard. If you ask it to invert the colors of the naturals and sharps/flats you'll completely break them.
It even worked really well at creating an infographic for one of my quirkier projects which doesn't have that much information online (other than its repo).
Then the question becomes: can it incorporate targeted feedback, or is it a one-shot-or-bust affair?
My experience is that ChatGPT is very good at iterating on text (prose, code) but fairly bad at iterating on images. It struggles to integrate small changes, choosing instead to start over from scratch, with wildly different results. Thinking especially here of architectural stuff, where it does a great job laying out furniture in a room, but when I ask it to keep everything the same but change the colour of one piece, it goes completely off the rails.
I would assume it depends on how it generates the images.
I've used Claude to generate fairly simple icons and launch images for an iOS game and I make sure to have it start with SVG files since those can be defined as code first. This way it's easier to iterate on specific elements of the image (certain shapes need to be moved to a different position, color needs to be changed, text needs an update, etc.).
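The payoff is that a follow-up change becomes an ordinary code edit instead of a regeneration. A minimal sketch of what that looks like, assuming the SVG has element ids (filenames and ids here are made up):

    import xml.etree.ElementTree as ET

    # SVG lives in a namespace; register it so the output isn't prefixed with ns0:
    SVG_NS = "http://www.w3.org/2000/svg"
    ET.register_namespace("", SVG_NS)

    tree = ET.parse("app_icon.svg")
    root = tree.getroot()

    # Find the one shape to change (id assigned when Claude wrote the file)
    for elem in root.iter(f"{{{SVG_NS}}}circle"):
        if elem.get("id") == "badge-background":
            elem.set("fill", "#ff6b35")  # only the colour changes, nothing else moves

    tree.write("app_icon_v2.svg")

Because the edit touches exactly one attribute, everything else in the icon is guaranteed to stay put, which is the opposite of what you get when re-prompting a raster model.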
Claude approaches image generation in surprising ways - we did a small evaluation [1] of different frontier models for image generation and understanding, and Claude produced by far the most surprising results.
You can use targeted feedback - but it's on the user to verify that the edits were actually localized. In my experience NB mostly makes relatively surgical edits, but if you're not careful it'll introduce other minute changes.
At that point you can either start over or just feather/mask with the original in any Photoshop-type application.
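You don't even need Photoshop for that step; a minimal Pillow sketch, assuming the original and the edited image are the same size and you have a rough mask of the edited region (filenames hypothetical):

    from PIL import Image, ImageFilter

    original = Image.open("original.png").convert("RGBA")
    edited = Image.open("nano_banana_edit.png").convert("RGBA")

    # White where you want to keep the edit, black where you want the original
    mask = Image.open("edit_region_mask.png").convert("L")
    # Feather the edge so the seam doesn't show
    mask = mask.filter(ImageFilter.GaussianBlur(radius=12))

    # Composite: take `edited` where the mask is white, `original` elsewhere
    result = Image.composite(edited, original, mask)
    result.save("final.png")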
It would be great if Google could make SynthID openly available so OpenAI etc. could also implement it. Then websites like Facebook, or even local browsers, could show an "AI warning".
I’ve been really excited for your infographic generation. Previous models from Google and OpenAI had very low detail/resolution for these things.
I’ve found in general that the first generation may not be accurate but a few rolls of the dice and you should have enough to pick a style and format that works, which you can iterate on.
I'm finding it bad at instruction following for architectural specs (physical, not software): you tell it what goes where, and it ignores you and does some average-ish thing it's seen before. It looks visually appealing, though.
I tried this prompt:
Here's the result: https://simonwillison.net/2025/Nov/20/nano-banana-pro/#creat...