Yeah I think that's a fair critique. It kind of looks like a bad cut-and-replace job (if you zoom in you can even see part of the neck is missing). I might give it some more attempts to see if it can do a better job.
I agree that Seedream could definitely be called out as a fail since it might just be a trick of perspective.
That's not a bad suggestion. I thought about adding a numerical score, but it felt like a bit overwhelming at the time. Maybe I should revisit it though in the form of:
I agree with this. Some of those are "passing" and others are really passing, especially given how much better some of the new models are compared to the old ones.
I think the paws one is a good example: the new model got 100% while the other was more like 75%.