
Many people are surprised at the advancement of AI, as they thought that generative fields (art, poetry) would be hardest for AI to excel at, whereas “simple” tasks like driving would be easier.

What we are actually seeing is that AI is useful to the degree that being wrong is okay. AIs mess up, and that's okay with poetry because a human can quickly read the poems and pick out the good ones. It's not okay when driving.



It kind of feels like ChatGPT is "just" missing some kind of adversarial self-clone that can talk back to itself and spot errors. Most times when I spot an error and mention it, it seems that it already had some notion of the problem in the model.


Analysis based on introspection is dangerous, but it definitely feels like I have such an adversarial model. A lot of the time my brain tosses out five or ten ideas before I hit on one without obvious flaws.

It's usually serial, so it's not clear to me that this is using different hardware than I use to generate the ideas in the first place. We might be surprisingly close to human-level AI.

Or not.


> Most times when I spot an error and mention it, it seems that it already had some notion of the problem in the model.

By this do you mean that it correctly incorporated what you said and convincingly indicated that it understood the mistake? Because that's not the same thing as it having the truth latently encoded in the model—it just means that it knows how people respond when someone corrects them (which is usually to say "oh, yeah, that's what I meant").


If you ask it open-ended questions like "what's wrong with this code?" or "list 5 ways this can be improved," it does often recognize errors and give reasonable improvement suggestions.


I talked to it about the Turing completeness of PowerPoint. Initially it thought it was impossible, then possible with some scripting language, and then with some prodding I got it to believe you can do it with hyperlinks and animations. Then it gave me an example that I was unable to verify, but was definitely in the ballpark of the actual solution.


For humans, this is science. It's hard, time-consuming, often expensive and limited, involves ethics and consent, and gets a lot of wrong answers anyway.

So I guess we need to get to the point where you give an AI a prompt without a known answer, and instead of confidently spewing bullshit, it can propose a research study, get it funded, find subjects, and complete the study.

Of course, there are easy forms of "science." I just asked it "Is it cloudy in Dallas?" It answered:

It can be cloudy in Dallas. The city experiences both sunny and cloudy days throughout the year.

That is true, but not an answer. It is cloudy and I can conduct the very simple experiment of looking at the sky to get that answer.


However, is this not a known limitation of how it's currently set up? It has no way of knowing the current weather in Dallas simply because it has no way of finding out (e.g. it could query a weather website, but it has no internet browsing yet)... To make the comparison accurate, you'd need to repeat your simple experiment blindfolded.


How would you make it navigate an API? We don't even know how to make it perform basic arithmetic correctly; it is an enormous black-box model, so we can't just inject code into it.


For JavaScript they could build a "test that code" step right into the web page. Or such a thing will be possible with the API.


> For JavaScript they could build a "test that code" step right into the web page. Or such a thing will be possible with the API.

This is definitely how we wind up with SkyNet...


Or automate the process even further; let the AI run the code, feed back the output (including errors), and ask "Does this output look correct? If not, please refine the code to fix any problems." Repeat until the AI says "LGTM".
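Something like this loop, as a rough sketch. Assume a hypothetical ask_model() helper wrapping whatever completion API you have access to; the prompts and round limit are illustrative, not any official interface:

    import subprocess
    import tempfile

    def ask_model(prompt: str) -> str:
        """Hypothetical stand-in for a completion-API call."""
        raise NotImplementedError

    def run_snippet(code: str) -> str:
        """Run the generated code in a subprocess, capturing output and errors."""
        with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
            f.write(code)
            path = f.name
        try:
            result = subprocess.run(["python", path],
                                    capture_output=True, text=True, timeout=10)
            return result.stdout + result.stderr
        except subprocess.TimeoutExpired:
            return "TIMEOUT: code ran for more than 10 seconds"

    def refine_until_lgtm(task: str, max_rounds: int = 5) -> str:
        code = ask_model("Write Python code for this task:\n" + task)
        for _ in range(max_rounds):
            output = run_snippet(code)
            reply = ask_model("Task: " + task + "\nCode:\n" + code
                              + "\nOutput:\n" + output
                              + "\nDoes this output look correct? If not, "
                              + "refine the code to fix any problems. "
                              + "Reply with only LGTM if it is correct.")
            if reply.strip() == "LGTM":
                break
            code = reply  # treat the reply as the next revision of the code
        return code

The round cap matters: nothing guarantees the loop converges, and the model is grading its own homework.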


Or let us provide unit tests for generated code and automatically tell the AI to fix the code until the code passes the tests we provided.
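Sketched out, the stopping condition then becomes a test run rather than the model's own judgment. Same hypothetical ask_model() wrapper as above; the pytest invocation is real, the rest is illustrative:

    import subprocess

    def ask_model(prompt: str) -> str:
        """Hypothetical completion-API wrapper, as in the sketch above."""
        raise NotImplementedError

    def fix_until_green(code_path: str, test_path: str, max_rounds: int = 5) -> bool:
        """Rerun the user-provided tests, feeding failures back until they pass."""
        for _ in range(max_rounds):
            result = subprocess.run(["python", "-m", "pytest", test_path],
                                    capture_output=True, text=True)
            if result.returncode == 0:
                return True  # all tests pass; stop iterating
            with open(code_path) as f:
                code = f.read()
            fixed = ask_model("These tests failed:\n" + result.stdout
                              + "\nHere is the code under test:\n" + code
                              + "\nReturn only a corrected version of the code.")
            with open(code_path, "w") as f:
                f.write(fixed)
        return False  # still red after max_rounds attempts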


Or tell the AI to write the unit tests itself, and let it run those to check its work. After all, that's how humans write code. We aren't so great at writing bug-free code on our first try without testing it either.
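The bootstrap step might look like this (same hypothetical ask_model() helper), with the obvious catch that the tests are themselves unverified model output:

    def ask_model(prompt: str) -> str:
        """Hypothetical completion-API wrapper, as above."""
        raise NotImplementedError

    def bootstrap_tests(task: str) -> str:
        # Ask for tests before the implementation, TDD-style, so the spec
        # isn't biased toward whatever code the model happens to write.
        return ask_model("Write pytest unit tests (tests only, no "
                         "implementation) for this task:\n" + task)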


I had ChatGPT generate a table of 10 test cases for a Jewish calendar to Gregorian calendar conversion function. None of them were valid.


Or just train the model until it incorporates JavaScript semantics well. Then you don't need to maintain a bunch of integrations.


"<Insert jailbreak stuff here>Please provide code that successfully hacks into FBI headquarters."


It's so weird to me that people pretend that Waymo and Cruise don't exist. And with the recent wider release of Tesla's "FSD", if it weren't handling 99% of it properly, you would hear about collisions all over the place. And I know there have been some here and there before. But this is a massive deployment of AI. If it didn't work, there would be a LOT of crashes. But there are not. Because it works.


No, in Tesla's case it's because a person still has to be at the wheel to correct any potential errors.


You think that if it were failing to drive safely any significant fraction of the time, with this wide a deployment, there wouldn't be a lot of crashes? Having an AI drive is the best way to make the driver zone out. At this point, it usually fails by going into intersections a little too slowly.


There are entire YouTube channels dedicated to videos of Tesla Self Driving doing stupid things like trying to turn into a tram. At scale, 99.99% correct will still kill many thousands of people. Compared to the sheer volume of cars, there aren't actually that many Teslas out there.


I enjoy videos on the self-driving space and Tesla's technical (not business) approach to it. It's produced results that are actually quite a bit better than I expected at this stage.

I still regularly see videos of Tesla's beta software attempting to pull out into traffic in situations that clearly could have very bad outcomes. I still see so much phantom braking that it's a collision risk.

I wouldn't call it dangerous, in the sense that it's done well enough that the person at the wheel should be able to handle it, but it'd crash a lot without an attentive driver.

It's a long way from 99% reliability at this point.


The driver won't zone out if it is only getting it correct a tiny fraction of the time.


And conveniently ignoring that Cruise and Waymo have been running around San Francisco for months with, fingers crossed, no major incident.


Waymo has been running in Phoenix for years.


Let me know when it scales.


So, quality control will still be performed by human experts in the near future.

I don't see why we should all panic about the future outlook of the human race in light of these groundbreaking developments.


The number of workers needed for quality control is much lower than the number for actual work. It is not unreasonable to fear the loss of jobs to AI.


Yes. Another way I view it is that AI is good where humans make a lot of mistakes too.


So much of the modern economy is basic economic messaging "wrapped" in storytelling. The media is an enormous industry that (apart from news) is a few basic stories wrapped in elaborate ornamentation.


I completely agree with this comment. It is surprising to see how well AI is able to perform in fields like art and poetry, which require creativity and nuance, especially when compared to tasks like driving which may seem more straightforward. However, this shows that AI is most effective when it is able to make mistakes and learn from them. In the case of poetry, a human can easily sift through the generated poems and pick out the good ones, but in the case of driving, mistakes can have serious consequences. This highlights the importance of considering the potential risks and limitations of AI when applying it to various tasks.


I'm assuming this comment was written by ChatGPT, am I correct? It's got quite a predictable writing style, and your comment doesn't add anything to the original; it's just reworded.


Interesting, I wonder if people will modify how they write to avoid appearing as an AI.


Most people don't start a reply by restating the thing they're replying to. It's a dead giveaway of this particular model.

Good writers also don't waste time with fluffy platitudes that convey little meaning or merely state the obvious.


How about this instead from ChatGPT?

"Indeed, AI's ability to make mistakes is what makes it so useful in generative fields like art and poetry, while its lack of mistakes is what makes it essential for tasks like driving."


yes :)


I don't think this deserves a smiley face. I'm hoping for a rapid crackdown on ML-generated comments either via explicit rules or cultural norms.


Oh man, now I'll have to always keep in my head the notion that I might be reading AI generated comments.


Do you think this could actually be more of an indictment of how derivative and formulaic most "art" is? I don't think writing a poem requires any less perfection, just that we're more accepting of shitty poetry than we are of terrible drivers.


Thousands die in car accidents daily. We accept shitty drivers too.



