Hacker News | raldi's comments

Despite the abuse of quotation marks in the screenshot at the top of this link, Dario Amodei did not in fact say those words or any other words with the same meaning.

Yes, unfortunate that people keep perpetuating that misquote. What he actually said was "we are not far from the world—I think we’ll be there in three to six months—where AI is writing 90 percent of the code."

https://www.cfr.org/event/ceo-speaker-series-dario-amodei-an...


Design is about how it works (my phone went 100% → 84% while reading this, almost certainly thanks to the snow)

It's a shame the author didn't test on mobile, but I think we should cut them some slack. It would be understandable for this particular article's audience to mostly be viewing on desktop.

And now my phone is hot

Would you protest someone who said “Ants want sugar”?


I always protest non sentients experiencing qualia /s


What’s your non-sarcastic answer?


I’m guessing the missing part probably ended with an internal rhyme for “man”. Maybe “missed the can” (she was aiming at)?


Could be, although not that particular completion. The second chorus was rare and I'm kind of unsure about "shot a man". Can't edit previous comment but should have just put it as:

  ? ? ?, ? ? ? ?, in nineteen thirty one. Hey!


I'm surprised that either:

1. Nobody at Waymo thought of this,

2. Somebody did think of it but it wasn't considered important enough to prioritize, or

3. They tried to prep the cars for this and yet they nonetheless failed so badly


Everyone should have understood that driving requires improvisation in the face of uncommon but inevitable bespoke challenges that this generation of AI is not suited for, either because it's common sense or because so many people have been shouting it for so long.


What improvisation is required? A traffic light being out is a standard problem with a standard solution. It's just a four-way stop.


In many versions of road rules (I don't know about California), having four vehicles stopped at an intersection without one of the four lanes having priority creates a dining philosophers deadlock, where all four vehicles are giving way to others.

Human drivers can use hand signals to resolve it, but self-driven vehicles may struggle, especially if all four lanes happen to have a self-driven vehicle arrive. If all the vehicles are coordinated by the same company, they can potentially coordinate out-of-band to avoid the deadlock. It becomes even more complex if there's a mix of cars coordinated by different companies.
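
As a toy sketch of that out-of-band coordination (everything here is invented for illustration): the coordinator just imposes a deterministic ordering so no vehicle ends up waiting on another forever.

    # Four self-driven cars stopped at a dark intersection, each yielding
    # to the one on its right -- nobody can move. A central coordinator
    # (hypothetical) breaks the tie with an arbitrary but deterministic
    # ordering: earliest arrival first, vehicle id as the tie-breaker.
    vehicles = [
        {"id": "car-north", "arrived_at": 12.0},
        {"id": "car-east",  "arrived_at": 12.0},
        {"id": "car-south", "arrived_at": 12.0},
        {"id": "car-west",  "arrived_at": 12.0},
    ]

    for v in sorted(vehicles, key=lambda v: (v["arrived_at"], v["id"])):
        print(f"{v['id']} may proceed")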


That only works if everyone else also treats it as a four way stop. Which they don't, unfortunately.


Yes, especially in a city like San Francisco where so many cultures come together: the Prius culture, the BMW culture, the Subaru culture, etc.


To be fair, 'common sense' and 'many people have been shouting it' have a long history of being hilariously wrong about technical matters. Like claims that trains would cause organ damage to their riders from going at the blistering speed of either 35 or 50 mph, IIRC. Or about manned flight being impossible. Common sense would tell you that launching a bunch of satellites broadcasting precise clock signals into orbit couldn't be used to pin down your position on the ground, and yet here we are with GPS.


I'd say driving only requires not handling uncommon situations dangerously. And stopping when you can't handle something fits that criterion.

Also I'm not sure it's entirely AI's fault. What do you do when you realistically have to break some rules? Like here, I assume you'd have to cut someone off if you don't want to wait forever. Who's gonna build a car that breaks rules sometimes, and what regulator will approve it?


If you are driving a car on a public street and your solution to getting confused is stopping your car in the middle of the road wherever this confusion happens to arise, and sitting there for however long you are confused, you should not be driving a car in the first place. That includes AI cars.


In practice, no one treats it as a four-way stop, which makes it dangerous to treat it as one.


Drove through SF this evening. Most people treated it as a four-way stop! I was generally impressed.


But a citywide blackout isn’t that uncommon.


> But a citywide blackout isn’t that uncommon.

I think too many people talk past each other when they use the word common, especially when talking about car trips.

A blackout (doesn't have to be citywide) may not be periodic but it's certainly frequent with a frequency above 1 per year.

Many people say "common" meaning "frequent", and many people say "common" meaning "periodic".


Even among people who mean "common" as in "frequent", they aren't necessarily talking about the same frequency. That's why online communication is tricky!


I think your power company needs to be replaced if the frequency is above 1 per year.


It isn't? To me that's the main problem here, as this should be an exceptionally rare occurrence.


I think that statement is regional. I’ve never seen one.


Likely 2. Not something that will make it into their KPIs. No one is getting promoted for mitigating black swan events.


Actually that is specifically not true at Google, and I expect it applies to Waymo also.

People get promoted for running DiRT exercises and addressing the issues that are exposed.

Of course the problem is that you can't DiRT all the various Black Swans.


Clearly the cars can navigate themselves; it's the lack of remote ops that halted everything


Exactly: When you're building software, it has lots of defects (and, thus, error logging). When it's mature, it should have few defects, and thus few error logs, and each one that remains is a bug that should be fixed.


I don't understand why you seem to think you're disagreeing with the article. If you're producing a lot of error logs because you have bugs that you need to fix, then you aren't violating the rule that an error log should mean that something needs to be fixed.


I couldn’t agree more with the article. What made you think I disagreed?


You have an alert on what users actually care about, like the overall success rate. When it goes off, you check the WARNING log and metric dashboard and see that requests are timing out.


That is a lagging indicator. By the time you're alerted, you've already failed by letting users experience an issue.


Well, yes. If the cable falls out of the server (or there's a power outage, or a major DDoS attack, or whatever), your users are going to experience that before you are aware of it. Especially if it's in the middle of the night and you don't have an active night shift.

Expecting arbitrary services to be able to deal with absolutely any kind of failure in such a way that users never notice is deeply unrealistic.


It continues to become more realistic with the passing of time.


What alternative would you propose? Page the oncall whenever there's a single query timeout?


The alternative I propose is to have a deep understanding of your system before popping off with dumb one-size-fits-all rules that don't make sense.


Right. If staging or the canary is logging errors, you block/abort the deploy. If it’s logging warnings, that’s normal.
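
For instance, a toy version of such a gate (the log path and format here are made up; this is a sketch, not anyone's real pipeline):

    import sys

    # Hypothetical canary gate: abort the deploy if the canary emitted any
    # ERROR lines; WARNING lines are expected and don't block anything.
    LOG_PATH = "/var/log/myservice/canary.log"  # illustrative path

    with open(LOG_PATH) as f:
        error_lines = [line for line in f if " ERROR " in line]

    if error_lines:
        print(f"canary logged {len(error_lines)} error(s); aborting deploy")
        sys.exit(1)

    print("canary clean; proceeding")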


Unless it is logging more warnings because your new code is failing somehow; maybe it stopped correctly parsing the reply from an "is this request rate limited" service, so it is only returning 429 to callers and never accepting work.


Yes. Examples of non-defects that should not be in the ERROR loglevel:

* Database timeout (the database is owned by a separate oncall rotation that has alerts when this happens)

* ISE (internal server error) in a downstream service (return HTTP 5xx and increment a metric but don't emit an error log)

* Network error

* Downstream service overloaded

* Invalid request

Basically, when you make a request to another service and get back a status code, your handler should look like:

    logfunc = logger.error if 400 <= status <= 499 and status != 429 else logger.warning
(Unless you have an SLO with the service about how often you’re allowed to hit it and they only send 429 when you’re over, which is how it’s supposed to work but sadly rare.)
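
Spelled out as a full handler, something like this sketch (the function name and messages are just illustrative):

    import logging

    logger = logging.getLogger(__name__)

    def log_downstream_status(status: int) -> None:
        # Called for non-2xx responses from a service we call out to.
        # A 4xx other than 429 means we sent a bad request -- that's our
        # defect, so it belongs at ERROR. 429 and 5xx are the other
        # service's problem: note them at WARNING and lean on metrics.
        if 400 <= status <= 499 and status != 429:
            logger.error("downstream rejected our request: HTTP %d", status)
        else:
            logger.warning("downstream returned HTTP %d", status)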


> Database timeout (the database is owned by a separate oncall rotation that has alerts when this happens)

So people writing software are supposed to guess how your organization assigns responsibilities internally? And you're sure that the database timeout always happens because there's something wrong with the database, and never because something is wrong on your end?


No; I’m not understanding your point about guessing. Could you restate?

As for queries that time out, that should definitely be a metric, but not pollute the error loglevel, especially if it’s something that happens at some noisy rate all the time.


I think OP is making two separate but related points, a general point and a specific point. Both involve guessing something that the error-handling code, on the spot, might not know.

1. When I personally see database timeouts at work it's rarely the database's fault, 99 times out of 100 it's the caller's fault for their crappy query; they should have looked at the query plan before deploying it. How is the error-handling code supposed to know? I log timeouts (that still fail after retry) as errors so someone looks at it and we get a stack trace leading me to the bad query. The database itself tracks timeout metrics but the log is much more immediately useful: it takes me straight to the scene of the crime. I think this is OP's primary point: in some cases, investigation is required to determine whether it's your service's fault or not, and the error-handling code doesn't have the information to know that.

2. As with exceptions vs. return values in code, the low-level code often doesn't know how the higher-level caller will classify a particular error. A low-level error may or may not be a high-level error; the low-level code can't know that, but the low-level code is the one doing the logging. The low-level logging might even be a third party library. This is particularly tricky when code reuse enters the picture: the same error might be "page the on-call immediately" level for one consumer, but "ignore, this is expected" for another consumer.

I think the more general point (that you should avoid logging errors for things that aren't your service's fault) stands. It's just tricky in some cases.


Also, everywhere I have worked there are transient network glitches from time to time. Timeouts can often be caused by these.


> the database is owned by a separate oncall rotation

Not OP, but this part hits the same for me.

In the case where your client app is killing the DB with too many calls (e.g. your cache is not working), you should be able to detect it and react without waiting for the DB team to come to you after they've investigated the whole thing.

But you can't know in advance if the DB connection errors are your fault or not, so logging them to cover the worst-case scenario (you're the cause) is sensible.


I agree that you should detect this, just through a metric rather than putting DB timeouts in the ERROR loglevel.
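
Roughly like this sketch, say (assuming prometheus_client; the metric and function names are made up):

    import logging
    from prometheus_client import Counter

    logger = logging.getLogger(__name__)

    # Hypothetical metric: alert on its rate, not on ERROR log volume.
    DB_TIMEOUTS = Counter(
        "db_query_timeouts_total",
        "Queries against the database that timed out",
    )

    def on_db_timeout(query_name: str) -> None:
        DB_TIMEOUTS.inc()
        # Keep the detail for humans digging in, but at WARNING, not ERROR.
        logger.warning("database query timed out: %s", query_name)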


But what's the base of your metric?

I feel you're thinking about system wide downtime with everything timing out consistently, which would be detected by the generic database server vitals and basic logs.

But what if the timeouts are sparse and only 10 or 20% more than usual from the DB's point of view, but they affect half of your registration service's requests? You need them logged application-side so the aggregation layer has any chance of catching it.

On writing to ERROR or not, the thresholds should be whatever your dev and oncall teams decide. Nobody outside of them will care; I feel it's like arguing about which drawer the socks should go in.

I was in an org where any single error below CRITICAL was ignored by the oncall team, and everything below that only triggered alerts on aggregation or under special conditions. Pragmatically, we ended up slicing it as: ERROR goes to the APM; anything below gets no aggregation and is just available when a human wants to look at it for whatever reason. I'd expect most orgs to come up with that kind of split, where the levels are hooked to processes and not some base meaning.


> No; I’m not understanding your point about guessing. Could you restate?

In the general case, the person writing the software has no way of knowing that "the database is owned by a separate oncall rotation". That's about your organization chart.

Admittedly, they'd be justified in assuming that somebody is paying attention to the database. On the other hand, they really can't be sure that the database is going to report anything useful to anybody at all, or whether it's going to report the salient details. The database may not even know that the request was ever made. Maybe the requests are timing out because they never got there. And definitely maybe the requests are timing out because you're sending too many of them.

I mean, no, it doesn't make sense to log a million identical messages, but that's rate limiting. It's still an error if you can't access your database, and for all you know it's an error that your admin will have to fix.

As for metrics, I tend to see those as downstream of logs. You compute the metric by counting the log messages. And a metric can't say "this particular query failed". The ideal "database timeout" message should give the exact operation that timed out.
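
For example, something along these lines (a minimal sketch; the operation name and timeout value are invented):

    import logging

    logger = logging.getLogger(__name__)

    # One line per failure, naming the exact operation that timed out, so
    # the message is useful on its own and a metric can be derived simply
    # by counting these entries.
    logger.error("database timeout: operation=%s timeout_ms=%d",
                 "get_user_by_email", 500)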


I wish I lived in a world where that worked. Instead, I live in a world where most downstream service issues (including database failures, network routing misconfigurations, giant cloud provider downtime, and more ordinary internal service downtime) are observed in the error logs of consuming services long before they’re detected by the owners of the downstream service … if they ever are.

My rough guess is that 75% of incidents on internal services were only reported by service consumers (humans posting in channels) across everywhere I’ve worked. Of the remaining 25% that were detected by monitoring, the vast majority were detected long after consumers started seeing errors.

All the RCAs and “add more monitoring” sprints in the world can’t add accountability equivalent to “customers start calling you/having tantrums on Twitter within 30sec of a GSO”, in other words.

The corollary is "internal databases/backend services can be more technically important to the proper functioning of your business, but frontends/edge APIs/consumers of those backend services are more observably important to other people. As a result, edge services' users often provide more valuable telemetry than backend monitoring."


But everything you’re describing can be done with metrics and alerts; there’s no need to spam the ERROR loglevel.


My point is that just because those problems can be solved with better telemetry doesn't mean that's actually done in practice. Most organizations are much more aware of/sensitive to failures upstream/at the edge than they are in backend services. Once you account for alert fatigue, crappy accountability distribution, and organizational pressures, even the places that do this well often backslide over time.

In brief: drivers don’t obey the speed limit and backend service operators don’t prioritize monitoring. Both groups are supposed to do those things, but they don’t and we should assume they won’t change. As a result, it’s a good idea to wear seatbelts and treat downstream failures as urgent errors in the logs of consuming services.


4xx is for invalid requests. You wouldn't log a 404 as an error


I’m talking about codes you receive from services you call out to.


What if the user sends some sort of auth token or other data that you yourself can't validate, and the third party gives you a 4xx for it? You won't know ahead of time whether that token or data is valid, only after making a request to the third party.


Oh that makes sense.


There are still some special cases, because 404 is used for both “There’s no endpoint with that name” and “There’s no record with the ID you tried to look up.”

