I'm not sure how testing timeouts is trivial compared to cancellation. They both take about the same amount of code to write a test for, IME. (Not much.)
Not retrying + timeouts has similar effects to cancellation: the operation ceases to go forward. But it is not the same. It's a lot more expensive than imperative cancellation (you need to rebuild, resend, and reparse the request), and it carries production risks that waiting with cancellation doesn't. For example, naive retries can expose backends to thundering herds, and less naive retries can have strange issues caused by exponential backoff, where requests sit around doing nothing for half their own timeout before giving up because the next retry did not hit before the end of the parent request's timeout.
All good points. By trivial I meant 2 tests (works/fails) vs 3+ (works/fails/cancel with the latter possibility having its own works/fails cases). A timeout is just a status code on failure.
In Go, it is a single code path. Contexts can be canceled and they also come with propagating timeouts. The timeouts simply trigger a cancellation, so the only code path is handling cancellation.
There's nothing complicated about it, so there's no reason your code can't implement timeouts and cancellation the same way: timeouts are a cancellation triggered autonomously after some time passes.
By adding that timeout you just created a user-visible behavior that nobody asked for and people will only notice in production while dealing with the most complicated use-cases.
Nobody asks for it but some choice must be made. As a user, I have often cursed things that hang indefinitely. And I don't trust application state after touching a cancel button. That stuff is seldom tested well.
Well, that's not always possible. For example, the latest systemd has a bug where it sometimes deadlocks in a PAM module, which blocks all remote access to the machine over ssh (openssh optionally uses PAM). If openssh had a timeout on the PAM child process, it would simply retry after the timeout. Instead, the whole machine is lost and needs to be restarted with physical access.
There's no way to cancel the operation remotely, because you're not authenticated yet. And you may not have any other access.
Timeouts are also a good defense strategy against bugs.
Of course. I did not mean to convey that timeouts should be avoided in all cases. In fact I listed several such cases where they should be used. An API that has no way to cancel would be another example. Although I would argue that such an API is fundamentally flawed.
Right, I think the suggestion in that case would be to upgrade to an API that does support cancellation wherever possible, e.g. wait for multiple objects with the original argument and an additional cancel event.