It’s very common for scale-out architectures to read more data than is ultimately returned, because the former is pulled from individual shards and then some centralized filtering / post processing is applied in some API middleware layer.
Trying to fix that by pushing down more of the query/execution is sometimes but not always feasible or practical.
(responding to GP) “Full table scans are expensive” is a better way to reframe the pricing model (at least before this pricing change). The distinction is between rows _examined_ vs rows _returned to a db client_. Even the latter is a tricky concept with vitess since it’s a routing/sharding system that sits on top of vanilla MySQL instances.
It’s very common for scale-out architectures to read more data than is ultimately returned, because the former is pulled from individual shards and then some centralized filtering / post processing is applied in some API middleware layer.
Trying to fix that by pushing down more of the query/execution is sometimes but not always feasible or practical.