That doesn't negate the fact I stated. You can throw more hardware at the problem to reach higher throughput if your problem domain allows for it (such as the use case you link to). Needless to say, that's an inefficient approach, and it doesn't apply at all if, for instance, you're running batch jobs where you need high throughput on individual nodes.
> How We Built Uber Engineering’s Highest Query per Second Service Using Go
https://eng.uber.com/go-geofence/
And that was in 2016. Go's GC performance characteristics have improved quite a bit since then.