Nice approach! The latter part of that workflow is where we are focused now, i.e. now that we can plan and recommend B-tree indexes for individual queries, how do we utilize this to recommend a set of indexes for a whole database (taking into account different predicate variations).
On open-sourcing this code: Not at this moment - we do, however, offer a WASM build of this logic for free use at https://pganalyze.com/index-advisor
> The latter part of that workflow is where we are focused now, i.e. now that we can plan and recommend B-tree indexes for individual queries, how do we utilize this to recommend a set of indexes for a whole database
what I've ended up doing is approximating the number of rows, creating dummy data with a similar spread, and then creating indexes and querying until I have a good understanding of exactly how each index behaves at that scale. It ends up being a lot of work, but sometimes a 50% performance increase is worth it overall.
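for anyone wanting to try the same thing, here's a minimal sketch of that approach in plain SQL - the table, column names, row count, and skew are made up for illustration, not from any real schema, and you'd tune them to roughly match your production data:

```sql
-- Hypothetical table standing in for a production one.
CREATE TABLE orders_test (
    id          bigint GENERATED ALWAYS AS IDENTITY,
    customer_id bigint,
    status      text,
    created_at  timestamptz
);

-- ~1M rows: power(random(), 3) skews customer_id so a few customers
-- dominate, which is closer to many real workloads than a uniform spread.
INSERT INTO orders_test (customer_id, status, created_at)
SELECT
    (power(random(), 3) * 100000)::bigint,
    (ARRAY['shipped', 'pending', 'cancelled'])[1 + (random() * 2.2)::int],
    now() - random() * interval '365 days'
FROM generate_series(1, 1000000);

ANALYZE orders_test;

-- Compare plan and timing before and after a candidate index.
EXPLAIN ANALYZE
SELECT count(*) FROM orders_test
WHERE customer_id = 42 AND status = 'pending';

CREATE INDEX ON orders_test (customer_id, status);

EXPLAIN ANALYZE
SELECT count(*) FROM orders_test
WHERE customer_id = 42 AND status = 'pending';
```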
> On open-sourcing this code: Not at this moment
completely understandable - the selfish side of me hopes it will be open-sourced eventually, but the practical side of me sees the benefit to your company of keeping it private for now.
those also require real or simulated data to get good recommendations, which is sometimes hard. if you're going to go that far, you might as well create real indexes.
besides, in some environments access to the real data (if there's PII) is much more difficult to get, so simulating is as far as you can go.
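one trick that can help there (just a sketch, the table/column names are placeholders): instead of copying rows out, pull the planner statistics for the sensitive columns from pg_stats and use the null fraction, distinct count, and frequency arrays to drive synthetic data generation. note that most_common_vals and histogram_bounds can still contain real values, so in strict environments you may only be able to take the counts and frequencies:

```sql
-- Planner statistics for a hypothetical table with sensitive columns.
-- null_frac, n_distinct and most_common_freqs are often enough to
-- reproduce the cardinality and skew without exporting the rows themselves.
SELECT attname,
       null_frac,
       n_distinct,
       most_common_vals,
       most_common_freqs,
       histogram_bounds
FROM pg_stats
WHERE schemaname = 'public'
  AND tablename  = 'customers';
```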
* log the queries from the application
* parse the queries into an AST
* collate the ASTs to form the most common query patterns
* cross-reference existing indexes
* create index recommendations with and without predicates
that worked pretty well, right up until you have to decide which index type is going to be best.
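fwiw, the collate and cross-reference steps can be roughed out in SQL alone if pg_stat_statements is available - its built-in query normalization stands in for the AST parsing step above. this is only a sketch: the column names are from Postgres 13+ (older versions use mean_time), and the seq-scan count is just a crude heuristic for "missing index":

```sql
-- Most common query patterns, already collated by pg_stat_statements.
SELECT queryid, calls, mean_exec_time, query
FROM pg_stat_statements
ORDER BY calls DESC
LIMIT 20;

-- Cross-reference: tables that are mostly sequentially scanned, together
-- with the indexes they already have, are the first candidates to look at.
SELECT s.schemaname,
       s.relname,
       s.seq_scan,
       s.idx_scan,
       array_agg(i.indexdef) FILTER (WHERE i.indexdef IS NOT NULL) AS existing_indexes
FROM pg_stat_user_tables s
LEFT JOIN pg_indexes i
       ON i.schemaname = s.schemaname
      AND i.tablename  = s.relname
GROUP BY s.schemaname, s.relname, s.seq_scan, s.idx_scan
ORDER BY s.seq_scan DESC
LIMIT 20;
```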
this is pretty darned neat in comparison. I do hope they open-source it like they did pg_query.