Fun fact: Apt Pupil has a reference to Shawshank, where the main character says he lives off stocks set up by a banker named Dufresne who went to prison for murdering his wife.
King does this all the time in his stories, having character connections across different novels and setting them in the same universe. It's fun and adds some depth to all of it, like Randall Flagg being the same villain in The Stand, The Dark Tower, and The Eyes of the Dragon.
This is great; I have harped on PG's aversion to plan hints quite a bit in the past (and also on its lack of plan caching and reuse).
One of the biggest issues I run into operationally with relational DBs is the lack of plan stability and the inability, as a developer, to tell the database what I know is always the right thing to do for my application.
I use MS SQL Server more than PG currently and if it did not have plan hinting it would have been catastrophic for our customers.
It's still not perfect, and over the years I have started to think that smart optimizers using live statistics may be more of a detriment than a help, because you are constantly fighting bad plan generation at typically the worst times (late at night in production). At least with compiler optimization it happens at build time, and the results can be reasoned about and are stable. SQL is like some insane runtime JIT system, like a JavaScript engine that de-optimizes, but non-deterministically, based on whatever data is in the system at the time.
Using various tricks and writing the query in slightly different ways while praying to the planner god to pick the correct plan is, well, infuriating due to the lack of control over the system.
I much prefer systems like LINQ or Kusto where it's still declarative but the pipeline execution order follows the order as written in the code. One of the most helpful and typical hints that solves issues is simply forcing join order and subquery/WHERE order, so I can say: do this filter first, then this other one second; then the optimizer typically picks the obviously correct indexes to use. Bad plans typically try to rewrite the order and do the much less selective thing first, destroying performance. As the developer, I normally know the correct order of operations for my application's use and write the query that way.
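For illustration, in SQL Server the closest thing I have is something like this (table and column names made up); FORCE ORDER keeps the joins in the order I wrote them:

    -- Hypothetical schema: apply the selective orders filter first,
    -- then join to order_items, in exactly the order written.
    SELECT o.id, i.sku, i.qty
    FROM orders AS o
    JOIN order_items AS i ON i.order_id = o.id
    WHERE o.customer_id = 42
      AND o.created_at >= '2024-01-01'
    OPTION (FORCE ORDER);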
I've actually come around to the Postgres way of thinking. We shouldn't want or need plan hints usually.
Literally every slow Postgres statement I worked on in the last few years was due to lack of accurate statistics, missing indexes, or just badly designed queries. Every one was fixable at the source, by actually fixing the core issue.
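For what it's worth, the statistics fixes usually looked something like this (hypothetical table and column names):

    -- Larger sample for a skewed column, then re-analyze.
    ALTER TABLE orders ALTER COLUMN status SET STATISTICS 1000;
    ANALYZE orders;

    -- Tell the planner about correlated columns (PostgreSQL 10+).
    CREATE STATISTICS orders_cust_status (dependencies)
        ON customer_id, status FROM orders;
    ANALYZE orders;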
This was in stark contrast to the myriad Oracle queries I also debugged. The larger, older ones had accumulated a "crust" of plan hints over the years, most not well thought out and no longer valid. In fact, often just removing all hints made the query faster rather than slower on newer Oracle versions.
It's so tempting to just add a plan hint to "fix" the suboptimal query plan. However, the Postgres query planner often has an actual reason for doing what it does, and overall I've found its decisions to be very consistent.
>I've actually come around to the Postgres way of thinking. We shouldn't want or need plan hints usually.
They only come out at night, mostly.
PG is 40 years old and still has planner bugs being fixed regularly; having no control and having to wait for a new version, when a hint could fix the issue at runtime, is an obvious problem that should have been addressed long ago.
It's great that the devs want to make the planner perfect and strive for that; it is an unattainable goal worth pursuing, IMO. But escape hatches are required, hence the very popular pg_hint_plan extension.
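For anyone who hasn't used it, pg_hint_plan hints go in a comment in front of the statement; the table and index names here are made up:

    /*+ Leading((orders order_items)) IndexScan(orders orders_customer_idx) */
    SELECT o.id, i.sku
    FROM orders o
    JOIN order_items i ON i.order_id = o.id
    WHERE o.customer_id = 42;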
But in the end, after many years of dealing with these things, I have come to the opposite conclusion: let the query language drive the plan more directly. Senior devs can fix junior devs' mistakes in the app's source code, and the plans will be committed in source control for all to see and reference going forward.
SQL comes from the idea of non-technical people querying a system in ad-hoc ways. That's still useful, but if you are technically competent in data structures and programming and are building an application that uses the DB, the planner just gets in your way, at least in my experience.
Would be nice if PG supported clustered indexes (index-organized tables in Oracle speak) as an option. If you have a table that's accessed mostly the same way, you can get an index without the write amplification, because the table is the index.
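In Oracle it's just a clause on the table definition; a made-up example:

    CREATE TABLE events (
        id      NUMBER,
        payload VARCHAR2(255) NOT NULL,
        CONSTRAINT events_pk PRIMARY KEY (id)
    ) ORGANIZATION INDEX;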
Clustered indexes only save up to 2x write amplification in the very rare case where you're indexing the entire table (e.g. if it has very few columns).
However, that is usually the least of your concerns with write amplification. If you don't batch your writes, you can easily get 100x write amplification. For any primary key or any other index not strongly correlated with your INSERTs, you can get perhaps another 100x write amplification even if you batch your writes.
>in the very rare case where you're indexing the entire table (e.g. if it has very few columns).
Not sure I follow. Most tables are accessed primarily in one way (by primary key) and maybe sometimes in others for analysis. Having the PK written twice because it's almost always indexed is normally a waste and a good candidate for a clustered index. So much so that many DBs like SQLite and MySQL always do clustered indexes on the primary key, because their storage engines are built such that tables are a B-tree anyway, vs PG, which has separate B-tree indexes and heap tables. MSSQL and Oracle give you a choice of whether the table is an index structure or a heap.
If you have tables with very specific use cases, they can typically have a clustered index and no secondary indexes. You can still scan them for ad-hoc analysis, but you get better insert performance and space usage because you aren't double-writing to the heap and a PK index like you would in PG.
As for batch writes, that is a separate issue and has to do with whether batching even makes sense for durability. If you need to commit a single random row because something occurred, you can't batch that up and maintain consistency. If you're bulk-loading data, sure, it is common practice to commit in batches there, but clustered indexes could still be the difference between 100x and 200x write amplification if you have to insert both an index row and a heap row vs just a single clustered-index row.
For inserts, you cannot escape writing into the base table and all indexes. However, my understanding is that for updates PostgreSQL has a write amplification problem: each time a row is updated, a new row version is created (to implement MVCC) at a new physical location in the heap, so all indexes need to be updated to point to the new location, even those not containing the updated columns.
OTOH, with a heap-less (aka. clustered, aka. index organized) table, you would only have to update the indexes containing the columns that are actually being updated. You don't need to touch any other index. Furthermore, only if you are updating a key column would you physically "move" the entry into a different part of the B-tree. If you update an included column (PK columns are automatically "included" in all secondary indexes, even if not explicitly mentioned in the index definition), you can do that in-place, without moving the entry.
Here is how this works in SQL Server - consider the following example:
    CREATE TABLE T (
        ID int,
        NAME nvarchar(255) NOT NULL,
        AMOUNT int NOT NULL,
        CONSTRAINT T_PK PRIMARY KEY (ID)
    );
    GO
    CREATE INDEX T_I1 ON T (NAME);
    GO
    CREATE INDEX T_I2 ON T (AMOUNT);
Now, doing this...
    UPDATE T SET AMOUNT = 42 WHERE ID = 100;
...will only write to T_PK and T_I2, but not T_I1. Furthermore T_PK's entry will not need to be moved to a different place in the B-tree. SQL Server uses row versioning similar to PostgreSQL, so it's conceivable that PostgreSQL could behave similarly to SQL Server if it supported clustered (index-organized) tables.
That's true for seeks into the clustered (primary) index because that index includes all fields, so you don't need to "jump" to the heap to get them.
However, seeking into a secondary index, and then reading a column not included in that index incurs an additional index seek (into the clustered index), which may be somewhat slower than what would happen in a heap-based table.
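One common mitigation is a covering index, so the secondary seek doesn't have to go back for the extra column; reusing the T example above (the INCLUDE syntax exists in both SQL Server and Postgres 11+):

    CREATE INDEX T_I1_COVER ON T (NAME) INCLUDE (AMOUNT);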
I have found a very minimal penalty on secondary index reads in practice, such that it has never made a difference.
Remember, some databases (SQLite, MySQL) always use a clustered index internally, such that even if you have no primary key they will create a hidden one for the index to use.
It is nice to have the choice of which way to go, and it would be nice if PG implemented this. It can give significant space savings and performance advantages on a narrow table with one primary index.
Another option would be a good way of placing indexes on a different physical disk. You could use fast, ephemeral storage, like you can for the WAL, without amplifying the writes to the same device that is your expensive bottleneck. You could rebuild on data loss.
But it would add complexity to detect out-of-sync indexes and tables.
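Tablespaces get you part of the way today, minus the out-of-sync detection and rebuild-on-loss behavior; the path and names here are illustrative:

    -- Put indexes on a separate (faster) device. Unlike the idea above,
    -- losing this device is not something Postgres recovers from automatically.
    CREATE TABLESPACE fast_ix LOCATION '/mnt/nvme/pg_indexes';
    CREATE INDEX orders_created_idx ON orders (created_at) TABLESPACE fast_ix;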
If you lose the WAL you lose the data since the last merge, but there’s no risk of corruption. The WAL handles missed syncs fine too, losing just that window of data.
I don’t know if or how Postgres records the transaction number in the index to be able to notice if it’s out of date. If it does, I don’t know of any solution to “catch up” the index besides recreating it, which would be fine if that were the only issue. But from my experience with out-of-date indexes (libc or ICU updates, where Postgres doesn’t know whether anything actually is broken and just reports that it could be), there’s no guarantee you’d even notice, and your app could be running completely broken until you rebuild.
>After running this command on a data directory with corrupted WAL or a corrupted control file, it should be possible to start the server, but bear in mind that the database might contain inconsistent data due to partially-committed transactions. You should immediately dump your data, run initdb, and restore. After restore, check for inconsistencies and repair as needed.
I guess that depends on the definition of corrupted. It just describes what to do when the WAL is corrupt, but doesn’t say what conditions are considered corrupt. (We use ZFS so I haven’t run into torn writes.)
The CLUSTER command is not the same as index-organized tables; it's a one-time "physical sort" operation. New data is not organized until you run CLUSTER again.
Index organized tables are maintained automatically by Oracle/SQL Server.
Not just maintained automatically: clustered indexes have no heap at all; the table is an index.
The CLUSTER command in PG just moves rows around in the heap so they match the order of the still-separate index, which can help a little with range operations because rows are close on disk, but otherwise it doesn't do much.
So they are completely separate things that just happen to use the same term.
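For reference, the PG version is a one-shot rewrite (hypothetical names):

    -- Rewrites the heap once in index order; new rows are not kept in that order.
    CLUSTER orders USING orders_created_idx;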
>Currently, constraint exclusion is enabled by default only for cases that are often used to implement table partitioning via inheritance trees. Turning it on for all tables imposes extra planning overhead that is quite noticeable on simple queries, and most often will yield no benefit for simple queries.
PG's lack of plan caching strikes again; this sort of thing is not a concern in other DBs that reuse query plans.
PG does reuse plans, but only if you prepare a query and run it more than 5 times on that connection. See plan_cache_mode[0] and the PREPARE docs it links to. This works great on simple queries that run all the time.
It sometimes really stinks on some queries since the generic plan can't "see" the parameter values anymore. E.g. if you have an index on (customer_id, item_id) and run a query where `customer_id = $1 AND item_id = ANY($2)` ($2 is an array parameter), the generic query plan doesn't know how many elements are in the array and can decide to do an elaborate plan like a bitmap index scan instead of a nested loop join. I've seen the generic plan flip-flop in a situation like this and have a >100x load difference.
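A sketch of the workaround (hypothetical query): forcing custom plans trades planning time for letting the planner see the actual parameter values, including the array size.

    PREPARE line_items (bigint, bigint[]) AS
        SELECT * FROM items
        WHERE customer_id = $1 AND item_id = ANY($2);

    -- The generic plan can't see how big $2 is; force re-planning with real values.
    SET plan_cache_mode = force_custom_plan;
    EXECUTE line_items(42, ARRAY[1, 2, 3]::bigint[]);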
The plan cache is also per-connection, so you still have to plan a query multiple times. This is another reason why consolidating connections in PG is important.
Yes, manual query preparation by the client [1] is what you did in MSSQL Server up until v7.0, I believe, which was 1998, when it started doing automatic caching based on statement text. I believe it also cached stored procedure plans before v7.0, which is one reason they were recommended for all application code access to the database back then.
MSSQL Server also does parameter sniffing nowadays and can have multiple plans based on parameter values. It also has hints to guide or disable sniffing, because many times a generic plan is actually better; again, something PG doesn't have: hints [2].
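For example (hypothetical table), the sniffing behavior can be steered per statement:

    DECLARE @CustomerId int = 42;
    SELECT *
    FROM Orders
    WHERE CustomerId = @CustomerId
    OPTION (OPTIMIZE FOR UNKNOWN);  -- plan for a generic value instead of the sniffed one
    -- or OPTION (RECOMPILE) to re-plan with the actual values on every execution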
PG being process-based per connection instead of thread-based makes it much more difficult to share plans between connections, and it also has no plan serialization ability. MSSQL can save plans to XML; they can be loaded on other servers and "frozen" so that plan is used if desired, and they can also be loaded into plan inspection tools that way [3].
In MSSQL Server, part of the plan match is the various session/connection options; if they differ, different plans are cached.
I believe PG's plan data structure is intimately tied to process-space memory addresses, since it was never intended to be shared between processes, and it can even contain generated executable code.
This makes it difficult to share between processes without a heavy redesign, but it would be a good change IMO.
> That would be like driving from Key West to Prudhoe Bay which looks to be 91 hours.
Haha, yeah, at least I got a laugh from it, thank you :) A fair comparison then, I guess, would be from the Canary Islands to Svalbard, if we're aiming to go as far as possible to make some imaginary point no one cares about :)
There is 400 amp residential service, and you can get 80 amp, 19.2 kW level 2 chargers (80 A × 240 V = 19.2 kW).
You would need five 80 amp chargers to approach 100 kW, but the other loads in a large house add up very quickly; I have seen large HVAC systems and elaborate pools with lazy rivers, etc., which is why they had 400 amp service.
100 kW isn't really that much; a modern EV can put out 3 times that from its battery pack into the motor for short bursts, and easily sustain 100 kW until drained.
A 480 V, 200 amp, 3-phase commercial supply can provide 100 kW continuous and would be something used in a medium-sized office building.
A watt of power multiplied by a second of time has an agreed-upon name, the joule, but watt-second is also a perfectly valid SI name.
A watt is a joule of energy divided by a second of time; this is a rate. Joule per second is also a valid name, similar to how nautical mile per hour and knot are the same unit.
Multiplication vs division, quantity vs rate: see the relationship? Units may have different names but be equivalent; both the proper name and the compound name are acceptable.
A watt-hour is 3600 joules; it’s more convenient to use and matches more closely how electrical energy is typically consumed. A kilowatt-hour is again more directly relatable than 3.6 megajoules.
Newton-meter and coulomb-volt are other names for the joule. In pure base units it is a kilogram meter squared per second squared (kg·m²/s²).
So when I torque all 20 of my car's lug bolts to 120 N·m, I've exerted 2/3 of a Wh? So if it takes me 4 minutes, I'm averaging 10 watts? That's neat. I wonder what the peak wattage (right as the torque wrench clicks) would be; it must depend on angular velocity.
The newton-meter as a unit of energy is not the same as the newton-meter as a unit of torque.
In the energy unit, the meter is distance moved, while in the torque unit, the meter is the length of the moment arm.
This is confusing even though it's valid, so the newton-meter is rarely used as an energy unit.
You can exert newton-meters of torque while using no energy, say by standing on a lug nut wrench and letting gravity exert the force indefinitely unless the nut breaks loose.
Ah! I guess that explains the "f" for "force" in the imperial abbreviation "ft-lbf", to distinguish it from work. I wonder if there's ever been an analogous variant for metric such as "Nmf"...
It seems the common thread is that the f brings in standard gravity (to define pound-force), but not exactly. In my own research, the AI summaries are about as sloppy as I've ever seen, due to vague and often regional differences (with the difference between ft-lb and lb-ft sometimes being described as relevant as well).