Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
Looking Forward to Postgres 19: Query Hints (pgedge.com)
219 points by jjgreen 1 day ago | hide | past | favorite | 42 comments
 help



Its also worth reading the original post by Robert Haas (the author of pg_plan_advice) on motivation/design: https://rhaas.blogspot.com/2026/03/pgplanadvice-plan-stabili...

Also, I'll add my perspective: I think "EXPLAIN (PLAN_ADVICE)" is a key piece to making this a plan stability feature, not (just) a hinting feature. The extensibility/framework pg_plan_advice adds is a foundation, that over time will over time address the age-old "Postgres doesn't have hints" problem, even if the initial release doesn't check all the boxes yet, e.g. no way to use advice for adjusting row/join estimates.

To give an example on extensibility: Some people that I've spoken to are asking "but why is it not a comment-style hint". There are reasons why Postgres didn't go that way for this release (comment parsing in core is non-existent today, and comments don't work correctly e.g. for functions), but its easy to write an extension that sets up an advisor hook to parse comments: https://github.com/pganalyze/pg_advice_comment


At a past job, I managed the DBA team that owned the books and records database for a major hedge fund.

Some quick stories about databases in general and type hints in particular:

1. Some of the joins involved 10+ tables AND had query hints

That sounds cool until you realize that the queries were designed for the planner from 3 versions ago

2. Was in multiple conversations with head of the DBA team and application owners that went something like this:

Developer: "This query hasn't changed in YEARS! Why is it not performing well now??"

Me: "Have you considered that maybe the data cardinality has changed?" and then I or the lead DBA would explain how database indices worked.

I mention the above only as a warning to folks newer to DBA land that some of these features can seem great but can also be a crutch. What should have happened is that the schema, indices or application code should have been reviewed by the app teams (with the DBAs) periodically to ensure assumptions were correct.

(This was several years ago so I'm curious what kind of impact the LLMs have had on this)


LLMs are an interesting call out. I've been trying with the idea of a query optimizer agent that analyzes PG slow query/auto explain logs.

I've only tried Claude Opus but it does a pretty good job interpreting the plan. It's also really powerful being able to bring in telemetry context and code to help make a more balanced tradeoff with usage metrics (much easier to callout anomolies, edge cases, and non performance sensitive contexts)

That said fixing the queries which usually originate from Django ORM is a bit hit or miss. Been multiple times the LLM wants to rewrite to something similar but not identical or wants to make large, structural changes instead of focusing on low hanging surgical improvements

Once it was even able to pull PG mailing lists to figure out I was hitting a planner predicate comparison limitation where switching from "bool = false" to "not bool" or maybe the other way around led to a massive 100x+ improvement (the planner was skipping using an index because the predicate in the query didn't "match" the predicate in the partial index despite the conditions being logically equivalent)

Would have taken days or weeks for me to figure that out on my own


You can tell the LLM to optimize the Django ORM. With ORMs, large structural changes are sometimes needed for small query text changes due to how ORMs generate the SQL.

I’m a SWE and at my previous job it fell to me to optimize some queries that used EF Core ORM. I gave opus a local db with a small export of anonymized prod data, had it generate a billion rows with similar cardinality.

Then I told opus how to get the raw SQL and told it to write unit tests with various optimizations.

It got the query down from 5+ minutes to a few seconds. I verified the final SQL and ORM structure.

All it needed was to modify some indices and fix up the ORM to properly generate the lateral join.

However, to get the join working properly required a full restructure of the ORM query.


>You can tell the LLM to optimize the Django ORM

I have and have come to the conclusion it's too risky. On too many attempts it hallucinates correctness and subtly breaks edge cases

>large structural changes are sometimes needed for small query text changes due to how ORMs generate the SQL

Sometimes that's true but a lot of times you can get 90% the benefit with significantly lower risk with a much smaller targeted changes.

It can check correctness against real data but I've still hit problems where you can just look at the rewrite and identify behavior differences that don't necessarily show up in 1 off execution comparison unless you brute force your way through tons of test data.

A lot of the cases I'm running into are somewhat older code (5+ years) with poor test coverage that few people understand or remember. Some of these probably have legitimate bugs but it's risky assuming and safer to ensure an optimization is identical.


Having a good DB monitoring setup can also help a lot. We (over)pay for datadog's Database monitoring APM, but it is decent in letting you see how queries are run and change over time. It also checks query plans and lets you know when too many sequential scans are being performed, as well as indexes that are no longer used (in large DBs removing them can alleviate a lot of write IOPs).

It's not perfect and still requires thought and understanding (it'll recommend removing unused indexes on the primary keys, for example), but SOMETHING needs to continuously monitor anything but the smallest, simplest relational DBs.


Yeah slowly building up to that, we noped out of DD APM because leadership does not like seeing the $ amount of the bill (despite it being a small fraction of AWS and consisting of 10s of services in 1 platform)

> How many of us have toggled enable_seqscan to off to force an index scan? Or thrown an OFFSET 0 into a subquery to prevent the planner from flattening it?

enable_nestloop = off here.

For us, joining many complex views quickly trips the planner up, so I'm really glad to see this.

> They break on upgrades.

The irony is so does the planner. I've seen queries working perfectly fine in older PG's suddenly run away in newer versions. So hints will actually bring stability.


The planner breaking on updates is common for almost all RDBMSs. They introduce optimizations that work great for 95% of customers, and some will just have queries that now act like cardinality is way off or covering indexes are missing.

This issue was one of AWS's listed reasons for tending to prefer NoSQL style databases over "more performant" RDBMS, because of the more consistent worst-case performance, even if the result is worse average-case performance, which was important in their assumptions for scalability planning.

Every single time I’ve thought I’ve needed to try these it made it worse.

Every time Claude tries to tell me to try these, it made it worse.

Not once has it made it better for me. I’m doing materialized view refreshes with a billion rows, which is small enough maybe that this doesn’t come into play…but so far the planner knows best.

If the database can’t make it fast with just smart joins and filtering then it’s the architecture that’s a problem, not the database.

Usually the only thing I need to do is increase work_mem.


The hints are needed when you have a web scale / online transaction processing application, where you want to guarantee no table scan ever. Ideally, I'd want to rrun without statistics on the tables in that case (ie, no smart joins).

> where you want to guarantee no table scan ever.

If hints are what they say they are, they cannot guarantee anything.

And they indeed are hints. FTA: “The documentation is explicit: advice "can only produce plans the core planner considers viable." Advice only nudges the planner toward one it already considered.”


honestly I dont want plan hinting so much as being able to ban table-scanning / nested loops / etc on specific tables, and be able to set that independent of actual table-size so I can coerce its failure in dev environments

+1 I agree... (except nested loops are rarely the problem, they are often the best/simplest alternative for join).

“I’ve never run into it so it can’t be a problem for other people” is an interesting take.

Shudder. Flashbacks to having to write optimiser hints in Oracle (and the resulting fun times when you'd upgrade the database, something would change, and your hints would make a query slower).

As opposed to not upgrading and the planner picking a plan at 3am that is slower?

I can't believe we're still doing this. You should be able to dictate the query execution without involving a planner.

I really don’t think you’re considering all the edge cases in such a scenario.

Most devs have problems writing decent queries (in some situations), now you want to introduce writing the query execution plan into the mix ?


No one is taking away the existing automatic planning that works well 95% of the time. You're welcome to continue using that.

The worst thing that could possibly happen is that you give it bad advice leading to slow queries, and then the obvious first step to fixing that is to drop the manual advice and see whether the automatic planner handles it better.

It baffles me that PostgreSQL, which is so deeply customisable in almost every other way, resisted this form of customisability for so long. This is great news.


I'm against this because hints should be a last resort in declarative programming languages like SQL.

Our productivity is proportional to our ability to recruit abstractions. The more we deal with pure concepts like relational algebra and data-driven development, the more bang we get for our buck.

If we get lost in the weeds having to worry about doing the planner's job, it's like we're paying a tax that doesn't need to exist.

This is why the syntactic sugar of Ruby, the async design patterns of JavaScript, the footgun avoidance best practices of C++, even the impure workarounds of functional languages, (all meant to improve developer and/or execution performance) don't really do it for me. They hint at avoidance of deeper understanding. Once we learn higher abstractions like copy-on-write, compare-and-swap, higher-order methods, etc, we start to see that languages pass the cost of their externalities on to us.

I'd prefer that Postgres move the opposite direction. For example, databases need a universal index that turns as many operations as possible into O(1) at the cost of memory, since resource prices tend to always fall on a long enough timescale. Stuff that works more like a content-addressable memory for ludicrous scaling. In other words, whatever it takes to make planner hints obsolete, is what Postgres maintainers should be putting their efforts into.

I guess a stopgap might be to have an automated way to profile an app during testing and generate planner hints for the main use cases. Or maybe be able to cache them to avoid cold start latency. But if my work ever requires me to deal with them directly, I'll be treating that as a code smell.

-

After writing this out, I realized that performance is an orthogonal concern to conceptual correctness. So a more appropriate phrasing might be that the planner is none of SQL's business. So technically, anyone's opinion on it is valid. In which case, we should choose the path of kindness. If allowing access to the planner saves someone's bacon, than we should allow it. But work to alleviate whatever pain necessitated its use in the first place.


I'm not an expert in database hints, but the syntax looks very readable and composable. That's great thing to have got right.

> The advice language is surprisingly expressive for something the community resisted for decades.

FINALLY!

I like this design.

And yes, the community resisted this for way too long.


Sounds like they were waiting for a good design.

The rejections over the years that I saw were not about design.

Imagine you have a SAAS app. Microsoft and Apple are customers. You have a table of devices. Type is either windows or osx. How does postgres know how to handle the query devices, where type = OSX properly? How does it know that this matches ~0 or ~100 of rows depending on the customer?

This is the main thing the planner doesn't handle well. Postgres was built before SAAS was as big. You have different distributions per customer, and thousands of customers. In most cases the query planner will guess right, but sometimes it will fail and scan millions of rows.


You can manually add more advanced statistics in PostgreSQL. That includes statistics over multiple columns.

https://www.postgresql.org/docs/current/planner-stats.html#P...


Give the customer what they want, even if it sucks to do so. The alternative it be cast into irrelevance over time. You can run an OSS project however you want, but you can't avoid the consequences of doing so.

Principles driven development (we will never or always do X regardless of context) typically comes off as a petty ego trip. The point of the technology is to serve some kind of downstream business. Most people who download Postgres are seeking to solve a real world problem, not to demonstrate their ideological purity.


What are you talking about? This has nothing to do with the article, which describes a query plan advice provider that solves lots of problems with typical hint systems, and does so in a manner good enough to be worthy of including in core, rather than throwing in a half-baked broken clone of Oracle because people are scared to install an extension.

> A Brief History of "Never"

> The Postgres community's position on query hints has been, shall we say, firm. The official wiki page on the subject states it plainly: "We are not interested in implementing hints in the exact ways they are commonly implemented on other databases. Proposals based on "because they've got them" will not be welcomed."

Are we reading the same article?


FINALLY!

I never understood the issues PG had with hints. Running a non-trivial DB with a non-trivial schema and scale is, well, non-trivial at all. At some point the DB stops being a black box and starts being a tool that you have to know inside and out to avoid performance issues.

>The optimizer is usually smarter than you think.

Except for when it isn't, and moves heavy calculation inside a nested loop inside a nested loop to avoid an index scan. Nothing is perfect.


> > The optimizer is usually smarter than you think.

> Except for when it isn't, and moves heavy calculation inside a nested loop inside a nested loop to avoid an index scan. Nothing is perfect.

Yeah that's also been my experience. It's true that Postgres is usually smarter than I think, when I try to figure out why it's not using a better query plan I eventually find out that it wouldn't be better at all. But from time to time, it genuinely is taking a bad decision and having no power over that at all is a problem


Yeah, this article explains the reasons why, and how this solution avoids them.

I have a naive question: why did this take 15 years? I understand that good APIs need time and thoughtful design, but I struggle to understand why we couldn’t get to the same (or better) solution faster.

It was ideological goal: they wanted to build query optimizer which would always be smarter than human, but looks like they admitted defeat after many years of stubbornness.

Very interesting - I just installed pg_hint_plan [0] extension a few months ago to get around a query that was confusing the planner too much. Edge case, but when you need it you really need it.

Haven't seen pg_plan_advice before, TIL!


man, Tom Lane has hated query hints for literally decades

did he finally come around?


I don't think Tom's perspective has necessarily changed (and there is certainly concern from others that this could cause less reports on planner bugs), but Tom is pretty good about not standing in the way of others (i.e. Robert Haas in this case) trying to make things work, and being open to new perspectives.

I do know that one of the important criteria for getting this in was that a bad advice can't cause the planner to fail, and that's something that was explicitly included in the design of pg_plan_advice.


OT: This reminded me of OrioleDB, what happened to it?

We keep going. Planning GA till the end of this year. Only a couple of missing features left. The main focus is stability.

Thanks for the update. Since there wasn't any new blog post I was a little worried.



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: