Statistics Revolution and Performance Polish
Today brings major enhancements to PostgreSQL's statistics system with new extended stats restoration functions, plus some solid performance optimizations and code cleanup. Michael Paquier and team deliver powerful new tools for database migrations, while David Rowley squeezes out extra performance from sequential scans.
Duration: PT4M35S
Transcript
Hey there, PostgreSQL enthusiasts! Welcome back to another episode of the PostgreSQL podcast. I'm your host, and wow, do we have some exciting developments to dive into today - January 26th, 2026. Grab your favorite beverage because we're about to explore some really cool improvements that landed in the codebase.
Let me start with the star of today's show - a fantastic new feature that's going to make database administrators everywhere smile. Michael Paquier, working with Corey Huinker and reviewed by Chao Li, has introduced something called `pg_restore_extended_stats()`. Now, I know that might sound a bit technical, but here's why this is actually huge.
Think about this scenario: you're upgrading your PostgreSQL cluster or doing a major dump and restore operation. Traditionally, after you restore your data, you'd have to run ANALYZE to rebuild all those important statistics that help PostgreSQL make smart decisions about how to execute your queries. That takes time - sometimes a lot of time on large databases.
Well, this new function changes that game completely. It's like having a time machine for your database statistics. You can now restore extended statistics directly - starting with n_distinct values, which tell PostgreSQL how many distinct combinations of values show up across a group of columns. The beauty is that this is just the beginning. The infrastructure is there for more kinds of extended statistics, like dependencies, in future commits.
What I love about this implementation is how thoughtful it is. The function requires proper permissions - you need to be the database owner or have MAINTAIN privileges on the table. It even takes the right locks to ensure data integrity. These aren't just features thrown over the wall; this is production-ready, enterprise-grade functionality.
And they didn't just ship this and hope for the best. There are comprehensive tests covering multirange types, permission checking, and various edge cases. Michael even caught and fixed a small initialization bug in a follow-up commit - that's the kind of attention to detail that makes PostgreSQL rock-solid.
Now, let's talk performance because David Rowley has been busy making things faster. There's a really interesting fix around window functions that caught my eye. The system was doing some deduplication logic that seemed helpful but was actually causing cost estimation problems. Sometimes the best optimization is removing unnecessary complexity, and that's exactly what happened here.
But the performance win I'm most excited about is David's work on sequential scans. He noticed that two critical functions, SeqNext and SeqRecheck, weren't being inlined by the compiler as intended. By marking them with `pg_attribute_always_inline`, he's forcing the compiler to inline them, and the results speak for themselves - a 3.9% speedup on sequential scans in his tests. That might not sound massive, but when you're dealing with millions of rows, every percentage point matters.
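For anyone following along in the transcript, here's a rough sketch of what forcing inlining looks like in C. This is not PostgreSQL's actual code: `demo_always_inline` is a hypothetical stand-in for `pg_attribute_always_inline`, and `DemoScan`, `demo_next`, `demo_recheck`, and `demo_sum_even` are made-up names that only imitate the shape of a scan loop calling two small per-row helpers.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a "force inline" macro. The attribute spellings
 * are the usual GCC/Clang and MSVC ones; other compilers fall back to a
 * plain inline hint.
 */
#if defined(__GNUC__) || defined(__clang__)
#define demo_always_inline __attribute__((always_inline)) inline
#elif defined(_MSC_VER)
#define demo_always_inline __forceinline
#else
#define demo_always_inline inline
#endif

/* A toy "scan" over an in-memory array of rows (illustrative only). */
typedef struct DemoScan
{
    const int  *rows;
    int         nrows;
    int         pos;
} DemoScan;

/* Fetch the next row into *out; return 0 once the scan is exhausted. */
static demo_always_inline int
demo_next(DemoScan *scan, int *out)
{
    if (scan->pos >= scan->nrows)
        return 0;
    *out = scan->rows[scan->pos++];
    return 1;
}

/* Per-row check; here it simply keeps even values. */
static demo_always_inline int
demo_recheck(int value)
{
    return value % 2 == 0;
}

/*
 * The hot loop. Because both helpers are marked always-inline, the compiler
 * expands them here instead of emitting a function call for every row.
 */
static long
demo_sum_even(DemoScan *scan)
{
    long        total = 0;
    int         value;

    assert(scan != NULL);
    while (demo_next(scan, &value))
    {
        if (demo_recheck(value))
            total += value;
    }
    return total;
}
```

Run `demo_sum_even` over a scan of the values 1 through 6 and you get 12 back. The result is identical with or without the attribute; the only thing that changes is whether each row pays for a function call, which is where a small gain on a tight loop would come from.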
There's also some nice housekeeping happening. Peter Eisentraut cleaned up some compiler compatibility issues with older g++ versions, and the PG_MMAP_FLAGS macro was removed because it was only used in one specific place and was causing confusion. These kinds of cleanups might not be glamorous, but they make the codebase more maintainable for everyone.
And I have to give props to the expanded test coverage for STORAGE clauses and TOAST interactions. Michael Paquier noticed some gaps in testing around how different storage strategies work with TOAST tables, and now we have much better coverage. This is exactly the kind of proactive quality work that prevents bugs from making it into production.
Today's focus for all of us should be thinking about our own database maintenance strategies. If you're planning any major upgrades or migrations, keep an eye on when this extended statistics restoration functionality becomes available in a release near you. It could save you significant downtime.
For developers, take a page from today's commits - comprehensive testing, careful permission handling, and performance measurement should be part of every feature we build.
That's a wrap for today's episode! The PostgreSQL community continues to amaze me with the thoughtfulness and quality of these improvements. Until next time, keep coding, keep optimizing, and remember - every commit is a step toward making databases better for everyone. Catch you tomorrow!