Identifying poorly indexed tables in Postgres
Out of the box the Postgres statistics collector gives us a variety of useful views we can query to understand the usage and performance of a database. This is a huge area, so in this post I'm just going to look at one very constrained topic.
Sequential scans are fine on small tables, or if we perform them infrequently, but a lot of sequential scans on large tables is usually a sign that we need to create an index. Fortunately Postgres collects some data in `pg_stat_user_tables` that can help us identify this:
- Looking at `seq_scan` and `idx_scan` can show us tables where we are frequently performing sequential scans rather than index scans
- Looking at `seq_scan` and `seq_tup_read` can show us tables where sequential scans typically process a large number of rows
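For example, to eyeball the raw counters for a single table, we can query the view directly. This is a minimal sketch; `orders` is a hypothetical table name, not anything specific:

```sql
-- Raw counters for one table ('orders' is a hypothetical example name).
-- All of these counters are cumulative since the statistics were last reset.
SELECT relname, seq_scan, seq_tup_read, idx_scan
FROM pg_stat_user_tables
WHERE relname = 'orders';
```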
Running this query:
```sql
SELECT
    relname,
    idx_scan,
    seq_scan,
    seq_tup_read,
    seq_tup_read / seq_scan AS average_tuples_per_scan,
    -- idx_scan is NULL for tables with no indexes, so coalesce it to 0
    -- to avoid a NULL percentage for exactly the tables we care about
    (seq_scan / (seq_scan + COALESCE(idx_scan, 0))::FLOAT) * 100 AS percent_sequential_scans
FROM pg_stat_user_tables
WHERE seq_scan > 0
ORDER BY average_tuples_per_scan DESC;
```
This should provide a pretty clear picture of the tables that need optimisation.
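Once a table shows up near the top of that list, the usual fix is an index on whichever column those sequential scans are filtering on. As a rough sketch, assuming a hypothetical `orders` table that is frequently filtered by `customer_id`:

```sql
-- Hypothetical example: index the column the frequent scans filter on.
CREATE INDEX idx_orders_customer_id ON orders (customer_id);

-- Check that the planner now chooses an index scan instead of a sequential scan.
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

-- Optionally start the counters from scratch so the view reflects behaviour
-- after the change. Note: this resets ALL statistics counters for the database.
SELECT pg_stat_reset();
```

Because the counters in `pg_stat_user_tables` are cumulative, old sequential scans will keep dominating the numbers until the statistics are reset, which is worth keeping in mind when re-checking a table after adding an index.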