PostgreSQL-14 was released in September 2021, and it contained many performance improvements and feature enhancements, including some features from a monitoring perspective. As we know, monitoring is the key element of any database management system, and PostgreSQL keeps updating and enhancing the monitoring capabilities. Here are some key ones in PostgreSQL-14.
Query Identifier
Query identifier is used to identify the query, which can be cross-referenced between extensions. Prior to PostgreSQL-14, extensions used an algorithm to calculate the query_id. Usually, the same algorithm is used to calculate the query_id, but any extension can use its own algorithm. Now, PostgreSQL-14 optionally provides a query_id to be computed in the core. Now PostgreSQL-14’s monitoring extensions and utilities like pg_stat_activity, explain, and in pg_stat_statments use this query_id instead of calculating its own. This query_id can be seen in csvlog, after specifying in the log_line_prefix. From a user perspective, there are two benefits of this feature.
- All the utilities/extensions will use the same query_id calculated by core, which provides an ease to cross-reference this query_id. Previously, all the utilities/extensions needed to use the same algorithm in their code to achieve this capability.
- The second benefit is extension/utilities can use calculated query_id and don’t need to again, which is a performance benefit.
PostgreSQL introduces a new GUC configuration parameter compute_query_id to enable/disable this feature. The default is auto; this can be turned on/off in postgresql.conf file, or using the SET command.
- pg_stat_activity
SET compute_query_id = off;
1 2 3 4 5 | SELECT datname, query, query_id FROM pg_stat_activity; datname | query | query_id ----------+-----------------------------------------------------------------------+---------- postgres | select datname, query, query_id from pg_stat_activity; | postgres | UPDATE pgbench_branches SET bbalance = bbalance + 2361 WHERE bid = 1; | |
SET compute_query_id = on;
1 2 3 4 5 | SELECT datname, query, query_id FROM pg_stat_activity; datname | query | query_id ----------+-----------------------------------------------------------------------+--------------------- postgres | select datname, query, query_id from pg_stat_activity; | 846165942585941982 postgres | UPDATE pgbench_tellers SET tbalance = tbalance + 3001 WHERE tid = 44; | 3354982309855590749 |
- Log
In the previous versions, there was no mechanism to compute the query_id in the server core. The query_id is especially useful in the log files. To enable that, we need to configure the log_line_prefix configuration parameter. The “%Q” option is added to show the query_id; here is the example.
1 | log_line_prefix = 'query_id = [%Q] -> ' |
1 2 3 4 5 | query_id = [0] -> LOG: statement: CREATE PROCEDURE ptestx(OUT a int) LANGUAGE SQL AS $$ INSERT INTO cp_test VALUES (1, 'a') $$; query_id = [-6788509697256188685] -> ERROR: return type mismatch in function declared to return record query_id = [-6788509697256188685] -> DETAIL: Function's final statement must be SELECT or INSERT/UPDATE/DELETE RETURNING. query_id = [-6788509697256188685] -> CONTEXT: SQL function "ptestx" query_id = [-6788509697256188685] -> STATEMENT: CREATE PROCEDURE ptestx(OUT a int) LANGUAGE SQL AS $$ INSERT INTO cp_test VALUES (1, 'a') $$; |
- Explain
The EXPLAIN VERBOSE will show the query_id if compute_query_id is true.
SET compute_query_id = off;
1 2 3 4 5 6 7 | EXPLAIN VERBOSE SELECT * FROM foo; QUERY PLAN -------------------------------------------------------------- Seq Scan on public.foo (cost=0.00..15.01 rows=1001 width=4) Output: a (2 rows) |
SET compute_query_id = on;
1 2 3 4 5 6 7 | EXPLAIN VERBOSE SELECT * FROM foo; QUERY PLAN -------------------------------------------------------------- Seq Scan on public.foo (cost=0.00..15.01 rows=1001 width=4) Output: a Query Identifier: 3480779799680626233 (3 rows) |
autovacuum and auto-analyze Logging Enhancements
PostgreSQL-14 improves the logging of auto-vacuum and auto-analyze. Now we can see the I/O timings in the log, showing how much has been spent reading and writing.
1 2 3 4 5 6 7 8 9 10 11 12 13 | automatic vacuum of table "postgres.pg_catalog.pg_depend": index scans: 1 pages: 0 removed, 67 remain, 0 skipped due to pins, 0 skipped frozen tuples: 89 removed, 8873 remain, 0 are dead but not yet removable, oldest xmin: 210871 index scan needed: 2 pages from table (2.99% of total) had 341 dead item identifiers removed index "pg_depend_depender_index": pages: 39 in total, 0 newly deleted, 0 currently deleted, 0 reusable index "pg_depend_reference_index": pages: 41 in total, 0 newly deleted, 0 currently deleted, 0 reusable I/O timings: read: 44.254 ms, write: 0.531 ms avg read rate: 13.191 MB/s, avg write rate: 8.794 MB/s buffer usage: 167 hits, 126 misses, 84 dirtied WAL usage: 85 records, 15 full page images, 78064 bytes system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.07 s |
These logs are only available if track_io_timing is enabled.
Connecting Logging
PostgreSQL already logs the connection/disconnection if log_connections/log_disconnections is on. Therefore, PostgreSQL-14 now also logs the actual username supplied by the user. In case some external authentication is used, and mapping is defined in pg_ident.conf, it will become hard to identify the actual user name. Before PostgreSQL-14, you only see the mapped user instead of the actual user.
pg_ident.conf
1 2 3 | # MAPNAME SYSTEM-USERNAME PG-USERNAME pg vagrant postgres |
pg_hba.conf
1 2 3 | # TYPE DATABASE USER ADDRESS METHOD # "local" is for Unix domain socket connections only local all all peer map=pg |
Before PostgreSQL-14
1 2 3 4 | LOG: database system was shut down at 2021-11-19 11:24:30 UTC LOG: database system is ready to accept connections LOG: connection received: host=[local] LOG: connection authorized: user=postgres database=postgres application_name=psql |
PostgreSQL-14
1 2 3 4 | LOG: database system is ready to accept connections LOG: connection received: host=[local] LOG: connection authenticated: identity="vagrant" method=peer (/usr/local/pgsql.14/bin/data/pg_hba.conf:89) LOG: connection authorized: user=postgres database=postgres application_name=psql |
Conclusion
Every major PostgreSQL release carries significant enhancements, and PostgreSQL-14 was no different.
Monitoring is a key feature of any DBMS system, and PostgreSQL keeps upgrading its capabilities to improve its logging and monitoring capabilities. With these newly added features, you have more insights into connections; one can easily track queries and observe performance, and identify how much time is being spent by the vacuum process in read/write operations. This can significantly benefit you in configuring vacuum parameters better.
No comments:
Post a Comment