pg_hint_plan -- Tweaking planner decisions with user's hists.
PostgreSQL executes queries using cost-based planner, and sometimes fails to choose most optimized execution plan for some types of queries.
In such cases, pg_hint_plan helps users to guide the planner to choose the plans they want by putting some instructions - call them hints - in the comments prefixed to the SQL statement body. No need to rewrite statement itself nor change some GUC parameters elsewhere.
Hints to planner should be scripted in a bit special comment surrounded by '/*+' and '*/' preceding target SQL statement body. Each hint is in the format of "HintName(target specs)" and separated from others by white spaces. "Target specs" is one table name for the hints for scans or table names separated with white spaces for the hints for joins or others. Set hint takes a pair of the GUC parameter name and the value to be set separated with white spaces.
In the example below, HashJoin(a b) instructs that hash join is preferable for joins covering table (designated with alias) a and b, and SeqScan(a) instructs that sequential scan is preferable for table a, and the selected plan is found to follow the order.
postgres=# /*+ postgres*# HashJoin(a b) postgres*# SeqScan(a) postgres*# */ postgres-# EXPLAIN SELECT * postgres-# FROM pgbench_branches b postgres-# JOIN pgbench_accounts a ON b.bid = a.bid postgres-# ORDER BY a.aid; QUERY PLAN --------------------------------------------------------------------------------------- Sort (cost=31465.84..31715.84 rows=100000 width=197) Sort Key: a.aid -> Hash Join (cost=1.02..4016.02 rows=100000 width=197) Hash Cond: (a.bid = b.bid) -> Seq Scan on pgbench_accounts a (cost=0.00..2640.00 rows=100000 width=97) -> Hash (cost=1.01..1.01 rows=1 width=100) -> Seq Scan on pgbench_branches b (cost=0.00..1.01 rows=1 width=100) (7 rows)
The hint for scan methods takes one table name or alias and will instruct planner to prefer the specific scan method for the table. You can use (No)SeqScan, (No)IndexScan, (No)TidScan, (No)BitmapScan.
The hint for scan methods takes one table name or alias and will instruct planner to prefer the specific scan method for the table. You can use (No)SeqScan, (No)IndexScan, (No)TidScan, (No)BitmapScan.These hints are applicable for ordinary tables including unlogged or temporary ones, and system catalogues. Other objects, for example, external tables, table functions, Views, CTE, or subquery results are not covered.
The hint for join methods takes two or more table names or alias and will instruct planner to prefer the specific method for the join operation covering all of the tables. For example, HashJoin(a b c) tells the planner to prefer hash join for both of join operations on a and the result of sub join on b and c, or on c and the result of sub join on a and b. You can use (No)NestLoop, (No)HashJoin, (No)MergeJoin.
These hints are applicable for ordinary tables including unlogged or temporary ones, system catalogues; besides, also applicable for table functions, VALUES clauses, CTEs. Views and subquery results are not covered.
Leading(a b c ..) takes a list of objects which are join hints are appliable to and suggests planner to do joins in that order. For this example, join a and b first, then join the result and c, and so on... In and out of each join operation cannot be tweaked in current version.
The Set hint has somewhat different syntax to others. It takes a pair of GUC parameter name and the value to be set, sets the value to the parameter and restore to the previous value when the planner finishes the work. So this hint has no effect on execution. The last hint in Set hints for the same GUC parameter is in effenct.
Only GUC parameter of Query Planning will have the effect. If Hint of multiple GUC parameters is specified for same GUC parameter, Hint specified in the last will be applicable.
You can use GUC of pg_hint_plan in 'Set' hint,but since it does not work as per expectation, it is recommended that not to specify. See also Restrictions.
GUC parameters to define the behaviors of pg_hint_plan are listed below.
name | description | default value |
---|---|---|
pg_hint_plan.enable_hint | Enables or disables the all function of pg_hint_plan module. | on |
pg_hint_plan.debug_print | Enables logging of pg_hint_plan's debugging informatin. The log records has the LOG LogLevel. | off |
pg_hint_plan.parse_messages | Enables logging for parse failure of hints. The log level of the records are specified by this parameter. Valid values are the all loglevels smaller than fatal and null string. |
do 'make' in the base directory of pg_hint_plan then 'make install' as the same OS user as PostgreSQL. If you installed PostgreSQL by RPM, it should be installed from postgresq-devel* package.Example of build is given below
user$ tar xzvf pg_hint_plan-1.0.0.tar.gz user$ cd pg_hint_plan-1.0.0 user$ make user$ su postgres posrgres$ make install
If you want to use pg_hint_plan only in current session, you can load it by LOAD command of PostgreSQL like the example session below. As described in PostgreSQL documentation, non-superusers can only apply LOAD to library files located in $libdir/plugins.
postgres=# LOAD 'pg_hint_plan'; LOAD
If you want to load pg_hint_plan automatically in every session, add 'pg_hint_plan' to shared_preload_libraries in postgresql.conf and restart the server.
You can see the typical setting in postgresql.conf below.
# postgresql.conf shared_preload_libraries = 'pg_hint_plan' custom_variable_classes = 'pg_hint_plan' # Only for 9.1 pg_hint_plan.parse_messages = 'debug2'
When you use PostgreSQL 9.1 with pg_hint_plan, setting custom_variable_classes is required for the class pg_hint_plan in postgresql.conf.
do 'make uninstall' in the same directory as installation and as the same user as PostgreSQL.
$ cd pg_hint_plan-1.0.0 $ su # make uninstall
In the example below, SeqScan(t1) tells that the user wants to scan t1 with sequential scan, and IndexScan(t2 t2_pkey) tells that index scan using t2_pkey is preferred to scan t2.
postgres=# /*+ postgres*# SeqScan(t1) postgres*# IndexScan(t2 t2_pkey) postgres*# */ postgres-# SELECT * FROM table1 t1 JOIN table table2 t2 ON (t1.key = t2.key); ...
NestLoop(t1 t2) tells the user wants that neted loop should be applied to the join operation covering t1 and t2, and MergeJoin(t1 t2 t3) tells merge join is preferred for the join operation covering all of the three tables t1, t2, t3, not the part of them. Leading(t1 t2 t3) tells that join operations should take place in the order even if the planner could find the cheaper plan doing joins in another order. The planner finally picks up the plan shown below.
postgres=# /*+ postgres*# NestLoop(t1 t2) postgres*# MergeJoin(t1 t2 t3) postgres*# Leading(t1 t2 t3) postgres*# */ postgres-# SELECT * FROM table1 t1 postgres-# JOIN table table2 t2 ON (t1.key = t2.key) postgres-# JOIN table table3 t3 ON (t2.key = t3.key); ...
Set(random_page_cost 2.0) tells that the planner gets 2.0 as the value of random_page_cost for planning the subsequent statements.
postgres=# /*+ postgres*# Set(random_page_cost 2.0) postgres*# */ postgres-# SELECT * FROM table1 t1 WHERE key = 'value'; ...
postgres=# /*+ postgres*# HashJoin(a b) postgres*# SeqScan(a) postgres*# */ postgres-# /*+ IndexScan(a) */ postgres-# EXPLAIN SELECT /*+ MergeJoin(a b) */ * postgres-# FROM pgbench_branches b postgres-# JOIN pgbench_accounts a ON b.bid = a.bid postgres-# ORDER BY a.aid; QUERY PLAN --------------------------------------------------------------------------------------- Sort (cost=31465.84..31715.84 rows=100000 width=197) Sort Key: a.aid -> Hash Join (cost=1.02..4016.02 rows=100000 width=197) Hash Cond: (a.bid = b.bid) -> Seq Scan on pgbench_accounts a (cost=0.00..2640.00 rows=100000 width=97) -> Hash (cost=1.01..1.01 rows=1 width=100) -> Seq Scan on pgbench_branches b (cost=0.00..1.01 rows=1 width=100) (7 rows) postgres=#
postgres=# /*+ HashJoin(t1 t1)*/ postgres-# EXPLAIN SELECT * FROM s1.t1 postgres-# JOIN public.t1 ON (s1.t1.id=public.t1.id); INFO: hint syntax error at or near "HashJoin(t1 t1)" DETAIL: Relation name "t1" is ambiguous. QUERY PLAN ------------------------------------------------------------------ Merge Join (cost=337.49..781.49 rows=28800 width=8) Merge Cond: (s1.t1.id = public.t1.id) -> Sort (cost=168.75..174.75 rows=2400 width=4) Sort Key: s1.t1.id -> Seq Scan on t1 (cost=0.00..34.00 rows=2400 width=4) -> Sort (cost=168.75..174.75 rows=2400 width=4) Sort Key: public.t1.id -> Seq Scan on t1 (cost=0.00..34.00 rows=2400 width=4) (8 行) postgres=# /*+ HashJoin(pt st) */ postgres-# EXPLAIN SELECT * FROM s1.t1 st postgres-# JOIN public.t1 pt ON (st.id=pt.id); QUERY PLAN --------------------------------------------------------------------- Hash Join (cost=64.00..1112.00 rows=28800 width=8) Hash Cond: (st.id = pt.id) -> Seq Scan on t1 st (cost=0.00..34.00 rows=2400 width=4) -> Hash (cost=34.00..34.00 rows=2400 width=4) -> Seq Scan on t1 pt (cost=0.00..34.00 rows=2400 width=4) (5 行) postgres=#
postgres=# CREATE VIEW view1 AS SELECT * FROM table1 t1; CREATE TABLE postgres=# /*+ SeqScan(t1) */ postgres=# EXPLAIN SELECT * FROM table1 t1 JOIN view1 t2 ON (t1.key = t2.key) WHERE t2.key = 1; QUERY PLAN ----------------------------------------------------------------- Nested Loop (cost=0.00..358.01 rows=1 width=16) -> Seq Scan on table1 t1 (cost=0.00..179.00 rows=1 width=8) Filter: (key = 1) -> Seq Scan on table1 t1 (cost=0.00..179.00 rows=1 width=8) Filter: (key = 1) (5 rows) postgres=# /*+ SeqScan(t3) */ postgres=# EXPLAIN SELECT * FROM table1 t3 JOIN view1 t2 ON (t1.key = t2.key) WHERE t2.key = 1; QUERY PLAN -------------------------------------------------------------------------------- Nested Loop (cost=0.00..187.29 rows=1 width=16) -> Seq Scan on table1 t3 (cost=0.00..179.00 rows=1 width=8) Filter: (key = 1) -> Index Scan using foo_pkey on table1 t1 (cost=0.00..8.28 rows=1 width=8) Index Cond: (key = 1) (5 rows)
postgres=# /*+ MergeJoin(a *VALUES*) */ postgres-# EXPLAIN SELECT * postgres-# FROM pgbench_accounts a postgres-# JOIN (VALUES (1,1),(2,2)) v (vid, vbalance) ON a.aid = v.vid postgres-# ORDER BY a.aid; QUERY PLAN --------------------------------------------------------------------------------------------------------------- Merge Join (cost=0.04..4497.33 rows=2 width=105) Merge Cond: (a.aid = "*VALUES*".column1) -> Index Scan using pgbench_accounts_pkey on pgbench_accounts a (cost=0.00..4247.26 rows=100000 width=97) -> Sort (cost=0.04..0.04 rows=2 width=8) Sort Key: "*VALUES*".column1 -> Values Scan on "*VALUES*" (cost=0.00..0.03 rows=2 width=8) (6 rows) postgres=#