pg_hint_plan 1.0.0


  1. Name
  2. Synopsis
  3. Description
  4. Install
  5. Uninstall
  6. Examples
  7. Restrictions
  8. Requirements
  9. See Also
  10. Appendix A. Hints list

name

pg_hint_plan -- Tweaking planner decisions with user's hists.

Synopsis

PostgreSQL executes queries using cost-based planner, and sometimes fails to choose most optimized execution plan for some types of queries.

In such cases, pg_hint_plan helps users to guide the planner to choose the plans they want by putting some instructions - call them hints - in the comments prefixed to the SQL statement body. No need to rewrite statement itself nor change some GUC parameters elsewhere.

Description

General syntax

Hints to planner should be scripted in a bit special comment surrounded by '/*+' and '*/' preceding target SQL statement body. Each hint is in the format of "HintName(target specs)" and separated from others by white spaces. "Target specs" is one table name for the hints for scans or table names separated with white spaces for the hints for joins or others. Set hint takes a pair of the GUC parameter name and the value to be set separated with white spaces.

In the example below, HashJoin(a b) instructs that hash join is preferable for joins covering table (designated with alias) a and b, and SeqScan(a) instructs that sequential scan is preferable for table a, and the selected plan is found to follow the order.

postgres=# /*+
postgres*#    HashJoin(a b)
postgres*#    SeqScan(a)
postgres*#  */
postgres-# EXPLAIN SELECT *
postgres-#    FROM pgbench_branches b
postgres-#    JOIN pgbench_accounts a ON b.bid = a.bid
postgres-#   ORDER BY a.aid;
                                      QUERY PLAN
---------------------------------------------------------------------------------------
 Sort  (cost=31465.84..31715.84 rows=100000 width=197)
   Sort Key: a.aid
   ->  Hash Join  (cost=1.02..4016.02 rows=100000 width=197)
         Hash Cond: (a.bid = b.bid)
         ->  Seq Scan on pgbench_accounts a  (cost=0.00..2640.00 rows=100000 width=97)
         ->  Hash  (cost=1.01..1.01 rows=1 width=100)
               ->  Seq Scan on pgbench_branches b  (cost=0.00..1.01 rows=1 width=100)
(7 rows)

Tweakable operations

Pg_hint_plan can tweak planner for table scan methods, join methods, join orders, and temprary GUC setting during the planner is running. For specific Hint of each group, see Hint list

Tweaking scan method

The hint for scan methods takes one table name or alias and will instruct planner to prefer the specific scan method for the table. You can use (No)SeqScan, (No)IndexScan, (No)TidScan, (No)BitmapScan.

The hint for scan methods takes one table name or alias and will instruct planner to prefer the specific scan method for the table. You can use (No)SeqScan, (No)IndexScan, (No)TidScan, (No)BitmapScan.

These hints are applicable for ordinary tables including unlogged or temporary ones, and system catalogues. Other objects, for example, external tables, table functions, Views, CTE, or subquery results are not covered.

Tweaking join method

The hint for join methods takes two or more table names or alias and will instruct planner to prefer the specific method for the join operation covering all of the tables. For example, HashJoin(a b c) tells the planner to prefer hash join for both of join operations on a and the result of sub join on b and c, or on c and the result of sub join on a and b. You can use (No)NestLoop, (No)HashJoin, (No)MergeJoin.

These hints are applicable for ordinary tables including unlogged or temporary ones, system catalogues; besides, also applicable for table functions, VALUES clauses, CTEs. Views and subquery results are not covered.

Tweaking join Order

Leading(a b c ..) takes a list of objects which are join hints are appliable to and suggests planner to do joins in that order. For this example, join a and b first, then join the result and c, and so on... In and out of each join operation cannot be tweaked in current version.

Temporary GUC setting

The Set hint has somewhat different syntax to others. It takes a pair of GUC parameter name and the value to be set, sets the value to the parameter and restore to the previous value when the planner finishes the work. So this hint has no effect on execution. The last hint in Set hints for the same GUC parameter is in effenct.

Only GUC parameter of Query Planning will have the effect. If Hint of multiple GUC parameters is specified for same GUC parameter, Hint specified in the last will be applicable.

You can use GUC of pg_hint_plan in 'Set' hint,but since it does not work as per expectation, it is recommended that not to specify. See also Restrictions.

GUC parameters for pg_hint_plan

GUC parameters to define the behaviors of pg_hint_plan are listed below.

namedescriptiondefault value
pg_hint_plan.enable_hint Enables or disables the all function of pg_hint_plan module.on
pg_hint_plan.debug_print Enables logging of pg_hint_plan's debugging informatin. The log records has the LOG LogLevel. off
pg_hint_plan.parse_messages Enables logging for parse failure of hints. The log level of the records are specified by this parameter. Valid values are the all loglevels smaller than fatal and null string.

Installation

Build

do 'make' in the base directory of pg_hint_plan then 'make install' as the same OS user as PostgreSQL. If you installed PostgreSQL by RPM, it should be installed from postgresq-devel* package.Example of build is given below

user$ tar xzvf pg_hint_plan-1.0.0.tar.gz
user$ cd pg_hint_plan-1.0.0
user$ make
user$ su postgres
posrgres$ make install

Loading pg_hint_plan

If you want to use pg_hint_plan only in current session, you can load it by LOAD command of PostgreSQL like the example session below. As described in PostgreSQL documentation, non-superusers can only apply LOAD to library files located in $libdir/plugins.

postgres=# LOAD 'pg_hint_plan';
LOAD

If you want to load pg_hint_plan automatically in every session, add 'pg_hint_plan' to shared_preload_libraries in postgresql.conf and restart the server.

You can see the typical setting in postgresql.conf below.

# postgresql.conf
shared_preload_libraries = 'pg_hint_plan'

custom_variable_classes = 'pg_hint_plan'    # Only for 9.1
pg_hint_plan.parse_messages = 'debug2'

When you use PostgreSQL 9.1 with pg_hint_plan, setting custom_variable_classes is required for the class pg_hint_plan in postgresql.conf.

Unistall

do 'make uninstall' in the same directory as installation and as the same user as PostgreSQL.

$ cd pg_hint_plan-1.0.0
$ su
# make uninstall

Examples

Tweaking scan methods.

In the example below, SeqScan(t1) tells that the user wants to scan t1 with sequential scan, and IndexScan(t2 t2_pkey) tells that index scan using t2_pkey is preferred to scan t2.

postgres=# /*+
postgres*#     SeqScan(t1)
postgres*#     IndexScan(t2 t2_pkey)
postgres*#  */
postgres-# SELECT * FROM table1 t1 JOIN table table2 t2 ON (t1.key = t2.key);
...

Tweaking join method and joining order.

NestLoop(t1 t2) tells the user wants that neted loop should be applied to the join operation covering t1 and t2, and MergeJoin(t1 t2 t3) tells merge join is preferred for the join operation covering all of the three tables t1, t2, t3, not the part of them. Leading(t1 t2 t3) tells that join operations should take place in the order even if the planner could find the cheaper plan doing joins in another order. The planner finally picks up the plan shown below.

postgres=# /*+
postgres*#     NestLoop(t1 t2)
postgres*#     MergeJoin(t1 t2 t3)
postgres*#     Leading(t1 t2 t3)
postgres*#  */
postgres-# SELECT * FROM table1 t1
postgres-#     JOIN table table2 t2 ON (t1.key = t2.key)
postgres-#     JOIN table table3 t3 ON (t2.key = t3.key);
...

Temporary GUC setting

Set(random_page_cost 2.0) tells that the planner gets 2.0 as the value of random_page_cost for planning the subsequent statements.

postgres=# /*+
postgres*#     Set(random_page_cost 2.0)
postgres*#  */
postgres-# SELECT * FROM table1 t1 WHERE key = 'value';
...

Restrictions

This section describes the miscellaneous restrictions on scripting hints.

Position of the hints
pg_hint_plan reads only the first block commnet in the query string. So no hint written after that has any effect for pg_hint_plan. In the following example, HashJoin and SeqScan is in effect but IndexScan is not.

postgres=# /*+
postgres*#    HashJoin(a b)
postgres*#    SeqScan(a)
postgres*#  */
postgres-# /*+ IndexScan(a) */
postgres-# EXPLAIN SELECT /*+ MergeJoin(a b) */ *
postgres-#    FROM pgbench_branches b
postgres-#    JOIN pgbench_accounts a ON b.bid = a.bid
postgres-#   ORDER BY a.aid;
                                      QUERY PLAN
---------------------------------------------------------------------------------------
 Sort  (cost=31465.84..31715.84 rows=100000 width=197)
   Sort Key: a.aid
   ->  Hash Join  (cost=1.02..4016.02 rows=100000 width=197)
         Hash Cond: (a.bid = b.bid)
         ->  Seq Scan on pgbench_accounts a  (cost=0.00..2640.00 rows=100000 width=97)
         ->  Hash  (cost=1.01..1.01 rows=1 width=100)
               ->  Seq Scan on pgbench_branches b  (cost=0.00..1.01 rows=1 width=100)
(7 rows)

postgres=# 
Object names in quotes
Following the PostgreSQL's lexical structure, names can be quoted in order to contain some special characters in pg_hint_plan.
Distinguishing multiple tables with same name.
pg_hint_plan identifies objects by only its names, not the qualified names. So pg_hint_plan can not distinguish between tables with the same name in multiple schemas. In these cases, aliasing the ambiguous tables should help. In the first example below, pg_hint_plan has found multiple candidates for the given name 't1' so logs that and the hint is ignored. In the next example, pg_hint_plan could distinguish the two tables since they are given as diferrent alias names.

postgres=# /*+ HashJoin(t1 t1)*/
postgres-# EXPLAIN SELECT * FROM s1.t1
postgres-# JOIN public.t1 ON (s1.t1.id=public.t1.id);
INFO:  hint syntax error at or near "HashJoin(t1 t1)"
DETAIL:  Relation name "t1" is ambiguous.
                            QUERY PLAN
------------------------------------------------------------------
 Merge Join  (cost=337.49..781.49 rows=28800 width=8)
   Merge Cond: (s1.t1.id = public.t1.id)
   ->  Sort  (cost=168.75..174.75 rows=2400 width=4)
         Sort Key: s1.t1.id
         ->  Seq Scan on t1  (cost=0.00..34.00 rows=2400 width=4)
   ->  Sort  (cost=168.75..174.75 rows=2400 width=4)
         Sort Key: public.t1.id
         ->  Seq Scan on t1  (cost=0.00..34.00 rows=2400 width=4)
(8 行)

postgres=# /*+ HashJoin(pt st) */
postgres-# EXPLAIN SELECT * FROM s1.t1 st
postgres-# JOIN public.t1 pt ON (st.id=pt.id);
                             QUERY PLAN
---------------------------------------------------------------------
 Hash Join  (cost=64.00..1112.00 rows=28800 width=8)
   Hash Cond: (st.id = pt.id)
   ->  Seq Scan on t1 st  (cost=0.00..34.00 rows=2400 width=4)
   ->  Hash  (cost=34.00..34.00 rows=2400 width=4)
         ->  Seq Scan on t1 pt  (cost=0.00..34.00 rows=2400 width=4)
(5 行)

postgres=#

Limitations of VALUES Expressions in FROM
Specify "*VALUES*" as a object name to tweak any method on a result of VALUES Expression in FROM clause. Because alias name for VALUES result is substituted to "*VALUES*" by PostgreSQL, So when using multiple VALUES, target of the hint cannot be specified ,so execution plan cannot be controlled.

Specifying Target objects of Hint

Tables not clarified in query as that in views or in SQL functions
Since pg_hint_plan identifies objects by the names or aliases the planner sees, the names in views or quiries in SQL functions appear in the target query are also valid as identifiers. So, within the whole extent of the query where the planner sees at once, the given name is regarded as ambiguous when it refers diferrent objects in the portions implicitly included into the query, say, views or SQL functions. Conversely, multiple views with the same logical definition with the diferrent alias sets given can be applied diferrent hints using the aliases.
In the first of the following examples, alias 't1' is refers both table1 in view1 and same table1 in the main query. So t1 is not ambiguous and SeqScan(t1) was applied to both the view and the main query. In the second expample, table1 in the main query aliased as 't3' which is different to 't1' in the view. So SeqScan(t3) is applied only to the table1 in main query.
postgres=# CREATE VIEW view1 AS SELECT * FROM table1 t1;
CREATE TABLE
postgres=# /*+ SeqScan(t1) */
postgres=# EXPLAIN SELECT * FROM table1 t1 JOIN view1 t2 ON (t1.key = t2.key) WHERE t2.key = 1;
                           QUERY PLAN
-----------------------------------------------------------------
 Nested Loop  (cost=0.00..358.01 rows=1 width=16)
   ->  Seq Scan on table1 t1  (cost=0.00..179.00 rows=1 width=8)
         Filter: (key = 1)
   ->  Seq Scan on table1 t1  (cost=0.00..179.00 rows=1 width=8)
         Filter: (key = 1)
(5 rows)

postgres=# /*+ SeqScan(t3) */
postgres=# EXPLAIN SELECT * FROM table1 t3 JOIN view1 t2 ON (t1.key = t2.key) WHERE t2.key = 1;
                                   QUERY PLAN
--------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..187.29 rows=1 width=16)
   ->  Seq Scan on table1 t3  (cost=0.00..179.00 rows=1 width=8)
         Filter: (key = 1)
   ->  Index Scan using foo_pkey on table1 t1  (cost=0.00..8.28 rows=1 width=8)
         Index Cond: (key = 1)
(5 rows)

Applying hints to child tables.
pg_hint_plan is conscious of inheritances. The hints about scans on the parent table is also applied automatically on the childs if it is applicable.Now , It's only for table-name.
Hint scope for RULE
The statements triggered by RULEs share the same hints with the triggering query.
Hint scope for multi statement
Each queries in multi statement share the same hints in the first block commnet before the first queriy. Hint notations in other places are ignored.
Applicable scope of hint in multi statement
When query is implemented in multi statements, hint specified in the block comment at the beginning is used in all queries. Hint specified from 2nd query onwards is ignored. It is executed in multi statement when multiple queries is specified in  c option in psql command
IndexOnlyScan hint (PostgreSQL 9.2 onwards)
IndexOnlyScan for the tables which has multiple indexes and without explicit index specification sometimes failes to force index only scan since the planner decided to use another index on which index only scan cannot be performed. Explicit index specification could help the case
NoIndexScan on PostgreSQL 9.2 or the newer.
NoIndexScan inhibits both index scan and index only scan on PostgrteSQL 9.2 or the newer.

Handling parse errors of hints

Syntax error
pg_hint_plan immediately stops hint parsing on parse error and then runs the query body applying the hints successfully parsed so far. The details of the error is logged in server log which has the error level in pg_hint_plan.parse_messages. Typical syntax errors follows,
Object not found
pg_hint_plan skips the hint and to parse the next hint on the object in the hint is not found. The details of the error is logged. Example below.
Conflicting between the hints with same targets.
If two or more hints in conflicting types occurs with same target list, the last one is in effect.
Nested comment in the comment for hints
pg_hint_plan cease to parse the hint comment when encountering the start of another block comment, abandon all hint information and then run the query body without applying any hint. The details of the error is logged.
Log level for pg_hint_plan.
pg_hint_plan puts almost informations into system log with the log level in the GUC pg_hint_plan.parse_messages. Only the messages informing overlength of the database identifiers (max 63 bytes) will always logged as NOTICE.

Functional limitations

Limitaion by GUC parameters
The hints for JOIN are ignored when the length of the target list of the hints exceeding from_collapse_limit or join_collapse_limit.
The cases when hints does not work as expected.
Every hint could not be in effect because of the nature of the planner's calculations. For instance, the following reasons might be guessed.
Limitations in PL/pgSQL
Hints cannot be used in PL/pgSQL because the comments are not passed to planner on execution.
Limitations in ECPG
Hints cannot be used also in ECPG because the comments are not passed to the server.
Specify fetch psql count
psql attaches DECLARE statement to the top of existing query string when the FETCH_COUNT variable is a positive value . So, the hints seems dissapearing for pg_hint_plan.
Difference in the fingerprint calculation
On PosrgreSQL 9.1, query cache key is calculated including comments, so same queries with different hints spoils it. But 9.1 and after calculates query cache key excluding comments so the query cache can work effectively for the same queries with different hints.
postgres=# /*+ MergeJoin(a *VALUES*) */
postgres-# EXPLAIN SELECT *
postgres-#    FROM pgbench_accounts a
postgres-#    JOIN (VALUES (1,1),(2,2)) v (vid, vbalance) ON a.aid = v.vid
postgres-#   ORDER BY a.aid;
                                                  QUERY PLAN
---------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=0.04..4497.33 rows=2 width=105)
   Merge Cond: (a.aid = "*VALUES*".column1)
   ->  Index Scan using pgbench_accounts_pkey on pgbench_accounts a  (cost=0.00..4247.26 rows=100000 width=97)
   ->  Sort  (cost=0.04..0.04 rows=2 width=8)
         Sort Key: "*VALUES*".column1
         ->  Values Scan on "*VALUES*"  (cost=0.00..0.03 rows=2 width=8)
(6 rows)

postgres=#
Limitations of Set Hint
Although you can put the Set hint for GUC parameters such like 'pg_hint_plan' which has no effect for planner behavior, you will find it doesn't work in your favor. So it is discuraged without specific intention and knowledge. Actual actions when specified are shown below.

requirement

PostgreSQL
Version 9.1.4、9.2.1
OS
RHEL 6.1

See also

PostgreSQL documents

EXPLAIN SET Server Config