os: centos 7.4
db: postgresql 10.11

版本

# cat /etc/centos-release
CentOS Linux release 7.4.1708 (Core) 
# 
# su - postgres
$
$ psql -c "select version();"
                                                 version                                                  
----------------------------------------------------------------------------------------------------------
 PostgreSQL 10.11 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit
(1 row)

create table

$ psql
psql (10.11)
Type "help" for help.

postgres=# \timing on
Timing is on.
postgres=# 
postgres=# drop table if exists tmp_t4;
DROP TABLE
Time: 481.821 ms

postgres=# drop table if exists tmp_t5;
DROP TABLE
Time: 451.315 ms

postgres=# create table tmp_t4( 
id    int8,
name  varchar(100),
memo1 varchar(100),
memo2 varchar(100)
);
CREATE TABLE
Time: 154.448 ms

postgres=# create table tmp_t5( 
id    int8,
name  varchar(100),
memo1 varchar(100),
memo2 varchar(100)
);
CREATE TABLE
Time: 318.358 ms

postgres=# insert into tmp_t4 
select id,
       md5(id::varchar),
       md5(id::varchar),
	   null 
  from generate_series(1,1000000) as id;

postgres=# insert into tmp_t5 
select id,
       md5(id::varchar),
       md5(id::varchar),
	   null 
  from generate_series(1,1000000) as id;

postgres=# create index idx_tmp_t4_name on tmp_t4 using btree (id,name);
create index idx_tmp_t5_name on tmp_t5 using btree (id,name);

postgres=# analyze tmp_t4;
analyze tmp_t5;

explain

postgres=# set max_parallel_workers_per_gather=0;

postgres=# explain verbose 
select t4.*,t5.*
  from tmp_t4 t4,
       tmp_t5 t5
 where 1=1
   and t4.name = t5.name
   and t4.id <= 10000
;

                                           QUERY PLAN                                           
------------------------------------------------------------------------------------------------
 Hash Join  (cost=14049.73..41244.53 rows=11080 width=584)
   Output: t4.id, t4.name, t4.memo1, t4.memo2, t5.id, t5.name, t5.memo1, t5.memo2
   Hash Cond: ((t5.name)::text = (t4.name)::text)
   ->  Seq Scan on public.tmp_t5 t5  (cost=0.00..23334.00 rows=1000000 width=292)
         Output: t5.id, t5.name, t5.memo1, t5.memo2
   ->  Hash  (cost=13911.23..13911.23 rows=11080 width=292)
         Output: t4.id, t4.name, t4.memo1, t4.memo2
         ->  Bitmap Heap Scan on public.tmp_t4 t4  (cost=454.30..13911.23 rows=11080 width=292)
               Output: t4.id, t4.name, t4.memo1, t4.memo2
               Recheck Cond: (t4.id <= 10000)
               ->  Bitmap Index Scan on idx_tmp_t4_name  (cost=0.00..451.53 rows=11080 width=0)
                     Index Cond: (t4.id <= 10000)
(12 rows)

Time: 0.800 ms
   

发现对 tmp_t4 执行
Bitmap Heap Scan on public.tmp_t4
Bitmap Index Scan on idx_tmp_t4_name

Bitmap Index Scan

Instead of using amgettuple, an index scan can be done with amgetbitmap to fetch all tuples in one call. This can be noticeably more efficient than amgettuple because it allows avoiding lock/unlock cycles within the access method. In principle amgetbitmap should have the same effects as repeated amgettuple calls, but we impose several restrictions to simplify matters.
First of all, amgetbitmap returns all tuples at once and marking or restoring scan positions isn’t supported.
Secondly, the tuples are returned in a bitmap which doesn’t have any specific ordering, which is why amgetbitmap doesn’t take a direction argument. (Ordering operators will never be supplied for such a scan, either.)
Also, there is no provision for index-only scans with amgetbitmap, since there is no way to return the contents of index tuples. Finally, amgetbitmap does not guarantee any locking of the returned tuples, with implications spelled out in Section 60.4.

Note that it is permitted for an access method to implement only amgetbitmap and not amgettuple, or vice versa, if its internal implementation is unsuited to one API or the other.

Bitmap Index Scan 可以理解为一次性将符合条件的 tuple-pointers 全部取出来,然后在内存中进行地址排序,然后去取出数据,由于进行的地址排序,读取时就变成了顺序的读,比较适合获取少量数据
而 Index Scan 为拿到索引的地址信息后,马上回表获取数据,属于随机读,比较适合获取更少量数据
默认设置为 on。

postgres=# select name,setting from pg_settings where name like 'enable%' order by name;
         name         | setting 
----------------------+---------
 enable_bitmapscan    | on
 enable_gathermerge   | on
 enable_hashagg       | on
 enable_hashjoin      | on
 enable_indexonlyscan | on
 enable_indexscan     | on
 enable_material      | on
 enable_mergejoin     | on
 enable_nestloop      | on
 enable_seqscan       | on
 enable_sort          | on
 enable_tidscan       | on
(12 rows)

Time: 2.628 ms

Bitmap Heap Scan

这个和 Bitmap Index Scan 配合,为批量回表操作


参考:
https://www.postgresql.org/docs/10/index-scanning.html

更多推荐