金仓数据库KingbaseES巧用dblink 实现多进程并行查询

对于分区表的大数据统计分析，由于数据量巨大，往往需要采用并行。但是数据库并行的效率相比分进程分表统计还是有比较大的差距。本文通过巧用dblink，实现分进程分分区统计数据。

沉舟侧畔千帆过_

184人浏览 · 2023-04-26 19:50:30

沉舟侧畔千帆过_ · 2023-04-26 19:50:30 发布

概述

例子

kingbase=# \\d t751

                      分区表 "public.t751"

 栏位 |            类型             | 校对规则 |  可空的  | 预设

\------+-----------------------------+----------+----------+------

 id   | bigint                      |          | not null |

 code | text                        |          |          |

 c1   | timestamp without time zone |          |          |

 c2   | text                        |          |          |

 c3   | numeric                     |          |          |

 c4   | integer                     |          |          |

 c5   | integer                     |          |          |

 c6   | integer                     |          |          |

 v2   | numeric                     |          |          |

 v3   | timestamp without time zone |          |          |

分区键值: HASH (c4, c5, c6)

分区的数量：1000(可以使用 \\d+ 来列出它们)

表空间："nvmtbs01"

kingbase=# select pg\_size\_pretty(sum(pg\_relation\_size(relid))) from pg\_partition\_tree('t751');

 pg\_size\_pretty

\----------------

 20 GB

并行方式访问

kingbase=# explain analyze  select  c4, c5, c6, sum(v2) v2 from t751 group by c4, c5, c6;
                                                                      QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize GroupAggregate  (cost=5656234.57..14706860.46 rows=8000000 width=44) (actual time=15764.781..19110.336 rows=11781 loops=1)
   Group Key: t751_872.c4, t751_872.c5, t751_872.c6
   ->  Gather Merge  (cost=5656234.57..13806860.46 rows=64000000 width=44) (actual time=15764.142..19103.239 rows=11902 loops=1)
         Workers Planned: 8
         Workers Launched: 7
         ->  Partial GroupAggregate  (cost=5655234.43..5911484.32 rows=8000000 width=44) (actual time=15478.852..18479.480 rows=1488 loops=8)
               Group Key: t751_872.c4, t751_872.c5, t751_872.c6
               ->  Sort  (cost=5655234.43..5686484.41 rows=12499991 width=22) (actual time=15478.239..17066.239 rows=12500000 loops=8)
                     Sort Key: t751_872.c4, t751_872.c5, t751_872.c6
                     Sort Method: quicksort  Memory: 1408511kB
                     Worker 0:  Sort Method: quicksort  Memory: 1371267kB
                     Worker 1:  Sort Method: quicksort  Memory: 1364497kB
                     Worker 2:  Sort Method: quicksort  Memory: 1374769kB
                     Worker 3:  Sort Method: quicksort  Memory: 1355381kB
                     Worker 4:  Sort Method: quicksort  Memory: 1352989kB
                     Worker 5:  Sort Method: quicksort  Memory: 1343974kB
                     Worker 6:  Sort Method: quicksort  Memory: 1386845kB
                     ->  Parallel Append  (cost=0.00..3156369.51 rows=12499991 width=22) (actual time=0.034..5045.596 rows=12500000 loops=8)
                           ->  Parallel Seq Scan on t751_872  (cost=0.00..5849.56 rows=81756 width=22) (actual time=0.025..60.070 rows=196215 loops=1)
                           ->  Parallel Seq Scan on t751_470  (cost=0.00..5753.10 rows=80410 width=22) (actual time=0.068..73.100 rows=192984 loops=1)
...
                           ->  Parallel Seq Scan on t751_416  (cost=0.00..556.80 rows=10380 width=22) (actual time=0.009..5.736 rows=17646 loops=1)
                           ->  Parallel Seq Scan on t751_885  (cost=0.00..315.77 rows=5877 width=22) (actual time=0.010..2.460 rows=9991 loops=1)
 Planning Time: 26.450 ms
 Execution Time: 19180.545 ms

注意：这里实际采用的是groupagg ，如果采用hashagg，执行效率会高很多

使用dblink方式

创建函数

create or replace function f_t_part(para int default 0)
    returns table
            (
                c4 int,
                c5 int,
                c6 int,
                v2 numeric
            )
    language plpgsql
    immutable
    parallel safe
as
$$
declare
    vtabnam text;
    atabnam text[];
begin
    perform dblink_disconnect(conn)
    from unnest(dblink_get_connections()) conn;

    perform dblink_connect('conn_' || sn, 'dbname=' || current_database())
    from generate_series(1, para) sn;

    if para <= 0 then
        for vtabnam in select relid from pg_partition_tree('t751')
            loop
                for c4,c5,c6,v2 in execute format('select c4, c5, c6, sum(v2) v2 from only %s group by c4, c5, c6 ;',vtabnam)
                    loop
                        return next ;
                    end loop;
            end loop;
    else
        for atabnam in select string_to_array(string_agg(relid, ','), ',') relid
                       from pg_partition_tree('t751') with ordinality
                               , lateral ( select ordinality / para sn)
                       where isleaf
                       group by sn
            loop

                perform dblink_is_busy(conn) = 0 from unnest(dblink_get_connections()) conn;

                perform conn,tabnam, dblink_send_query(conn,format('select c4, c5, c6, sum(v2) v2 from only %s group by c4, c5, c6 ;',tabnam)) sq
                from (select unnest(dblink_get_connections()) conn, unnest(atabnam) tabnam) v
                where tabnam is not null;

                for c4,c5,c6,v2 in
                    select tab.* from unnest(dblink_get_connections()) conn , dblink_get_result(conn) as tab(c4 int, c5 int, c6 int, v2 numeric)
                    loop
                        return next ;
                    end loop;
            end loop;
    end if;

    perform dblink_disconnect(conn) from unnest(dblink_get_connections()) conn;
    return;
end;
$$;

执行效率

ingbase=# explain analyze select * from ft751_01();
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Function Scan on ft751_01  (cost=0.25..10.25 rows=1000 width=44) (actual time=32738.962..32739.288 rows=11781 loops=1)
 Planning Time: 0.055 ms
 Execution Time: 32741.655 ms
(3 行记录)

kingbase=# explain analyze ^Jselect * from ft751_01(16);
                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 Function Scan on ft751_01  (cost=0.25..10.25 rows=1000 width=44) (actual time=5397.644..5398.189 rows=11781 loops=1)
 Planning Time: 0.044 ms
 Execution Time: 5398.887 ms