PostgreSQL query against millions of rows takes long on UUIDs


Posted by Postgredaxiang · 2022-09-22 05:52:33

Question

I have a reference table for UUIDs that is roughly 200M rows. I have ~5000 UUIDs that I want to look up in the reference table. Reference table looks like:

CREATE TABLE object_store (
    project_id UUID,
    object_id UUID,
    object_name VARCHAR(20),
    description VARCHAR(80)
);

CREATE INDEX object_store_project_idx ON object_store(project_id);
CREATE INDEX object_store_id_idx ON object_store(object_id);

* Edit #2 *

As requested, here is the temp_objects table definition:

CREATE TEMPORARY TABLE temp_objects (
    object_id UUID
)
ON COMMIT DELETE ROWS;

The indexes are separate because object_id is not unique and can belong to many different projects. The lookup list is just a temp table of UUIDs (temp_objects) holding the 5000 object_ids that I want to check.

If I query the above reference table with 1 object_id literal value, it's almost instantaneous (2ms). If the temp table only has 1 row, again, instantaneous (2ms). But with 5000 rows it takes 25 minutes to even return. Granted it pulls back >3M rows of matches.

* Edited *

This is for 1 row comparison (4.198 ms):

EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT) SELECT O.project_id
FROM temp_objects T JOIN object_store O ON T.object_id = O.object_id;
                                                   QUERY PLAN                                                   
----------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.57..475780.22 rows=494005 width=65) (actual time=0.038..2.631 rows=1194 loops=1)
   Buffers: shared hit=1202, local hit=1
   ->  Seq Scan on temp_objects t  (cost=0.00..13.60 rows=360 width=16) (actual time=0.007..0.009 rows=1 loops=1)
         Buffers: local hit=1
   ->  Index Scan using object_store_id_idx on object_store l  (cost=0.57..1307.85 rows=1372 width=81) (actual time=0.027..1.707 rows=1194 loops=1)
         Index Cond: (object_id = t.object_id)
         Buffers: shared hit=1202
 Planning time: 0.173 ms
 Execution time: 3.096 ms
(9 rows)

Time: 4.198 ms

This is for 4911 row comparison (1579082.974 ms (26:19.083)):

EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT) SELECT O.project_id
FROM temp_objects T JOIN object_store O ON T.object_id = O.object_id;
                                                   QUERY PLAN                                                   
----------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.57..3217316.86 rows=3507438 width=65) (actual time=0.041..1576913.100 rows=8043500 loops=1)
   Buffers: shared hit=5185078 read=2887548, local hit=71
   ->  Seq Scan on temp_objects d  (cost=0.00..96.56 rows=2556 width=16) (actual time=0.009..3.945 rows=4911 loops=1)
         Buffers: local hit=71
   ->  Index Scan using object_store_id_idx on object_store l  (cost=0.57..1244.97 rows=1372 width=81) (actual time=1.492..320.081 rows=1638 loops=4911)
         Index Cond: (object_id = t.object_id)
         Buffers: shared hit=5185078 read=2887548
 Planning time: 0.169 ms
 Execution time: 1579078.811 ms
(9 rows)

Time: 1579082.974 ms (26:19.083)
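
The numbers in the plan above can be checked with a little arithmetic, written here as plain SQL. The inner index scan averaged 320.081 ms over 4911 loops, which comes to about 1,571,918 ms (roughly 26.2 minutes), so nearly all of the 1,579,078.811 ms execution time is spent in the index scan. The 8192-byte block size is an assumption (the PostgreSQL default); the post does not state it. Under that assumption, the 2,887,548 blocks that were not found in shared_buffers amount to roughly 22 GiB.

```sql
-- Time spent in the inner index scan: loops * average actual time per loop.
SELECT 4911 * 320.081           AS index_scan_ms,
       4911 * 320.081 / 60000.0 AS index_scan_minutes;

-- Volume fetched from outside shared_buffers ("read=2887548" in the plan),
-- ASSUMING the default 8192-byte block size.
SELECT pg_size_pretty(2887548::bigint * 8192) AS data_read;
```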

Eventually I want to group the matches and get a count of matching object_ids per project_id, using a standard GROUP BY. The aggregate sits (of course) at the top of the plan, at the upper end of the cost. The query below again took just about 25 minutes to complete. Yet when I limit the temp table to only 1 row, it comes back in 21ms. Something is not adding up...

EXPLAIN SELECT O.project_id, count(*)
FROM temp_objects T JOIN object_store O ON T.object_id = O.object_id GROUP BY O.project_id;
                                                   QUERY PLAN                                                   
----------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=6189484.10..6189682.84 rows=19874 width=73)
   Group Key: o.project_id
   ->  Nested Loop  (cost=0.57..6155795.69 rows=6737683 width=65)
         ->  Seq Scan on temp_objects t  (cost=0.00..120.10 rows=4910 width=16)
         ->  Index Scan using object_store_id_idx on object_store o  (cost=0.57..1239.98 rows=1372 width=81)
               Index Cond: (object_id = t.object_id)
(6 rows)

I'm on PostgreSQL 10.6, running 2 CPUs and 8GB of RAM on an SSD. I have ANALYZEd the tables, I have set the work_mem to 50MB, shared_buffers to 2GB, and have set the random_page_cost to 1. All helped the queries actually to come back in several minutes, but still not as fast as I feel it should be.
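
For reference, the settings described above could be applied roughly as follows. This is a sketch, not the commands actually used: the post gives only the values, not how they were set.

```sql
-- Session-level settings; they last until the session ends or they are RESET.
SET work_mem = '50MB';
SET random_page_cost = 1;

-- shared_buffers cannot be changed with SET. ALTER SYSTEM writes it to
-- postgresql.auto.conf, and it only takes effect after a server restart.
ALTER SYSTEM SET shared_buffers = '2GB';

-- Refresh planner statistics for both tables in the join.
ANALYZE object_store;
ANALYZE temp_objects;
```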

I have the option to go to cloud computing if CPUs/RAM/parallelization make a big difference. Just looking for suggestions on how to get this simple query to return in < few seconds (if possible).

* UPDATE *

Taking the hint from Jürgen Zornig, I changed both object_id fields to bigint, using just the top half of the UUID and cutting my data size in half. The aggregate query above now runs in ~16min.
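
The post does not show how the top half of each UUID was turned into a bigint. One possible way, shown only as a sketch, is to take the first 16 hex digits of the UUID's text form and cast them through bit(64). The column name object_id_hi and the index name are hypothetical. Note that truncating to 64 bits can in principle make two different UUIDs collide, and that the resulting value may be negative.

```sql
-- Hypothetical new column holding the upper 64 bits of object_id.
ALTER TABLE object_store ADD COLUMN object_id_hi bigint;

-- Strip the dashes, keep the first 16 hex digits (the top 64 bits),
-- and reinterpret them as a signed 64-bit integer.
UPDATE object_store
SET object_id_hi =
    ('x' || left(replace(object_id::text, '-', ''), 16))::bit(64)::bigint;

-- Hypothetical index on the narrower key.
CREATE INDEX object_store_id_hi_idx ON object_store (object_id_hi);
```

The same expression would have to be applied to the ids loaded into the temp table so that both sides of the join use the same representation.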

Next, taking jjane's suggestion to set enable_nestloop to off, my aggregate query dropped to 6min! Unfortunately, none of the other suggestions have brought it below 6min. Interestingly, changing my "TEMPORARY" table to a permanent one allowed 2 parallel workers to work on it, but that didn't change the time. I think jjane is right that I/O is the limiting factor here. Here is the latest explain plan from the 6min run (wish it were faster, still, but it's better!):

explain (analyze, buffers, format text) select project_id, count(*) from object_store natural join temp_object group by project_id;
                                                                                         QUERY PLAN                                                                                         
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize GroupAggregate  (cost=3966899.86..3967396.69 rows=19873 width=73) (actual time=368124.126..368744.157 rows=153633 loops=1)
   Group Key: object_store.project_id
   Buffers: shared hit=243022 read=2423215, temp read=3674 written=3687
   I/O Timings: read=870720.440
   ->  Sort  (cost=3966899.86..3966999.23 rows=39746 width=73) (actual time=368124.116..368586.497 rows=333427 loops=1)
         Sort Key: object_store.project_id
         Sort Method: external merge  Disk: 29392kB
         Buffers: shared hit=243022 read=2423215, temp read=3674 written=3687
         I/O Timings: read=870720.440
         ->  Gather  (cost=3959690.23..3963863.56 rows=39746 width=73) (actual time=366476.369..366827.313 rows=333427 loops=1)
               Workers Planned: 2
               Workers Launched: 2
               Buffers: shared hit=243022 read=2423215
               I/O Timings: read=870720.440
               ->  Partial HashAggregate  (cost=3958690.23..3958888.96 rows=19873 width=73) (actual time=366472.712..366568.313 rows=111142 loops=3)
                     Group Key: object_store.project_id
                     Buffers: shared hit=243022 read=2423215
                     I/O Timings: read=870720.440
                     ->  Hash Join  (cost=132.50..3944473.09 rows=2843429 width=65) (actual time=7.880..363848.830 rows=2681167 loops=3)
                           Hash Cond: (object_store.object_id = temp_object.object_id)
                           Buffers: shared hit=243022 read=2423215
                           I/O Timings: read=870720.440
                           ->  Parallel Seq Scan on object_store  (cost=0.00..3499320.53 rows=83317153 width=73) (actual time=0.467..324932.880 rows=66653718 loops=3)
                                 Buffers: shared hit=242934 read=2423215
                                 I/O Timings: read=870720.440
                           ->  Hash  (cost=71.11..71.11 rows=4911 width=8) (actual time=7.349..7.349 rows=4911 loops=3)
                                 Buckets: 8192  Batches: 1  Memory Usage: 256kB
                                 Buffers: shared hit=66
                                 ->  Seq Scan on temp_object  (cost=0.00..71.11 rows=4911 width=8) (actual time=0.014..2.101 rows=4911 loops=3)
                                       Buffers: shared hit=66
 Planning time: 0.247 ms
 Execution time: 368779.757 ms
(32 rows)

Time: 368780.532 ms (06:08.781)

So I'm at 6min per query now. I think with I/O costs, I may try for an in-memory store on this table if possible to see if getting it off SSD makes it even better.
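
The enable_nestloop change mentioned above does not have to apply to the whole session. A minimal sketch, assuming the original table and column names, scopes it to one transaction with SET LOCAL so that other queries keep the planner's default behaviour:

```sql
BEGIN;

-- temp_objects was created with ON COMMIT DELETE ROWS, so the ids to look up
-- have to be loaded inside this same transaction (load step not shown).

-- SET LOCAL reverts automatically at COMMIT or ROLLBACK.
SET LOCAL enable_nestloop = off;

SELECT o.project_id, count(*)
FROM temp_objects t
JOIN object_store o ON t.object_id = o.object_id
GROUP BY o.project_id;

COMMIT;
```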

Answers

Random UUIDs work against adaptive cache management: because of their random nature they effectively drop the cache hit ratio once the index is larger than memory. The ids are spread evenly over a numerically wide range, so in practice nearly every id lands on its own leaf page in the index tree. The matching rows are likewise scattered across the table's data pages, so almost every row fetched needs its own page read, which adds up to a very large number of extremely expensive I/O operations.

That's why random UUIDs are generally not recommended. If you really need UUIDs, then at least generate timestamp/MAC-prefixed UUIDs (have a look at uuid_generate_v1() - https://www.postgresql.org/docs/9.4/uuid-ossp.html) that are numerically close to each other, so that rows are more likely to be clustered together on fewer data pages and fewer I/O operations bring in more data.

Long story short: randomness over a large range kills your index. (Strictly speaking it is not the index itself that suffers; the randomness causes a lot of expensive I/O to fetch data on reads and to maintain the index on writes.) Queries therefore slow down to the point where the index is about as good as having no index at all.

Here is also an article for reference
