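In a previous post, I ran PostgreSQL on an AWS m6gd.2xlarge instance (ARM Graviton2 processor).

I did not give the exact compile options there. Following this feedback, this post provides more details: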

https://twitter.com/N_B__N_B/status/136918884608315398

First, the PostgreSQL ./configure correctly detected ARM and compiled with the following flag: -march=armv8-a+crc

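This targets ARM v8. However, the LSE (Large System Extensions) atomic instructions were only added later, in ARM v8.1, and they can make a huge difference for PostgreSQL, especially for spinlocks under high CPU usage.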
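Whether a given CPU offers LSE can be checked from user space: on Linux for AArch64, the kernel lists the `atomics` flag on the `Features` line of /proc/cpuinfo when the extension is present. The following is a minimal sketch of that check. The function name `has_lse` and the two abbreviated `Features` lines are illustrative assumptions, not output copied from a real host.

```shell
#!/bin/sh
# has_lse: read cpuinfo-style text on stdin and succeed if a "Features"
# line contains the whole word "atomics" (the kernel's name for LSE).
has_lse() {
    grep '^Features' | grep -qw atomics
}

# Illustrative, abbreviated Features lines (hypothetical samples).
v8_line='Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid'
v81_line='Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics cpuid'

for line in "$v8_line" "$v81_line"; do
    if printf '%s\n' "$line" | has_lse; then
        echo "LSE available"
    else
        echo "LSE not reported"
    fi
done
```

On a real instance the same function can be fed the live file with `has_lse < /proc/cpuinfo`.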

I followed the information in https://github.com/aws/aws-graviton-getting-started/blob/master/c-c++.md to check the compiled binaries.


for i in $(find postgres/src/backend -name "*.o") ; do objdump -d "$i" | awk '/:$/{w=$2}/aarch64_(cas|casp|swp|ldadd|stadd|ldclr|stclr|ldeor|steor|ldset|stset|ldsmax|stsmax|ldsmin|stsmin|ldumax|stumax|ldumin|stumin)/{printf "%-27s %-20s %-30s %-60s\n","(LSE instructions)",$NF,w,f}' f="$i" ; done | sort | uniq -c | sort -rnk1,4


      8 (LSE instructions)          <__aarch64_swp4_acq> <StartupXLOG>:                 postgres/src/backend/access/transam/xlog.o
      7 (LSE instructions)          <__aarch64_swp4_acq> <BitmapHeapNext>:              postgres/src/backend/executor/nodeBitmapHeapscan.o
      6 (LSE instructions)          <__aarch64_ldclr4_acq_rel> <LWLockDequeueSelf>:           postgres/src/backend/storage/lmgr/lwlock.o
      6 (LSE instructions)          <__aarch64_cas8_acq_rel> <shm_mq_send_bytes>:           postgres/src/backend/storage/ipc/shm_mq.o
      5 (LSE instructions)          <__aarch64_swp4_acq> <WalReceiverMain>:             postgres/src/backend/replication/walreceiver.o
      5 (LSE instructions)          <__aarch64_cas8_acq_rel> <shm_mq_receive_bytes.isra.0>: postgres/src/backend/storage/ipc/shm_mq.o
      4 (LSE instructions)          <__aarch64_swp4_acq> <ProcessRepliesIfAny>:         postgres/src/backend/replication/walsender.o
      4 (LSE instructions)          <__aarch64_swp4_acq> <hash_search_with_hash_value>: postgres/src/backend/utils/hash/dynahash.o
      4 (LSE instructions)          <__aarch64_swp4_acq> <copy_replication_slot>:       postgres/src/backend/replication/slotfuncs.o
      4 (LSE instructions)          <__aarch64_ldadd4_acq_rel> <parallel_vacuum_index>:       postgres/src/backend/access/heap/vacuumlazy.o
      4 (LSE instructions)          <__aarch64_cas4_acq_rel> <LWLockAcquire>:               postgres/src/backend/storage/lmgr/lwlock.o
      3 (LSE instructions)          <__aarch64_swp4_acq> <xlog_redo>:                   postgres/src/backend/access/transam/xlog.o
      3 (LSE instructions)          <__aarch64_swp4_acq> <XLogInsertRecord>:            postgres/src/backend/access/transam/xlog.o
      3 (LSE instructions)          <__aarch64_swp4_acq> <SaveSlotToPath>:              postgres/src/backend/replication/slot.o
      3 (LSE instructions)          <__aarch64_swp4_acq> <RequestCheckpoint>:           postgres/src/backend/postmaster/checkpointer.o
      3 (LSE instructions)          <__aarch64_swp4_acq> <LogicalRepSyncTableStart>:    postgres/src/backend/replication/logical/tablesync.o
      3 (LSE instructions)          <__aarch64_swp4_acq> <LogicalConfirmReceivedLocation>: postgres/src/backend/replication/logical/logical.o
      3 (LSE instructions)          <__aarch64_swp4_acq> <InvalidateObsoleteReplicationSlots>: postgres/src/backend/replication/slot.o
      3 (LSE instructions)          <__aarch64_swp4_acq> <CreateInitDecodingContext>:   postgres/src/backend/replication/logical/logical.o
      3 (LSE instructions)          <__aarch64_swp4_acq> <CreateCheckPoint>:            postgres/src/backend/access/transam/xlog.o
      3 (LSE instructions)          <__aarch64_swp4_acq> <CheckpointerMain>:            postgres/src/backend/postmaster/checkpointer.o
      3 (LSE instructions)          <__aarch64_ldclr4_acq_rel> <LWLockQueueSelf>:             postgres/src/backend/storage/lmgr/lwlock.o
      3 (LSE instructions)          <__aarch64_ldadd4_acq_rel> <tbm_prepare_shared_iterate>:  postgres/src/backend/nodes/tidbitmap.o
      3 (LSE instructions)          <__aarch64_ldadd4_acq_rel> <tbm_free_shared_area>:        postgres/src/backend/nodes/tidbitmap.o
      3 (LSE instructions)          <__aarch64_cas8_acq_rel> <ProcessProcSignalBarrier>:    postgres/src/backend/storage/ipc/procsignal.o
      3 (LSE instructions)          <__aarch64_cas8_acq_rel> <ExecParallelHashIncreaseNumBatches>: postgres/src/backend/executor/nodeHash.o
      2 (LSE instructions)          <__aarch64_swp4_acq> <XLogWrite>:                   postgres/src/backend/access/transam/xlog.o
      2 (LSE instructions)          <__aarch64_swp4_acq> <XLogSendPhysical>:            postgres/src/backend/replication/walsender.o
      2 (LSE instructions)          <__aarch64_swp4_acq> <XLogBackgroundFlush>:         postgres/src/backend/access/transam/xlog.o
      2 (LSE instructions)          <__aarch64_swp4_acq> <WalRcvStreaming>:             postgres/src/backend/replication/walreceiverfuncs.o
      2 (LSE instructions)          <__aarch64_swp4_acq> <WalRcvRunning>:               postgres/src/backend/replication/walreceiverfuncs.o
      2 (LSE instructions)          <__aarch64_swp4_acq> <WalRcvDie>:                   postgres/src/backend/replication/walreceiver.o
      2 (LSE instructions)          <__aarch64_swp4_acq> <TransactionIdLimitedForOldSnapshots>: postgres/src/backend/utils/time/snapmgr.o
      2 (LSE instructions)          <__aarch64_swp4_acq> <StrategyGetBuffer>:           postgres/src/backend/storage/buffer/freelist.o
      2 (LSE instructions)          <__aarch64_swp4_acq> <shm_mq_wait_internal>:        postgres/src/backend/storage/ipc/shm_mq.o
      2 (LSE instructions)          <__aarch64_swp4_acq> <ReplicationSlotReserveWal>:   postgres/src/backend/replication/slot.o
      2 (LSE instructions)          <__aarch64_swp4_acq> <ReplicationSlotRelease>:      postgres/src/backend/replication/slot.o
      2 (LSE instructions)          <__aarch64_swp4_acq> <ProcKill>:                    postgres/src/backend/storage/lmgr/proc.o
      2 (LSE instructions)          <__aarch64_swp4_acq> <process_syncing_tables>:      postgres/src/backend/replication/logical/tablesync.o
      2 (LSE instructions)          <__aarch64_swp4_acq> <pg_get_replication_slots>:    postgres/src/backend/replication/slotfuncs.o
      2 (LSE instructions)          <__aarch64_swp4_acq> <exec_replication_command>:    postgres/src/backend/replication/walsender.o
      2 (LSE instructions)          <__aarch64_swp4_acq> <CreateRestartPoint>:          postgres/src/backend/access/transam/xlog.o
      2 (LSE instructions)          <__aarch64_swp4_acq> <ConditionVariableBroadcast>:  postgres/src/backend/storage/lmgr/condition_variable.o
      2 (LSE instructions)          <__aarch64_swp4_acq> <BarrierArriveAndWait>:        postgres/src/backend/storage/ipc/barrier.o
      2 (LSE instructions)          <__aarch64_ldset4_acq_rel> <LWLockWaitListLock>:          postgres/src/backend/storage/lmgr/lwlock.o
      2 (LSE instructions)          <__aarch64_ldclr4_acq_rel> <LWLockWaitForVar>:            postgres/src/backend/storage/lmgr/lwlock.o
      2 (LSE instructions)          <__aarch64_ldclr4_acq_rel> <LWLockUpdateVar>:             postgres/src/backend/storage/lmgr/lwlock.o
      2 (LSE instructions)          <__aarch64_ldadd4_acq_rel> <vacuum_delay_point>:          postgres/src/backend/commands/vacuum.o
      2 (LSE instructions)          <__aarch64_ldadd4_acq_rel> <StrategyGetBuffer>:           postgres/src/backend/storage/buffer/freelist.o
      2 (LSE instructions)          <__aarch64_ldadd4_acq_rel> <LWLockRelease>:               postgres/src/backend/storage/lmgr/lwlock.o
      2 (LSE instructions)          <__aarch64_ldadd4_acq_rel> <lazy_parallel_vacuum_indexes>: postgres/src/backend/access/heap/vacuumlazy.o
      2 (LSE instructions)          <__aarch64_cas8_acq_rel> <WalReceiverMain>:             postgres/src/backend/replication/walreceiver.o
      2 (LSE instructions)          <__aarch64_cas8_acq_rel> <WaitForProcSignalBarrier>:    postgres/src/backend/storage/ipc/procsignal.o
      2 (LSE instructions)          <__aarch64_cas8_acq_rel> <shm_mq_receive>:              postgres/src/backend/storage/ipc/shm_mq.o
      2 (LSE instructions)          <__aarch64_cas8_acq_rel> <ResolveRecoveryConflictWithLock>: postgres/src/backend/storage/ipc/standby.o
      2 (LSE instructions)          <__aarch64_cas8_acq_rel> <ProcSignalInit>:              postgres/src/backend/storage/ipc/procsignal.o
      2 (LSE instructions)          <__aarch64_cas8_acq_rel> <ExecParallelHashTableInsert>: postgres/src/backend/executor/nodeHash.o
      2 (LSE instructions)          <__aarch64_cas8_acq_rel> <ExecParallelHashTableInsertCurrentBatch>: postgres/src/backend/executor/nodeHash.o
      2 (LSE instructions)          <__aarch64_cas8_acq_rel> <ExecParallelHashIncreaseNumBuckets>: postgres/src/backend/executor/nodeHash.o
      2 (LSE instructions)          <__aarch64_cas4_acq_rel> <TransactionIdSetTreeStatus>:  postgres/src/backend/access/transam/clog.o
      2 (LSE instructions)          <__aarch64_cas4_acq_rel> <ProcArrayEndTransaction>:     postgres/src/backend/storage/ipc/procarray.o
      2 (LSE instructions)          <__aarch64_cas4_acq_rel> <LWLockAcquireOrWait>:         postgres/src/backend/storage/lmgr/lwlock.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <XLogWalRcvFlush.part.4>:      postgres/src/backend/replication/walreceiver.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <XLogSetReplicationSlotMinimumLSN>: postgres/src/backend/access/transam/xlog.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <XLogSetAsyncXactLSN>:         postgres/src/backend/access/transam/xlog.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <XLogSendLogical>:             postgres/src/backend/replication/walsender.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <XLogPageRead>:                postgres/src/backend/access/transam/xlog.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <XLogNeedsFlush>:              postgres/src/backend/access/transam/xlog.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <XLogGetLastRemovedSegno>:     postgres/src/backend/access/transam/xlog.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <XLogFlush>:                   postgres/src/backend/access/transam/xlog.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <worker_freeze_result_tape>:   postgres/src/backend/utils/sort/tuplesort.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <WalSndWakeup>:                postgres/src/backend/replication/walsender.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <WalSndWaitStopping>:          postgres/src/backend/replication/walsender.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <WalSndSetState>:              postgres/src/backend/replication/walsender.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <WalSndRqstFileReload>:        postgres/src/backend/replication/walsender.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <WalSndKill>:                  postgres/src/backend/replication/walsender.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <WalSndInitStopping>:          postgres/src/backend/replication/walsender.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <WalRcvForceReply>:            postgres/src/backend/replication/walreceiver.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <WaitXLogInsertionsToFinish>:  postgres/src/backend/access/transam/xlog.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <UpdateMinRecoveryPoint.part.10>: postgres/src/backend/access/transam/xlog.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <tuplesort_performsort>:       postgres/src/backend/utils/sort/tuplesort.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <tuplesort_begin_common>:      postgres/src/backend/utils/sort/tuplesort.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <table_block_parallelscan_startblock_init>: postgres/src/backend/access/table/tableam.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <SyncRepInitConfig>:           postgres/src/backend/replication/syncrep.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <SyncRepGetCandidateStandbys>: postgres/src/backend/replication/syncrep.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <StrategySyncStart>:           postgres/src/backend/storage/buffer/freelist.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <StrategyNotifyBgWriter>:      postgres/src/backend/storage/buffer/freelist.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <StrategyFreeBuffer>:          postgres/src/backend/storage/buffer/freelist.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <SnapshotTooOldMagicForTest>:  postgres/src/backend/utils/time/snapmgr.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <s_lock>:                      postgres/src/backend/storage/lmgr/s_lock.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <SIInsertDataEntries>:         postgres/src/backend/storage/ipc/sinvaladt.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <SIGetDataEntries>:            postgres/src/backend/storage/ipc/sinvaladt.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <ShutdownWalRcv>:              postgres/src/backend/replication/walreceiverfuncs.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <shm_toc_insert>:              postgres/src/backend/storage/ipc/shm_toc.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <shm_toc_freespace>:           postgres/src/backend/storage/ipc/shm_toc.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <shm_toc_allocate>:            postgres/src/backend/storage/ipc/shm_toc.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <shm_mq_set_sender>:           postgres/src/backend/storage/ipc/shm_mq.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <shm_mq_set_receiver>:         postgres/src/backend/storage/ipc/shm_mq.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <shm_mq_sendv>:                postgres/src/backend/storage/ipc/shm_mq.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <shm_mq_get_sender>:           postgres/src/backend/storage/ipc/shm_mq.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <shm_mq_get_receiver>:         postgres/src/backend/storage/ipc/shm_mq.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <shm_mq_detach_internal>:      postgres/src/backend/storage/ipc/shm_mq.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <ShmemAllocRaw>:               postgres/src/backend/storage/ipc/shmem.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <SharedFileSetOnDetach>:       postgres/src/backend/storage/file/sharedfileset.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <SharedFileSetAttach>:         postgres/src/backend/storage/file/sharedfileset.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <SetWalWriterSleeping>:        postgres/src/backend/access/transam/xlog.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <SetRecoveryPause>:            postgres/src/backend/access/transam/xlog.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <SetPromoteIsTriggered>:       postgres/src/backend/access/transam/xlog.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <SetOldSnapshotThresholdTimestamp>: postgres/src/backend/utils/time/snapmgr.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <RequestXLogStreaming>:        postgres/src/backend/replication/walreceiverfuncs.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <ReplicationSlotsDropDBSlots>: postgres/src/backend/replication/slot.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <ReplicationSlotsCountDBSlots>: postgres/src/backend/replication/slot.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <ReplicationSlotsComputeRequiredXmin>: postgres/src/backend/replication/slot.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <ReplicationSlotsComputeRequiredLSN>: postgres/src/backend/replication/slot.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <ReplicationSlotsComputeLogicalRestartLSN>: postgres/src/backend/replication/slot.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <ReplicationSlotPersist>:      postgres/src/backend/replication/slot.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <ReplicationSlotMarkDirty>:    postgres/src/backend/replication/slot.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <ReplicationSlotDropPtr>:      postgres/src/backend/replication/slot.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <ReplicationSlotCreate>:       postgres/src/backend/replication/slot.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <ReplicationSlotCleanup>:      postgres/src/backend/replication/slot.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <ReplicationSlotAcquireInternal>: postgres/src/backend/replication/slot.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <RemoveOldXlogFiles>:          postgres/src/backend/access/transam/xlog.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <RemoveLocalLock>:             postgres/src/backend/storage/lmgr/lock.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <RecoveryRestartPoint>:        postgres/src/backend/access/transam/xlog.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <RecoveryIsPaused>:            postgres/src/backend/access/transam/xlog.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <ReadRecord>:                  postgres/src/backend/access/transam/xlog.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <PublishStartupProcessInformation>: postgres/src/backend/storage/lmgr/proc.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <PromoteIsTriggered>:          postgres/src/backend/access/transam/xlog.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <ProcSendSignal>:              postgres/src/backend/storage/lmgr/proc.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <ProcessWalSndrMessage>:       postgres/src/backend/replication/walreceiver.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <PhysicalReplicationSlotNewXmin>: postgres/src/backend/replication/walsender.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <pg_stat_get_wal_senders>:     postgres/src/backend/replication/walsender.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <pg_stat_get_wal_receiver>:    postgres/src/backend/replication/walreceiver.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <pg_replication_slot_advance>: postgres/src/backend/replication/slotfuncs.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <ParallelWorkerReportLastRecEnd>: postgres/src/backend/access/transam/parallel.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <MaintainOldSnapshotTimeMapping>: postgres/src/backend/utils/time/snapmgr.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <LWLockNewTrancheId>:          postgres/src/backend/storage/lmgr/lwlock.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <LogicalIncreaseXminForSlot>:  postgres/src/backend/replication/logical/logical.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <LogicalIncreaseRestartDecodingForSlot>: postgres/src/backend/replication/logical/logical.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <lock_twophase_recover>:       postgres/src/backend/storage/lmgr/lock.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <LockRefindAndRelease>:        postgres/src/backend/storage/lmgr/lock.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <LockAcquireExtended>:         postgres/src/backend/storage/lmgr/lock.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <KnownAssignedXidsSearch>:     postgres/src/backend/storage/ipc/procarray.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <KnownAssignedXidsGetAndSetXmin>: postgres/src/backend/storage/ipc/procarray.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <KnownAssignedXidsAdd>:        postgres/src/backend/storage/ipc/procarray.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <KeepLogSeg>:                  postgres/src/backend/access/transam/xlog.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <InitWalSender>:               postgres/src/backend/replication/walsender.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <InitProcess>:                 postgres/src/backend/storage/lmgr/proc.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <InitAuxiliaryProcess>:        postgres/src/backend/storage/lmgr/proc.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <HotStandbyActive>:            postgres/src/backend/access/transam/xlog.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <HaveNFreeProcs>:              postgres/src/backend/storage/lmgr/proc.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <GetXLogWriteRecPtr>:          postgres/src/backend/access/transam/xlog.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <GetXLogReplayRecPtr>:         postgres/src/backend/access/transam/xlog.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <GetXLogInsertRecPtr>:         postgres/src/backend/access/transam/xlog.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <GetWalRcvFlushRecPtr>:        postgres/src/backend/replication/walreceiverfuncs.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <GetSnapshotCurrentTimestamp>: postgres/src/backend/utils/time/snapmgr.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <GetReplicationTransferLatency>: postgres/src/backend/replication/walreceiverfuncs.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <GetReplicationApplyDelay>:    postgres/src/backend/replication/walreceiverfuncs.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <GetRedoRecPtr>:               postgres/src/backend/access/transam/xlog.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <GetRecoveryState>:            postgres/src/backend/access/transam/xlog.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <GetOldSnapshotThresholdTimestamp>: postgres/src/backend/utils/time/snapmgr.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <GetLatestXTime>:              postgres/src/backend/access/transam/xlog.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <GetInsertRecPtr>:             postgres/src/backend/access/transam/xlog.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <GetFlushRecPtr>:              postgres/src/backend/access/transam/xlog.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <GetFakeLSNForUnloggedRel>:    postgres/src/backend/access/transam/xlog.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <GetCurrentChunkReplayStartTime>: postgres/src/backend/access/transam/xlog.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <FirstCallSinceLastCheckpoint>: postgres/src/backend/postmaster/checkpointer.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <element_alloc>:               postgres/src/backend/utils/hash/dynahash.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <do_pg_stop_backup>:           postgres/src/backend/access/transam/xlog.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <do_pg_start_backup>:          postgres/src/backend/access/transam/xlog.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <DecodingContextFindStartpoint>: postgres/src/backend/replication/logical/logical.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <ConditionVariableTimedSleep>: postgres/src/backend/storage/lmgr/condition_variable.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <ConditionVariableSignal>:     postgres/src/backend/storage/lmgr/condition_variable.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <ConditionVariablePrepareToSleep>: postgres/src/backend/storage/lmgr/condition_variable.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <ConditionVariableCancelSleep>: postgres/src/backend/storage/lmgr/condition_variable.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <ComputeXidHorizons>:          postgres/src/backend/storage/ipc/procarray.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <CheckXLogRemoved>:            postgres/src/backend/access/transam/xlog.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <CheckRecoveryConsistency.part.11>: postgres/src/backend/access/transam/xlog.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <_bt_parallel_seize>:          postgres/src/backend/access/nbtree/nbtree.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <_bt_parallel_scan_and_sort>:  postgres/src/backend/access/nbtree/nbtsort.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <btparallelrescan>:            postgres/src/backend/access/nbtree/nbtree.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <_bt_parallel_release>:        postgres/src/backend/access/nbtree/nbtree.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <_bt_parallel_done>:           postgres/src/backend/access/nbtree/nbtree.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <_bt_parallel_advance_array_keys>: postgres/src/backend/access/nbtree/nbtree.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <btbuild>:                     postgres/src/backend/access/nbtree/nbtsort.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <BarrierParticipants>:         postgres/src/backend/storage/ipc/barrier.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <BarrierDetach>:               postgres/src/backend/storage/ipc/barrier.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <BarrierAttach>:               postgres/src/backend/storage/ipc/barrier.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <BarrierArriveAndDetach>:      postgres/src/backend/storage/ipc/barrier.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <BarrierArriveAndDetachExceptLast>: postgres/src/backend/storage/ipc/barrier.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <AuxiliaryProcKill>:           postgres/src/backend/storage/lmgr/proc.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <AdvanceXLInsertBuffer>:       postgres/src/backend/access/transam/xlog.o
      1 (LSE instructions)          <__aarch64_swp4_acq> <AbortStrongLockAcquire>:      postgres/src/backend/storage/lmgr/lock.o
      1 (LSE instructions)          <__aarch64_ldset4_acq_rel> <ProcessProcSignalBarrier>:    postgres/src/backend/storage/ipc/procsignal.o
      1 (LSE instructions)          <__aarch64_ldset4_acq_rel> <LWLockWaitForVar>:            postgres/src/backend/storage/lmgr/lwlock.o
      1 (LSE instructions)          <__aarch64_ldset4_acq_rel> <LWLockQueueSelf>:             postgres/src/backend/storage/lmgr/lwlock.o
      1 (LSE instructions)          <__aarch64_ldset4_acq_rel> <LWLockDequeueSelf>:           postgres/src/backend/storage/lmgr/lwlock.o
      1 (LSE instructions)          <__aarch64_ldset4_acq_rel> <LWLockAcquire>:               postgres/src/backend/storage/lmgr/lwlock.o
      1 (LSE instructions)          <__aarch64_ldset4_acq_rel> <LockBufHdr>:                  postgres/src/backend/storage/buffer/bufmgr.o
      1 (LSE instructions)          <__aarch64_ldset4_acq_rel> <EmitProcSignalBarrier>:       postgres/src/backend/storage/ipc/procsignal.o
      1 (LSE instructions)          <__aarch64_ldclr4_acq_rel> <LWLockReleaseClearVar>:       postgres/src/backend/storage/lmgr/lwlock.o
      1 (LSE instructions)          <__aarch64_ldadd8_acq_rel> <table_block_parallelscan_nextpage>: postgres/src/backend/access/table/tableam.o
      1 (LSE instructions)          <__aarch64_ldadd8_acq_rel> <EmitProcSignalBarrier>:       postgres/src/backend/storage/ipc/procsignal.o
      1 (LSE instructions)          <__aarch64_ldadd4_acq_rel> <find_or_make_matching_shared_tupledesc>: postgres/src/backend/utils/cache/typcache.o
      1 (LSE instructions)          <__aarch64_ldadd4_acq_rel> <ExecParallelHashJoin>:        postgres/src/backend/executor/nodeHashjoin.o
      1 (LSE instructions)          <__aarch64_cas8_acq_rel> <table_block_parallelscan_reinitialize>: postgres/src/backend/access/table/tableam.o
      1 (LSE instructions)          <__aarch64_cas8_acq_rel> <ProcWakeup>:                  postgres/src/backend/storage/lmgr/proc.o
      1 (LSE instructions)          <__aarch64_cas8_acq_rel> <ProcSleep>:                   postgres/src/backend/storage/lmgr/proc.o
      1 (LSE instructions)          <__aarch64_cas8_acq_rel> <pg_stat_get_wal_receiver>:    postgres/src/backend/replication/walreceiver.o
      1 (LSE instructions)          <__aarch64_cas8_acq_rel> <InitProcess>:                 postgres/src/backend/storage/lmgr/proc.o
      1 (LSE instructions)          <__aarch64_cas8_acq_rel> <InitAuxiliaryProcess>:        postgres/src/backend/storage/lmgr/proc.o
      1 (LSE instructions)          <__aarch64_cas8_acq_rel> <GetWalRcvWriteRecPtr>:        postgres/src/backend/replication/walreceiverfuncs.o
      1 (LSE instructions)          <__aarch64_cas8_acq_rel> <GetLockStatusData>:           postgres/src/backend/storage/lmgr/lock.o
      1 (LSE instructions)          <__aarch64_cas8_acq_rel> <ExecParallelScanHashBucket>:  postgres/src/backend/executor/nodeHash.o
      1 (LSE instructions)          <__aarch64_cas8_acq_rel> <CleanupProcSignalState>:      postgres/src/backend/storage/ipc/procsignal.o
      1 (LSE instructions)          <__aarch64_cas4_acq_rel> <UnpinBuffer.constprop.11>:    postgres/src/backend/storage/buffer/bufmgr.o
      1 (LSE instructions)          <__aarch64_cas4_acq_rel> <StrategySyncStart>:           postgres/src/backend/storage/buffer/freelist.o
      1 (LSE instructions)          <__aarch64_cas4_acq_rel> <StrategyGetBuffer>:           postgres/src/backend/storage/buffer/freelist.o
      1 (LSE instructions)          <__aarch64_cas4_acq_rel> <ProcessProcSignalBarrier>:    postgres/src/backend/storage/ipc/procsignal.o
      1 (LSE instructions)          <__aarch64_cas4_acq_rel> <PinBuffer>:                   postgres/src/backend/storage/buffer/bufmgr.o
      1 (LSE instructions)          <__aarch64_cas4_acq_rel> <MarkBufferDirty>:             postgres/src/backend/storage/buffer/bufmgr.o
      1 (LSE instructions)          <__aarch64_cas4_acq_rel> <LWLockRelease>:               postgres/src/backend/storage/lmgr/lwlock.o
      1 (LSE instructions)          <__aarch64_cas4_acq_rel> <LWLockConditionalAcquire>:    postgres/src/backend/storage/lmgr/lwlock.o

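So this confirms that it was compiled with -march=armv8-a and outline atomics, -moutline-atomics (which is the default in GCC >= 10 and in the GCC 7 compiled in Amazon Linux 2). The LSE (Large System Extensions) helpers are there, and we can see where the atomic instructions are used: WAL and buffer lightweight locks, which protect access to shared memory.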
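To see at a glance which object files carry the most calls to the atomic helpers, the listing can be totalled per file. This is a minimal sketch, assuming the listing has been saved or piped in exactly the format printed above; the function name `sum_by_object` is made up for illustration, and the four input rows are copied from that listing.

```shell
#!/bin/sh
# sum_by_object: read rows of the form
#   COUNT (LSE instructions) <helper> <function>: path/to/file.o
# and print the summed COUNT per object file, largest total first.
sum_by_object() {
    awk '{ total[$NF] += $1 } END { for (f in total) print total[f], f }' | sort -rn
}

# Four rows taken from the listing above.
sum_by_object <<'EOF'
      8 (LSE instructions)          <__aarch64_swp4_acq> <StartupXLOG>:                 postgres/src/backend/access/transam/xlog.o
      6 (LSE instructions)          <__aarch64_ldclr4_acq_rel> <LWLockDequeueSelf>:           postgres/src/backend/storage/lmgr/lwlock.o
      4 (LSE instructions)          <__aarch64_cas4_acq_rel> <LWLockAcquire>:               postgres/src/backend/storage/lmgr/lwlock.o
      3 (LSE instructions)          <__aarch64_swp4_acq> <XLogInsertRecord>:            postgres/src/backend/access/transam/xlog.o
EOF
```

For these four rows the totals are 11 for xlog.o and 10 for lwlock.o. Swapping `$NF` for `$4` in the awk program would total by helper name instead of by file.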

for i in /usr/local/pgsql/bin/postgres $(find postgres/src/backend -name "*.o") ; do objdump -d "$i" | awk '/:$/{w=$2}/aarch64_(cas|casp|swp|ldadd|stadd|ldclr|stclr|ldeor|steor|ldset|stset|ldsmax|stsmax|ldsmin|stsmin|ldumax|stumax|ldumin|stumin)/{printf "%-27s %-40s %-40s %-60s\n","(LSE instructions)",$NF,w,f}/\t(ldxr|ldaxr|stxr|stlxr)\t/{printf "%-27s %-40s %-40s %-60s\n","(load and store exclusives)",$3,w,f}' f="$i" ; done | sort | uniq -c | sort -rn

      1 (load and store exclusives) stxr                                     <__aarch64_swp4_acq>:                    /usr/local/pgsql/bin/postgres
      1 (load and store exclusives) stlxr                                    <__aarch64_ldset4_acq_rel>:              /usr/local/pgsql/bin/postgres
      1 (load and store exclusives) stlxr                                    <__aarch64_ldclr4_acq_rel>:              /usr/local/pgsql/bin/postgres
      1 (load and store exclusives) stlxr                                    <__aarch64_ldadd8_acq_rel>:              /usr/local/pgsql/bin/postgres
      1 (load and store exclusives) stlxr                                    <__aarch64_ldadd4_acq_rel>:              /usr/local/pgsql/bin/postgres
      1 (load and store exclusives) stlxr                                    <__aarch64_cas8_acq_rel>:                /usr/local/pgsql/bin/postgres
      1 (load and store exclusives) stlxr                                    <__aarch64_cas4_acq_rel>:                /usr/local/pgsql/bin/postgres
      1 (load and store exclusives) ldaxr                                    <__aarch64_swp4_acq>:                    /usr/local/pgsql/bin/postgres
      1 (load and store exclusives) ldaxr                                    <__aarch64_ldset4_acq_rel>:              /usr/local/pgsql/bin/postgres
      1 (load and store exclusives) ldaxr                                    <__aarch64_ldclr4_acq_rel>:              /usr/local/pgsql/bin/postgres
      1 (load and store exclusives) ldaxr                                    <__aarch64_ldadd8_acq_rel>:              /usr/local/pgsql/bin/postgres
      1 (load and store exclusives) ldaxr                                    <__aarch64_ldadd4_acq_rel>:              /usr/local/pgsql/bin/postgres
      1 (load and store exclusives) ldaxr                                    <__aarch64_cas8_acq_rel>:                /usr/local/pgsql/bin/postgres
      1 (load and store exclusives) ldaxr                                    <__aarch64_cas4_acq_rel>:                /usr/local/pgsql/bin/postgres

This confirms that the PostgreSQL binary also contains load and store exclusives, so the same binary can run on both Graviton and Graviton2.


[ec2-user@ip-172-31-11-116 ~]$ nm /usr/local/pgsql/bin/postgres | grep -E "aarch64(_have_lse_atomics)?"

00000000008fb460 t __aarch64_cas4_acq_rel
00000000008fb490 t __aarch64_cas8_acq_rel
0000000000bbe640 b __aarch64_have_lse_atomics
00000000008fb4f0 t __aarch64_ldadd4_acq_rel
00000000008fb580 t __aarch64_ldadd8_acq_rel
00000000008fb520 t __aarch64_ldclr4_acq_rel
00000000008fb550 t __aarch64_ldset4_acq_rel
00000000008fb4c0 t __aarch64_swp4_acq

This is the run-time detection. Because it was compiled for ARM v8 with outline atomics, the same binary can run on v8 or >= v8.1.
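To see which path the outlined helpers should take on a given instance, the CPU flags can be checked directly. This is a minimal sketch, assuming a Linux/aarch64 host where the kernel lists the `atomics` flag on the `Features` line of /proc/cpuinfo when LSE is available. `has_lse` is a hypothetical helper name, and its optional argument exists only so the function can be tried against a saved copy of cpuinfo.

```shell
#!/usr/bin/env bash
# Report whether the CPU advertises LSE atomics.
# Assumption: Linux on aarch64 exposes the "atomics" flag on the "Features"
# line of /proc/cpuinfo when the Large System Extensions are present.
# has_lse is a hypothetical helper, not something shipped with PostgreSQL or GCC.
has_lse() {
  local cpuinfo="${1:-/proc/cpuinfo}"
  # Keep only the Features lines, then look for the whole word "atomics".
  grep -E '^Features' "$cpuinfo" | grep -qw 'atomics'
}

if has_lse; then
  echo "LSE flag found: the outlined helpers should take the LSE path"
else
  echo "no LSE flag found: the outlined helpers should fall back to load/store exclusives"
fi
```

On Graviton2 (Neoverse N1) the flag should be present. On the first-generation Graviton (Cortex-A72) it should not be, which is the case where the load and store exclusives seen above get used.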


[ec2-user@ip-172-31-11-116 ~]$ gcc --version
gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-12)
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

This is GCC 7 but, on Amazon Linux 2, it has been patched to enable -moutline-atomics by default.

Installing the latest GCC (version 11, experimental)

Here is how I compiled the latest GCC available:


gcc --version
sudo yum -y install bzip2 git gcc gcc-c++ gmp-devel mpfr-devel libmpc-devel make flex bison
git clone https://github.com/gcc-mirror/gcc.git
cd gcc
make distclean
./configure --enable-languages=c,c++
make
sudo make install

This basically fetches the latest GCC from source, compiles it and installs it (remember this is a lab; use a stable release anywhere else).

[ec2-user@ip-172-31-38-254 ~]$ gcc --version
gcc (GCC) 11.0.1 20210309 (experimental)
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Here we are: gcc 11.0.1 20210309 (experimental)

PGIO LIOPS

I'm running the same PGIO as in the previous post:


Date: Wed Mar 10 14:39:38 UTC 2021
Database connect string: "pgio".
Shared buffers: 8500MB.
Testing 4 schemas with 1 thread(s) accessing 1024M (131072 blocks) of each schema.
Running iostat, vmstat and mpstat on current host--in background.
Launching sessions. 4 schema(s) will be accessed by 1 thread(s) each.
pg_stat_database stats:
          datname| blks_hit| blks_read|tup_returned|tup_fetched|tup_updated
BEFORE:  pgio    | 38262338086 |    562443 |  37644815538 | 37635763756 |          24
AFTER:   pgio    | 49691750429 |    562449 |  48890461241 | 48878858651 |          49
DBNAME:  pgio. 4 schemas, 1 threads(each). Run time: 3600 seconds. RIOPS >793709<

This is a bit higher than what I had before: 793709 LIOPS per CPU, where I had 780651 with GCC 7. It is still lower than the 896280 I had on x86.
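As a sanity check, the per-thread figure can be recomputed from the pg_stat_database counters in the output above. This is only a sketch: `liops_per_thread` is a hypothetical helper, not part of PGIO, and it assumes the rate is the blks_hit delta divided by the run time and the number of sessions. With these numbers that happens to reproduce the reported value.

```shell
#!/usr/bin/env bash
# liops_per_thread is a hypothetical helper, not part of PGIO.
# Arguments: blks_hit before, blks_hit after, run time in seconds, sessions.
# Assumption: logical reads per second per thread = delta(blks_hit) / (seconds * sessions).
liops_per_thread() {
  local before=$1 after=$2 seconds=$3 sessions=$4
  # Integer arithmetic is enough here; bash uses 64-bit integers.
  echo $(( (after - before) / (seconds * sessions) ))
}

# Values copied from the BEFORE/AFTER lines above (4 schemas x 1 thread, 3600 s).
liops_per_thread 38262338086 49691750429 3600 4   # prints 793709
```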

Of course, more optimizations are possible, as described in https://github.com/aws/aws-graviton-getting-started/blob/master/c-c++.md

I'll recompile with the recommended flags:

(
cd postgres
CFLAGS="-march=armv8.2-a+fp16+rcpc+dotprod+crypto -mtune=neoverse-n1 -fsigned-char" ./configure
make clean
make
make install
)

I see no difference in my PGIO run. Of course, this may change with read-write workloads with checksums (more spinlocks).
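One way to check what the recompilation changed in the binary is to summarize the disassembly. When GCC targets -march=armv8.1-a or later it emits LSE instructions inline instead of calling the `__aarch64_*` helpers, so the resulting binary needs a CPU with LSE. The sketch below is an assumption-laden helper, not the author's method: `classify_atomics` is a hypothetical name, it reads `objdump -d` text on stdin, and it only knows the instruction families that appeared in the listings above.

```shell
#!/usr/bin/env bash
# classify_atomics is a hypothetical helper: it reads `objdump -d` output on
# stdin and counts three kinds of lines:
#   outlined  - branches to the __aarch64_* outline-atomics helpers
#   inlined   - LSE instructions (cas/swp/ldadd/ldclr/ldeor/ldset and variants)
#   exclusive - load/store exclusives (ldxr/ldaxr/stxr/stlxr)
# Instructions inside the helpers themselves are counted too, so a binary
# built with outline atomics still shows a few "inlined" and "exclusive" lines.
classify_atomics() {
  awk '
    /\t(bl|b)\t.*__aarch64_(cas|swp|ld(add|clr|eor|set))[0-9]+_/ { outlined++; next }
    /\t(cas|swp|ldadd|ldclr|ldeor|ldset)(a|l|al)?[bh]?\t/       { inlined++;  next }
    /\t(ldxr|ldaxr|stxr|stlxr)\t/                                { exclusive++ }
    END { printf "outlined=%d inlined=%d exclusive=%d\n", outlined, inlined, exclusive }
  '
}

# Only run against the installed binary when it is actually there.
if [ -x /usr/local/pgsql/bin/postgres ] && command -v objdump >/dev/null; then
  objdump -d /usr/local/pgsql/bin/postgres | classify_atomics
fi
```

With the default flags I would expect mostly `outlined` calls; with the armv8.2-a flags above, the expectation is `outlined=0` and a larger `inlined` count.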

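Note that I first compiled with the default (empty) CFLAGS, in which case gcc is called with -march=armv8-a+crc (and -moutline-atomics is the default), so I was already in the same situation, with run-time detection. This is because the GCC >= 10 behavior has been backported by Amazon to GCC 7 in Amazon Linux 2. This was not clear to me initially (I got clarification on it later).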

By the way, Aurora on Graviton2 is still compiled with GCC 7.4.

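Update 15-MAY-2021: I rewrote a few things here that were unclear (even to myself), and I'll write more about PostgreSQL on ARM and about benchmarks in general. http://blog.pachot.net should redirect to the right place (or, of course, @FranckPachot on Twitter).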
