AWS: PostgreSQL on Graviton2 with newer GCC
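In the previous post I ran PostgreSQL on an AWS m6gd.2xlarge (ARM Graviton2 processor).
I didn't give the exact compilation options there. This post provides more details, following this feedback: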
https://twitter.com/N_B__N_B/status/136918884608315398
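First, PostgreSQL ./configure correctly detected ARM and compiled with the following flag: -march=armv8-a+crc
This is ARM v8. However, LSE (Large System Extensions) for atomic instructions was added later, in ARM v8.1, and it can make a huge difference for PostgreSQL, especially with spinlocks under high CPU usage.
I checked the compiled binaries following the information in https://github.com/aws/aws-graviton-getting-started/blob/master/c-c++.md.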
for i in $(find postgres/src/backend -name "*.o") ; do objdump -d "$i" | awk '/:$/{w=$2}/aarch64_(cas|casp|swp|ldadd|stadd|ldclr|stclr|ldeor|steor|ldset|stset|ldsmax|stsmax|ldsmin|stsmin|ldumax|stumax|ldumin|stumin)/{printf "%-27s %-20s %-30s %-60s\n","(LSE instructions)",$NF,w,f}' f="$i" ; done | sort | uniq -c | sort -rnk1,4
8 (LSE instructions) <__aarch64_swp4_acq> <StartupXLOG>: postgres/src/backend/access/transam/xlog.o
7 (LSE instructions) <__aarch64_swp4_acq> <BitmapHeapNext>: postgres/src/backend/executor/nodeBitmapHeapscan.o
6 (LSE instructions) <__aarch64_ldclr4_acq_rel> <LWLockDequeueSelf>: postgres/src/backend/storage/lmgr/lwlock.o
6 (LSE instructions) <__aarch64_cas8_acq_rel> <shm_mq_send_bytes>: postgres/src/backend/storage/ipc/shm_mq.o
5 (LSE instructions) <__aarch64_swp4_acq> <WalReceiverMain>: postgres/src/backend/replication/walreceiver.o
5 (LSE instructions) <__aarch64_cas8_acq_rel> <shm_mq_receive_bytes.isra.0>: postgres/src/backend/storage/ipc/shm_mq.o
4 (LSE instructions) <__aarch64_swp4_acq> <ProcessRepliesIfAny>: postgres/src/backend/replication/walsender.o
4 (LSE instructions) <__aarch64_swp4_acq> <hash_search_with_hash_value>: postgres/src/backend/utils/hash/dynahash.o
4 (LSE instructions) <__aarch64_swp4_acq> <copy_replication_slot>: postgres/src/backend/replication/slotfuncs.o
4 (LSE instructions) <__aarch64_ldadd4_acq_rel> <parallel_vacuum_index>: postgres/src/backend/access/heap/vacuumlazy.o
4 (LSE instructions) <__aarch64_cas4_acq_rel> <LWLockAcquire>: postgres/src/backend/storage/lmgr/lwlock.o
3 (LSE instructions) <__aarch64_swp4_acq> <xlog_redo>: postgres/src/backend/access/transam/xlog.o
3 (LSE instructions) <__aarch64_swp4_acq> <XLogInsertRecord>: postgres/src/backend/access/transam/xlog.o
3 (LSE instructions) <__aarch64_swp4_acq> <SaveSlotToPath>: postgres/src/backend/replication/slot.o
3 (LSE instructions) <__aarch64_swp4_acq> <RequestCheckpoint>: postgres/src/backend/postmaster/checkpointer.o
3 (LSE instructions) <__aarch64_swp4_acq> <LogicalRepSyncTableStart>: postgres/src/backend/replication/logical/tablesync.o
3 (LSE instructions) <__aarch64_swp4_acq> <LogicalConfirmReceivedLocation>: postgres/src/backend/replication/logical/logical.o
3 (LSE instructions) <__aarch64_swp4_acq> <InvalidateObsoleteReplicationSlots>: postgres/src/backend/replication/slot.o
3 (LSE instructions) <__aarch64_swp4_acq> <CreateInitDecodingContext>: postgres/src/backend/replication/logical/logical.o
3 (LSE instructions) <__aarch64_swp4_acq> <CreateCheckPoint>: postgres/src/backend/access/transam/xlog.o
3 (LSE instructions) <__aarch64_swp4_acq> <CheckpointerMain>: postgres/src/backend/postmaster/checkpointer.o
3 (LSE instructions) <__aarch64_ldclr4_acq_rel> <LWLockQueueSelf>: postgres/src/backend/storage/lmgr/lwlock.o
3 (LSE instructions) <__aarch64_ldadd4_acq_rel> <tbm_prepare_shared_iterate>: postgres/src/backend/nodes/tidbitmap.o
3 (LSE instructions) <__aarch64_ldadd4_acq_rel> <tbm_free_shared_area>: postgres/src/backend/nodes/tidbitmap.o
3 (LSE instructions) <__aarch64_cas8_acq_rel> <ProcessProcSignalBarrier>: postgres/src/backend/storage/ipc/procsignal.o
3 (LSE instructions) <__aarch64_cas8_acq_rel> <ExecParallelHashIncreaseNumBatches>: postgres/src/backend/executor/nodeHash.o
2 (LSE instructions) <__aarch64_swp4_acq> <XLogWrite>: postgres/src/backend/access/transam/xlog.o
2 (LSE instructions) <__aarch64_swp4_acq> <XLogSendPhysical>: postgres/src/backend/replication/walsender.o
2 (LSE instructions) <__aarch64_swp4_acq> <XLogBackgroundFlush>: postgres/src/backend/access/transam/xlog.o
2 (LSE instructions) <__aarch64_swp4_acq> <WalRcvStreaming>: postgres/src/backend/replication/walreceiverfuncs.o
2 (LSE instructions) <__aarch64_swp4_acq> <WalRcvRunning>: postgres/src/backend/replication/walreceiverfuncs.o
2 (LSE instructions) <__aarch64_swp4_acq> <WalRcvDie>: postgres/src/backend/replication/walreceiver.o
2 (LSE instructions) <__aarch64_swp4_acq> <TransactionIdLimitedForOldSnapshots>: postgres/src/backend/utils/time/snapmgr.o
2 (LSE instructions) <__aarch64_swp4_acq> <StrategyGetBuffer>: postgres/src/backend/storage/buffer/freelist.o
2 (LSE instructions) <__aarch64_swp4_acq> <shm_mq_wait_internal>: postgres/src/backend/storage/ipc/shm_mq.o
2 (LSE instructions) <__aarch64_swp4_acq> <ReplicationSlotReserveWal>: postgres/src/backend/replication/slot.o
2 (LSE instructions) <__aarch64_swp4_acq> <ReplicationSlotRelease>: postgres/src/backend/replication/slot.o
2 (LSE instructions) <__aarch64_swp4_acq> <ProcKill>: postgres/src/backend/storage/lmgr/proc.o
2 (LSE instructions) <__aarch64_swp4_acq> <process_syncing_tables>: postgres/src/backend/replication/logical/tablesync.o
2 (LSE instructions) <__aarch64_swp4_acq> <pg_get_replication_slots>: postgres/src/backend/replication/slotfuncs.o
2 (LSE instructions) <__aarch64_swp4_acq> <exec_replication_command>: postgres/src/backend/replication/walsender.o
2 (LSE instructions) <__aarch64_swp4_acq> <CreateRestartPoint>: postgres/src/backend/access/transam/xlog.o
2 (LSE instructions) <__aarch64_swp4_acq> <ConditionVariableBroadcast>: postgres/src/backend/storage/lmgr/condition_variable.o
2 (LSE instructions) <__aarch64_swp4_acq> <BarrierArriveAndWait>: postgres/src/backend/storage/ipc/barrier.o
2 (LSE instructions) <__aarch64_ldset4_acq_rel> <LWLockWaitListLock>: postgres/src/backend/storage/lmgr/lwlock.o
2 (LSE instructions) <__aarch64_ldclr4_acq_rel> <LWLockWaitForVar>: postgres/src/backend/storage/lmgr/lwlock.o
2 (LSE instructions) <__aarch64_ldclr4_acq_rel> <LWLockUpdateVar>: postgres/src/backend/storage/lmgr/lwlock.o
2 (LSE instructions) <__aarch64_ldadd4_acq_rel> <vacuum_delay_point>: postgres/src/backend/commands/vacuum.o
2 (LSE instructions) <__aarch64_ldadd4_acq_rel> <StrategyGetBuffer>: postgres/src/backend/storage/buffer/freelist.o
2 (LSE instructions) <__aarch64_ldadd4_acq_rel> <LWLockRelease>: postgres/src/backend/storage/lmgr/lwlock.o
2 (LSE instructions) <__aarch64_ldadd4_acq_rel> <lazy_parallel_vacuum_indexes>: postgres/src/backend/access/heap/vacuumlazy.o
2 (LSE instructions) <__aarch64_cas8_acq_rel> <WalReceiverMain>: postgres/src/backend/replication/walreceiver.o
2 (LSE instructions) <__aarch64_cas8_acq_rel> <WaitForProcSignalBarrier>: postgres/src/backend/storage/ipc/procsignal.o
2 (LSE instructions) <__aarch64_cas8_acq_rel> <shm_mq_receive>: postgres/src/backend/storage/ipc/shm_mq.o
2 (LSE instructions) <__aarch64_cas8_acq_rel> <ResolveRecoveryConflictWithLock>: postgres/src/backend/storage/ipc/standby.o
2 (LSE instructions) <__aarch64_cas8_acq_rel> <ProcSignalInit>: postgres/src/backend/storage/ipc/procsignal.o
2 (LSE instructions) <__aarch64_cas8_acq_rel> <ExecParallelHashTableInsert>: postgres/src/backend/executor/nodeHash.o
2 (LSE instructions) <__aarch64_cas8_acq_rel> <ExecParallelHashTableInsertCurrentBatch>: postgres/src/backend/executor/nodeHash.o
2 (LSE instructions) <__aarch64_cas8_acq_rel> <ExecParallelHashIncreaseNumBuckets>: postgres/src/backend/executor/nodeHash.o
2 (LSE instructions) <__aarch64_cas4_acq_rel> <TransactionIdSetTreeStatus>: postgres/src/backend/access/transam/clog.o
2 (LSE instructions) <__aarch64_cas4_acq_rel> <ProcArrayEndTransaction>: postgres/src/backend/storage/ipc/procarray.o
2 (LSE instructions) <__aarch64_cas4_acq_rel> <LWLockAcquireOrWait>: postgres/src/backend/storage/lmgr/lwlock.o
1 (LSE instructions) <__aarch64_swp4_acq> <XLogWalRcvFlush.part.4>: postgres/src/backend/replication/walreceiver.o
1 (LSE instructions) <__aarch64_swp4_acq> <XLogSetReplicationSlotMinimumLSN>: postgres/src/backend/access/transam/xlog.o
1 (LSE instructions) <__aarch64_swp4_acq> <XLogSetAsyncXactLSN>: postgres/src/backend/access/transam/xlog.o
1 (LSE instructions) <__aarch64_swp4_acq> <XLogSendLogical>: postgres/src/backend/replication/walsender.o
1 (LSE instructions) <__aarch64_swp4_acq> <XLogPageRead>: postgres/src/backend/access/transam/xlog.o
1 (LSE instructions) <__aarch64_swp4_acq> <XLogNeedsFlush>: postgres/src/backend/access/transam/xlog.o
1 (LSE instructions) <__aarch64_swp4_acq> <XLogGetLastRemovedSegno>: postgres/src/backend/access/transam/xlog.o
1 (LSE instructions) <__aarch64_swp4_acq> <XLogFlush>: postgres/src/backend/access/transam/xlog.o
1 (LSE instructions) <__aarch64_swp4_acq> <worker_freeze_result_tape>: postgres/src/backend/utils/sort/tuplesort.o
1 (LSE instructions) <__aarch64_swp4_acq> <WalSndWakeup>: postgres/src/backend/replication/walsender.o
1 (LSE instructions) <__aarch64_swp4_acq> <WalSndWaitStopping>: postgres/src/backend/replication/walsender.o
1 (LSE instructions) <__aarch64_swp4_acq> <WalSndSetState>: postgres/src/backend/replication/walsender.o
1 (LSE instructions) <__aarch64_swp4_acq> <WalSndRqstFileReload>: postgres/src/backend/replication/walsender.o
1 (LSE instructions) <__aarch64_swp4_acq> <WalSndKill>: postgres/src/backend/replication/walsender.o
1 (LSE instructions) <__aarch64_swp4_acq> <WalSndInitStopping>: postgres/src/backend/replication/walsender.o
1 (LSE instructions) <__aarch64_swp4_acq> <WalRcvForceReply>: postgres/src/backend/replication/walreceiver.o
1 (LSE instructions) <__aarch64_swp4_acq> <WaitXLogInsertionsToFinish>: postgres/src/backend/access/transam/xlog.o
1 (LSE instructions) <__aarch64_swp4_acq> <UpdateMinRecoveryPoint.part.10>: postgres/src/backend/access/transam/xlog.o
1 (LSE instructions) <__aarch64_swp4_acq> <tuplesort_performsort>: postgres/src/backend/utils/sort/tuplesort.o
1 (LSE instructions) <__aarch64_swp4_acq> <tuplesort_begin_common>: postgres/src/backend/utils/sort/tuplesort.o
1 (LSE instructions) <__aarch64_swp4_acq> <table_block_parallelscan_startblock_init>: postgres/src/backend/access/table/tableam.o
1 (LSE instructions) <__aarch64_swp4_acq> <SyncRepInitConfig>: postgres/src/backend/replication/syncrep.o
1 (LSE instructions) <__aarch64_swp4_acq> <SyncRepGetCandidateStandbys>: postgres/src/backend/replication/syncrep.o
1 (LSE instructions) <__aarch64_swp4_acq> <StrategySyncStart>: postgres/src/backend/storage/buffer/freelist.o
1 (LSE instructions) <__aarch64_swp4_acq> <StrategyNotifyBgWriter>: postgres/src/backend/storage/buffer/freelist.o
1 (LSE instructions) <__aarch64_swp4_acq> <StrategyFreeBuffer>: postgres/src/backend/storage/buffer/freelist.o
1 (LSE instructions) <__aarch64_swp4_acq> <SnapshotTooOldMagicForTest>: postgres/src/backend/utils/time/snapmgr.o
1 (LSE instructions) <__aarch64_swp4_acq> <s_lock>: postgres/src/backend/storage/lmgr/s_lock.o
1 (LSE instructions) <__aarch64_swp4_acq> <SIInsertDataEntries>: postgres/src/backend/storage/ipc/sinvaladt.o
1 (LSE instructions) <__aarch64_swp4_acq> <SIGetDataEntries>: postgres/src/backend/storage/ipc/sinvaladt.o
1 (LSE instructions) <__aarch64_swp4_acq> <ShutdownWalRcv>: postgres/src/backend/replication/walreceiverfuncs.o
1 (LSE instructions) <__aarch64_swp4_acq> <shm_toc_insert>: postgres/src/backend/storage/ipc/shm_toc.o
1 (LSE instructions) <__aarch64_swp4_acq> <shm_toc_freespace>: postgres/src/backend/storage/ipc/shm_toc.o
1 (LSE instructions) <__aarch64_swp4_acq> <shm_toc_allocate>: postgres/src/backend/storage/ipc/shm_toc.o
1 (LSE instructions) <__aarch64_swp4_acq> <shm_mq_set_sender>: postgres/src/backend/storage/ipc/shm_mq.o
1 (LSE instructions) <__aarch64_swp4_acq> <shm_mq_set_receiver>: postgres/src/backend/storage/ipc/shm_mq.o
1 (LSE instructions) <__aarch64_swp4_acq> <shm_mq_sendv>: postgres/src/backend/storage/ipc/shm_mq.o
1 (LSE instructions) <__aarch64_swp4_acq> <shm_mq_get_sender>: postgres/src/backend/storage/ipc/shm_mq.o
1 (LSE instructions) <__aarch64_swp4_acq> <shm_mq_get_receiver>: postgres/src/backend/storage/ipc/shm_mq.o
1 (LSE instructions) <__aarch64_swp4_acq> <shm_mq_detach_internal>: postgres/src/backend/storage/ipc/shm_mq.o
1 (LSE instructions) <__aarch64_swp4_acq> <ShmemAllocRaw>: postgres/src/backend/storage/ipc/shmem.o
1 (LSE instructions) <__aarch64_swp4_acq> <SharedFileSetOnDetach>: postgres/src/backend/storage/file/sharedfileset.o
1 (LSE instructions) <__aarch64_swp4_acq> <SharedFileSetAttach>: postgres/src/backend/storage/file/sharedfileset.o
1 (LSE instructions) <__aarch64_swp4_acq> <SetWalWriterSleeping>: postgres/src/backend/access/transam/xlog.o
1 (LSE instructions) <__aarch64_swp4_acq> <SetRecoveryPause>: postgres/src/backend/access/transam/xlog.o
1 (LSE instructions) <__aarch64_swp4_acq> <SetPromoteIsTriggered>: postgres/src/backend/access/transam/xlog.o
1 (LSE instructions) <__aarch64_swp4_acq> <SetOldSnapshotThresholdTimestamp>: postgres/src/backend/utils/time/snapmgr.o
1 (LSE instructions) <__aarch64_swp4_acq> <RequestXLogStreaming>: postgres/src/backend/replication/walreceiverfuncs.o
1 (LSE instructions) <__aarch64_swp4_acq> <ReplicationSlotsDropDBSlots>: postgres/src/backend/replication/slot.o
1 (LSE instructions) <__aarch64_swp4_acq> <ReplicationSlotsCountDBSlots>: postgres/src/backend/replication/slot.o
1 (LSE instructions) <__aarch64_swp4_acq> <ReplicationSlotsComputeRequiredXmin>: postgres/src/backend/replication/slot.o
1 (LSE instructions) <__aarch64_swp4_acq> <ReplicationSlotsComputeRequiredLSN>: postgres/src/backend/replication/slot.o
1 (LSE instructions) <__aarch64_swp4_acq> <ReplicationSlotsComputeLogicalRestartLSN>: postgres/src/backend/replication/slot.o
1 (LSE instructions) <__aarch64_swp4_acq> <ReplicationSlotPersist>: postgres/src/backend/replication/slot.o
1 (LSE instructions) <__aarch64_swp4_acq> <ReplicationSlotMarkDirty>: postgres/src/backend/replication/slot.o
1 (LSE instructions) <__aarch64_swp4_acq> <ReplicationSlotDropPtr>: postgres/src/backend/replication/slot.o
1 (LSE instructions) <__aarch64_swp4_acq> <ReplicationSlotCreate>: postgres/src/backend/replication/slot.o
1 (LSE instructions) <__aarch64_swp4_acq> <ReplicationSlotCleanup>: postgres/src/backend/replication/slot.o
1 (LSE instructions) <__aarch64_swp4_acq> <ReplicationSlotAcquireInternal>: postgres/src/backend/replication/slot.o
1 (LSE instructions) <__aarch64_swp4_acq> <RemoveOldXlogFiles>: postgres/src/backend/access/transam/xlog.o
1 (LSE instructions) <__aarch64_swp4_acq> <RemoveLocalLock>: postgres/src/backend/storage/lmgr/lock.o
1 (LSE instructions) <__aarch64_swp4_acq> <RecoveryRestartPoint>: postgres/src/backend/access/transam/xlog.o
1 (LSE instructions) <__aarch64_swp4_acq> <RecoveryIsPaused>: postgres/src/backend/access/transam/xlog.o
1 (LSE instructions) <__aarch64_swp4_acq> <ReadRecord>: postgres/src/backend/access/transam/xlog.o
1 (LSE instructions) <__aarch64_swp4_acq> <PublishStartupProcessInformation>: postgres/src/backend/storage/lmgr/proc.o
1 (LSE instructions) <__aarch64_swp4_acq> <PromoteIsTriggered>: postgres/src/backend/access/transam/xlog.o
1 (LSE instructions) <__aarch64_swp4_acq> <ProcSendSignal>: postgres/src/backend/storage/lmgr/proc.o
1 (LSE instructions) <__aarch64_swp4_acq> <ProcessWalSndrMessage>: postgres/src/backend/replication/walreceiver.o
1 (LSE instructions) <__aarch64_swp4_acq> <PhysicalReplicationSlotNewXmin>: postgres/src/backend/replication/walsender.o
1 (LSE instructions) <__aarch64_swp4_acq> <pg_stat_get_wal_senders>: postgres/src/backend/replication/walsender.o
1 (LSE instructions) <__aarch64_swp4_acq> <pg_stat_get_wal_receiver>: postgres/src/backend/replication/walreceiver.o
1 (LSE instructions) <__aarch64_swp4_acq> <pg_replication_slot_advance>: postgres/src/backend/replication/slotfuncs.o
1 (LSE instructions) <__aarch64_swp4_acq> <ParallelWorkerReportLastRecEnd>: postgres/src/backend/access/transam/parallel.o
1 (LSE instructions) <__aarch64_swp4_acq> <MaintainOldSnapshotTimeMapping>: postgres/src/backend/utils/time/snapmgr.o
1 (LSE instructions) <__aarch64_swp4_acq> <LWLockNewTrancheId>: postgres/src/backend/storage/lmgr/lwlock.o
1 (LSE instructions) <__aarch64_swp4_acq> <LogicalIncreaseXminForSlot>: postgres/src/backend/replication/logical/logical.o
1 (LSE instructions) <__aarch64_swp4_acq> <LogicalIncreaseRestartDecodingForSlot>: postgres/src/backend/replication/logical/logical.o
1 (LSE instructions) <__aarch64_swp4_acq> <lock_twophase_recover>: postgres/src/backend/storage/lmgr/lock.o
1 (LSE instructions) <__aarch64_swp4_acq> <LockRefindAndRelease>: postgres/src/backend/storage/lmgr/lock.o
1 (LSE instructions) <__aarch64_swp4_acq> <LockAcquireExtended>: postgres/src/backend/storage/lmgr/lock.o
1 (LSE instructions) <__aarch64_swp4_acq> <KnownAssignedXidsSearch>: postgres/src/backend/storage/ipc/procarray.o
1 (LSE instructions) <__aarch64_swp4_acq> <KnownAssignedXidsGetAndSetXmin>: postgres/src/backend/storage/ipc/procarray.o
1 (LSE instructions) <__aarch64_swp4_acq> <KnownAssignedXidsAdd>: postgres/src/backend/storage/ipc/procarray.o
1 (LSE instructions) <__aarch64_swp4_acq> <KeepLogSeg>: postgres/src/backend/access/transam/xlog.o
1 (LSE instructions) <__aarch64_swp4_acq> <InitWalSender>: postgres/src/backend/replication/walsender.o
1 (LSE instructions) <__aarch64_swp4_acq> <InitProcess>: postgres/src/backend/storage/lmgr/proc.o
1 (LSE instructions) <__aarch64_swp4_acq> <InitAuxiliaryProcess>: postgres/src/backend/storage/lmgr/proc.o
1 (LSE instructions) <__aarch64_swp4_acq> <HotStandbyActive>: postgres/src/backend/access/transam/xlog.o
1 (LSE instructions) <__aarch64_swp4_acq> <HaveNFreeProcs>: postgres/src/backend/storage/lmgr/proc.o
1 (LSE instructions) <__aarch64_swp4_acq> <GetXLogWriteRecPtr>: postgres/src/backend/access/transam/xlog.o
1 (LSE instructions) <__aarch64_swp4_acq> <GetXLogReplayRecPtr>: postgres/src/backend/access/transam/xlog.o
1 (LSE instructions) <__aarch64_swp4_acq> <GetXLogInsertRecPtr>: postgres/src/backend/access/transam/xlog.o
1 (LSE instructions) <__aarch64_swp4_acq> <GetWalRcvFlushRecPtr>: postgres/src/backend/replication/walreceiverfuncs.o
1 (LSE instructions) <__aarch64_swp4_acq> <GetSnapshotCurrentTimestamp>: postgres/src/backend/utils/time/snapmgr.o
1 (LSE instructions) <__aarch64_swp4_acq> <GetReplicationTransferLatency>: postgres/src/backend/replication/walreceiverfuncs.o
1 (LSE instructions) <__aarch64_swp4_acq> <GetReplicationApplyDelay>: postgres/src/backend/replication/walreceiverfuncs.o
1 (LSE instructions) <__aarch64_swp4_acq> <GetRedoRecPtr>: postgres/src/backend/access/transam/xlog.o
1 (LSE instructions) <__aarch64_swp4_acq> <GetRecoveryState>: postgres/src/backend/access/transam/xlog.o
1 (LSE instructions) <__aarch64_swp4_acq> <GetOldSnapshotThresholdTimestamp>: postgres/src/backend/utils/time/snapmgr.o
1 (LSE instructions) <__aarch64_swp4_acq> <GetLatestXTime>: postgres/src/backend/access/transam/xlog.o
1 (LSE instructions) <__aarch64_swp4_acq> <GetInsertRecPtr>: postgres/src/backend/access/transam/xlog.o
1 (LSE instructions) <__aarch64_swp4_acq> <GetFlushRecPtr>: postgres/src/backend/access/transam/xlog.o
1 (LSE instructions) <__aarch64_swp4_acq> <GetFakeLSNForUnloggedRel>: postgres/src/backend/access/transam/xlog.o
1 (LSE instructions) <__aarch64_swp4_acq> <GetCurrentChunkReplayStartTime>: postgres/src/backend/access/transam/xlog.o
1 (LSE instructions) <__aarch64_swp4_acq> <FirstCallSinceLastCheckpoint>: postgres/src/backend/postmaster/checkpointer.o
1 (LSE instructions) <__aarch64_swp4_acq> <element_alloc>: postgres/src/backend/utils/hash/dynahash.o
1 (LSE instructions) <__aarch64_swp4_acq> <do_pg_stop_backup>: postgres/src/backend/access/transam/xlog.o
1 (LSE instructions) <__aarch64_swp4_acq> <do_pg_start_backup>: postgres/src/backend/access/transam/xlog.o
1 (LSE instructions) <__aarch64_swp4_acq> <DecodingContextFindStartpoint>: postgres/src/backend/replication/logical/logical.o
1 (LSE instructions) <__aarch64_swp4_acq> <ConditionVariableTimedSleep>: postgres/src/backend/storage/lmgr/condition_variable.o
1 (LSE instructions) <__aarch64_swp4_acq> <ConditionVariableSignal>: postgres/src/backend/storage/lmgr/condition_variable.o
1 (LSE instructions) <__aarch64_swp4_acq> <ConditionVariablePrepareToSleep>: postgres/src/backend/storage/lmgr/condition_variable.o
1 (LSE instructions) <__aarch64_swp4_acq> <ConditionVariableCancelSleep>: postgres/src/backend/storage/lmgr/condition_variable.o
1 (LSE instructions) <__aarch64_swp4_acq> <ComputeXidHorizons>: postgres/src/backend/storage/ipc/procarray.o
1 (LSE instructions) <__aarch64_swp4_acq> <CheckXLogRemoved>: postgres/src/backend/access/transam/xlog.o
1 (LSE instructions) <__aarch64_swp4_acq> <CheckRecoveryConsistency.part.11>: postgres/src/backend/access/transam/xlog.o
1 (LSE instructions) <__aarch64_swp4_acq> <_bt_parallel_seize>: postgres/src/backend/access/nbtree/nbtree.o
1 (LSE instructions) <__aarch64_swp4_acq> <_bt_parallel_scan_and_sort>: postgres/src/backend/access/nbtree/nbtsort.o
1 (LSE instructions) <__aarch64_swp4_acq> <btparallelrescan>: postgres/src/backend/access/nbtree/nbtree.o
1 (LSE instructions) <__aarch64_swp4_acq> <_bt_parallel_release>: postgres/src/backend/access/nbtree/nbtree.o
1 (LSE instructions) <__aarch64_swp4_acq> <_bt_parallel_done>: postgres/src/backend/access/nbtree/nbtree.o
1 (LSE instructions) <__aarch64_swp4_acq> <_bt_parallel_advance_array_keys>: postgres/src/backend/access/nbtree/nbtree.o
1 (LSE instructions) <__aarch64_swp4_acq> <btbuild>: postgres/src/backend/access/nbtree/nbtsort.o
1 (LSE instructions) <__aarch64_swp4_acq> <BarrierParticipants>: postgres/src/backend/storage/ipc/barrier.o
1 (LSE instructions) <__aarch64_swp4_acq> <BarrierDetach>: postgres/src/backend/storage/ipc/barrier.o
1 (LSE instructions) <__aarch64_swp4_acq> <BarrierAttach>: postgres/src/backend/storage/ipc/barrier.o
1 (LSE instructions) <__aarch64_swp4_acq> <BarrierArriveAndDetach>: postgres/src/backend/storage/ipc/barrier.o
1 (LSE instructions) <__aarch64_swp4_acq> <BarrierArriveAndDetachExceptLast>: postgres/src/backend/storage/ipc/barrier.o
1 (LSE instructions) <__aarch64_swp4_acq> <AuxiliaryProcKill>: postgres/src/backend/storage/lmgr/proc.o
1 (LSE instructions) <__aarch64_swp4_acq> <AdvanceXLInsertBuffer>: postgres/src/backend/access/transam/xlog.o
1 (LSE instructions) <__aarch64_swp4_acq> <AbortStrongLockAcquire>: postgres/src/backend/storage/lmgr/lock.o
1 (LSE instructions) <__aarch64_ldset4_acq_rel> <ProcessProcSignalBarrier>: postgres/src/backend/storage/ipc/procsignal.o
1 (LSE instructions) <__aarch64_ldset4_acq_rel> <LWLockWaitForVar>: postgres/src/backend/storage/lmgr/lwlock.o
1 (LSE instructions) <__aarch64_ldset4_acq_rel> <LWLockQueueSelf>: postgres/src/backend/storage/lmgr/lwlock.o
1 (LSE instructions) <__aarch64_ldset4_acq_rel> <LWLockDequeueSelf>: postgres/src/backend/storage/lmgr/lwlock.o
1 (LSE instructions) <__aarch64_ldset4_acq_rel> <LWLockAcquire>: postgres/src/backend/storage/lmgr/lwlock.o
1 (LSE instructions) <__aarch64_ldset4_acq_rel> <LockBufHdr>: postgres/src/backend/storage/buffer/bufmgr.o
1 (LSE instructions) <__aarch64_ldset4_acq_rel> <EmitProcSignalBarrier>: postgres/src/backend/storage/ipc/procsignal.o
1 (LSE instructions) <__aarch64_ldclr4_acq_rel> <LWLockReleaseClearVar>: postgres/src/backend/storage/lmgr/lwlock.o
1 (LSE instructions) <__aarch64_ldadd8_acq_rel> <table_block_parallelscan_nextpage>: postgres/src/backend/access/table/tableam.o
1 (LSE instructions) <__aarch64_ldadd8_acq_rel> <EmitProcSignalBarrier>: postgres/src/backend/storage/ipc/procsignal.o
1 (LSE instructions) <__aarch64_ldadd4_acq_rel> <find_or_make_matching_shared_tupledesc>: postgres/src/backend/utils/cache/typcache.o
1 (LSE instructions) <__aarch64_ldadd4_acq_rel> <ExecParallelHashJoin>: postgres/src/backend/executor/nodeHashjoin.o
1 (LSE instructions) <__aarch64_cas8_acq_rel> <table_block_parallelscan_reinitialize>: postgres/src/backend/access/table/tableam.o
1 (LSE instructions) <__aarch64_cas8_acq_rel> <ProcWakeup>: postgres/src/backend/storage/lmgr/proc.o
1 (LSE instructions) <__aarch64_cas8_acq_rel> <ProcSleep>: postgres/src/backend/storage/lmgr/proc.o
1 (LSE instructions) <__aarch64_cas8_acq_rel> <pg_stat_get_wal_receiver>: postgres/src/backend/replication/walreceiver.o
1 (LSE instructions) <__aarch64_cas8_acq_rel> <InitProcess>: postgres/src/backend/storage/lmgr/proc.o
1 (LSE instructions) <__aarch64_cas8_acq_rel> <InitAuxiliaryProcess>: postgres/src/backend/storage/lmgr/proc.o
1 (LSE instructions) <__aarch64_cas8_acq_rel> <GetWalRcvWriteRecPtr>: postgres/src/backend/replication/walreceiverfuncs.o
1 (LSE instructions) <__aarch64_cas8_acq_rel> <GetLockStatusData>: postgres/src/backend/storage/lmgr/lock.o
1 (LSE instructions) <__aarch64_cas8_acq_rel> <ExecParallelScanHashBucket>: postgres/src/backend/executor/nodeHash.o
1 (LSE instructions) <__aarch64_cas8_acq_rel> <CleanupProcSignalState>: postgres/src/backend/storage/ipc/procsignal.o
1 (LSE instructions) <__aarch64_cas4_acq_rel> <UnpinBuffer.constprop.11>: postgres/src/backend/storage/buffer/bufmgr.o
1 (LSE instructions) <__aarch64_cas4_acq_rel> <StrategySyncStart>: postgres/src/backend/storage/buffer/freelist.o
1 (LSE instructions) <__aarch64_cas4_acq_rel> <StrategyGetBuffer>: postgres/src/backend/storage/buffer/freelist.o
1 (LSE instructions) <__aarch64_cas4_acq_rel> <ProcessProcSignalBarrier>: postgres/src/backend/storage/ipc/procsignal.o
1 (LSE instructions) <__aarch64_cas4_acq_rel> <PinBuffer>: postgres/src/backend/storage/buffer/bufmgr.o
1 (LSE instructions) <__aarch64_cas4_acq_rel> <MarkBufferDirty>: postgres/src/backend/storage/buffer/bufmgr.o
1 (LSE instructions) <__aarch64_cas4_acq_rel> <LWLockRelease>: postgres/src/backend/storage/lmgr/lwlock.o
1 (LSE instructions) <__aarch64_cas4_acq_rel> <LWLockConditionalAcquire>: postgres/src/backend/storage/lmgr/lwlock.o
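So this confirms that it was compiled with -march=armv8-a and outline atomics, -moutline-atomics (this is the default in GCC >= 10 and in the GCC 7 built for Amazon Linux 2). LSE (Large System Extensions) support is there, and we can see where atomic instructions are used: WAL and buffer lightweight locks, which protect access to shared memory.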
for i in /usr/local/pgsql/bin/postgres $(find postgres/src/backend -name "*.o") ; do objdump -d "$i" | awk '/:$/{w=$2}/aarch64_(cas|casp|swp|ldadd|stadd|ldclr|stclr|ldeor|steor|ldset|stset|ldsmax|stsmax|ldsmin|stsmin|ldumax|stumax|ldumin|stumin)/{printf "%-27s %-40s %-40s %-60s\n","(LSE instructions)",$NF,w,f}/\t(ldxr|ldaxr|stxr|stlxr)\t/{printf "%-27s %-40s %-40s %-60s\n","(load and store exclusives)",$3,w,f}' f="$i" ; done | sort | uniq -c | sort -rn
1 (load and store exclusives) stxr <__aarch64_swp4_acq>: /usr/local/pgsql/bin/postgres
1 (load and store exclusives) stlxr <__aarch64_ldset4_acq_rel>: /usr/local/pgsql/bin/postgres
1 (load and store exclusives) stlxr <__aarch64_ldclr4_acq_rel>: /usr/local/pgsql/bin/postgres
1 (load and store exclusives) stlxr <__aarch64_ldadd8_acq_rel>: /usr/local/pgsql/bin/postgres
1 (load and store exclusives) stlxr <__aarch64_ldadd4_acq_rel>: /usr/local/pgsql/bin/postgres
1 (load and store exclusives) stlxr <__aarch64_cas8_acq_rel>: /usr/local/pgsql/bin/postgres
1 (load and store exclusives) stlxr <__aarch64_cas4_acq_rel>: /usr/local/pgsql/bin/postgres
1 (load and store exclusives) ldaxr <__aarch64_swp4_acq>: /usr/local/pgsql/bin/postgres
1 (load and store exclusives) ldaxr <__aarch64_ldset4_acq_rel>: /usr/local/pgsql/bin/postgres
1 (load and store exclusives) ldaxr <__aarch64_ldclr4_acq_rel>: /usr/local/pgsql/bin/postgres
1 (load and store exclusives) ldaxr <__aarch64_ldadd8_acq_rel>: /usr/local/pgsql/bin/postgres
1 (load and store exclusives) ldaxr <__aarch64_ldadd4_acq_rel>: /usr/local/pgsql/bin/postgres
1 (load and store exclusives) ldaxr <__aarch64_cas8_acq_rel>: /usr/local/pgsql/bin/postgres
1 (load and store exclusives) ldaxr <__aarch64_cas4_acq_rel>: /usr/local/pgsql/bin/postgres
This confirms that the PostgreSQL binary also contains load and store exclusives, so that the same binary can run on both Graviton and Graviton2.
[ec2-user@ip-172-31-11-116 ~]$ nm /usr/local/pgsql/bin/postgres | grep -E "aarch64(_have_lse_atomics)?"
00000000008fb460 t __aarch64_cas4_acq_rel
00000000008fb490 t __aarch64_cas8_acq_rel
0000000000bbe640 b __aarch64_have_lse_atomics
00000000008fb4f0 t __aarch64_ldadd4_acq_rel
00000000008fb580 t __aarch64_ldadd8_acq_rel
00000000008fb520 t __aarch64_ldclr4_acq_rel
00000000008fb550 t __aarch64_ldset4_acq_rel
00000000008fb4c0 t __aarch64_swp4_acq
This is run-time detection. Because it is compiled for ARM v8 with outline atomics, the same binary can run on v8 or on >= v8.1.
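To see which path the run-time detection is likely to take on a given host, you can look at the CPU feature flags. This is a minimal sketch, assuming the Linux arm64 kernel reports LSE as the `atomics` flag on the `Features` line of /proc/cpuinfo. The `has_lse` function name is mine, for illustration only.

```shell
# Report whether a cpuinfo-style file advertises LSE atomics.
# Assumption: the Linux arm64 kernel lists LSE as the "atomics" feature flag.
has_lse() {
  # $1: path to a cpuinfo-style file (defaults to /proc/cpuinfo)
  if grep -m1 -E '^Features[[:space:]]*:' "${1:-/proc/cpuinfo}" | grep -qw atomics; then
    echo "LSE available"
  else
    echo "LSE not available"
  fi
}

# Check the current host
has_lse
```

On a Graviton2 instance I would expect "LSE available". On the first-generation Graviton, or on a non-ARM host where there is no Features line, it should print "LSE not available".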
[ec2-user@ip-172-31-11-116 ~]$ gcc --version
gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-12)
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
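This is GCC 7 but, on Amazon Linux 2, it has been patched to enable -moutline-atomics by default.
Installing the latest version of GCC (11, experimental)
Here is how I compiled the latest GCC available: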
gcc --version
sudo yum -y install bzip2 git gcc gcc-c++ gmp-devel mpfr-devel libmpc-devel make flex bison
git clone https://github.com/gcc-mirror/gcc.git
cd gcc
make distclean
./configure --enable-languages=c,c++
make
sudo make install
This basically gets the latest GCC from source, compiles it, and installs it (remember that this is a lab: use a stable version anywhere else).
[ec2-user@ip-172-31-38-254 ~]$ gcc --version
gcc (GCC) 11.0.1 20210309 (experimental)
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Here we are: gcc 11.0.1 20210309 (experimental)
PGIO LIOPS
I am running the same PGIO as in the previous post:
Date: Wed Mar 10 14:39:38 UTC 2021
Database connect string: "pgio".
Shared buffers: 8500MB.
Testing 4 schemas with 1 thread(s) accessing 1024M (131072 blocks) of each schema.
Running iostat, vmstat and mpstat on current host--in background.
Launching sessions. 4 schema(s) will be accessed by 1 thread(s) each.
pg_stat_database stats:
datname| blks_hit| blks_read|tup_returned|tup_fetched|tup_updated
BEFORE: pgio | 38262338086 | 562443 | 37644815538 | 37635763756 | 24
AFTER: pgio | 49691750429 | 562449 | 48890461241 | 48878858651 | 49
DBNAME: pgio. 4 schemas, 1 threads(each). Run time: 3600 seconds. RIOPS >793709<
This is a bit higher than what I had before: 793709 LIOPS / CPU, where I had 780651 with GCC 7 (about 1.7% more), but this is still lower than the 896280 I had on x86.
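As a sanity check, the reported figure is consistent with the blks_hit delta from the pg_stat_database lines, divided by the run time and by the 4 sessions. This is my reading of the numbers, not a statement of how PGIO computes its result internally. The variable names are mine.

```shell
# Values copied from the pg_stat_database BEFORE/AFTER lines of the PGIO run.
blks_hit_before=38262338086
blks_hit_after=49691750429
run_seconds=3600
threads=4   # 4 schemas x 1 thread each

# Logical reads per second, per session (integer division)
liops_per_thread=$(( (blks_hit_after - blks_hit_before) / run_seconds / threads ))
echo "LIOPS per thread: $liops_per_thread"

# Relative change versus the GCC 7 result (780651), in percent with one decimal
awk -v new="$liops_per_thread" -v old=780651 \
  'BEGIN { printf "gain vs GCC 7: %.1f%%\n", (new - old) * 100 / old }'
```

This prints 793709 LIOPS per thread and a gain of 1.7% over the GCC 7 build.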
Of course, more optimization is possible, as described in https://github.com/aws/aws-graviton-getting-started/blob/master/c-c++.md
I'll recompile with the recommended flags:
(
cd postgres
CFLAGS="-march=armv8.2-a+fp16+rcpc+dotprod+crypto -mtune=neoverse-n1 -fsigned-char" ./configure
make clean
make
make install
)
This had no impact on my PGIO run. Of course, this may change with a read-write workload with checksums (more spinlocks).
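Since the throughput did not move, it can be worth checking that the rebuilt binary really differs. A minimal sketch, assuming that with -march=armv8.2-a GCC emits LSE instructions inline instead of calling the outline helpers: the function below (the `classify_atomics` name and its regular expressions are mine, and cover only a subset of the LSE mnemonics) counts the three kinds of atomics in `objdump -d` output read from standard input.

```shell
# Count inline LSE instructions, load/store exclusives, and calls to the
# outline-atomics helpers in objdump disassembly read from stdin.
# Assumption: objdump separates the mnemonic from its neighbours with tabs.
classify_atomics() {
  awk '
    /\t(cas|casp|swp|ldadd|ldclr|ldeor|ldset)(a|l|al)?(b|h)?\t/ { lse++ }
    /\t(ldxr|ldaxr|stxr|stlxr)\t/                               { excl++ }
    /bl\t.*<__aarch64_/                                         { outline++ }
    END { printf "inline_lse=%d exclusives=%d outline_calls=%d\n", lse, excl, outline }
  '
}

# Usage on the installed binary:
#   objdump -d /usr/local/pgsql/bin/postgres | classify_atomics
```

On a build that targets armv8.2-a, I would expect inline_lse to be non-zero and outline_calls to drop to zero. The default build should show the opposite.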
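Note that I first compiled with the default (empty) CFLAGS, so gcc was invoked with -march=armv8-a+crc (and -moutline-atomics is the default), which puts me in the same run-time detection situation, because the GCC >= 10 behavior has been backported by Amazon to GCC 7 in Amazon Linux 2. This was not clear to me initially (I got the clarification here).
By the way, Aurora on Graviton2 is still compiled with GCC 7.4.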

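Update, 15 May 2021: I have rephrased some things here that were unclear (even to myself), and I will write more about PostgreSQL on ARM and about benchmarks in general. http://blog.pachot.net should send you to the right place (or @FranckPachot on Twitter, of course).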