Introduction to Kdump

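  • Kdump is a crash dump capture mechanism used to capture the crash dump produced when a kernel crashes. It requires two kernels with different purposes: the standard (production) kernel and the crash (capture) kernel.
  • The standard (production) kernel is the kernel in normal use. When it crashes, kdump switches to the crash kernel. In short, the crash happens while the standard kernel is running, and the crash (capture) kernel is used to capture the dump that the production kernel's crash produces.
  • The crash dump is captured in the context of the freshly booted crash (capture) kernel, not in the context of the standard kernel.
  • Specifically, when the standard kernel crashes, kdump uses kexec to boot into the crash kernel automatically. When the kdump service is enabled, the standard kernel reserves part of memory, and this memory is used to boot the crash kernel.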
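    The kdump mechanism has two main components, kdump and kexec:
  • kexec is a fast-boot mechanism that boots a Linux kernel from the context of an already running kernel, without going through the BIOS. BIOS initialization can take a long time, especially on large servers with many peripherals, so this saves a lot of time for developers who reboot often. kexec is the key to kdump and has two parts. The first is the kernel-space system call kexec_load, which loads the capture kernel (also called the second kernel) to a specified address while the production kernel (also called the first kernel) is booting. The second is the user-space tool set kexec-tools, which passes the capture kernel's address to the production kernel so that it can be found and run when the system crashes. Without kexec there is no kdump: only because kexec made it possible to boot one kernel from another does kdump have something to build on.
  • kdump is a kexec-based kernel crash dump mechanism. When the system crashes, kdump uses kexec to boot a second kernel. This second kernel, usually called the capture kernel, boots with very little memory and captures the dump image. The first kernel reserves part of its memory for the second kernel to boot in. Because kdump boots the capture kernel through kexec and bypasses the BIOS, the first kernel's memory is preserved, which is the essence of kernel crash dumping. kdump needs two kernels with different purposes, the production kernel and the capture kernel. The production kernel is what the capture kernel serves: the capture kernel boots when the production kernel crashes and, together with its ramdisk, forms a minimal environment that collects and saves the production kernel's memory. Note that kdump reserves a significant amount of memory at boot, so the system's real minimum memory requirement is its normal minimum plus the amount reserved for kdump.
  • A plain kexec reboot overwrites the original kernel with the new one. kdump instead reserves a region of memory and loads the second kernel (and its data) there. After a crash the second kernel runs in place from that region (otherwise the first kernel's memory would be overwritten and the purpose defeated) and collects the first kernel's memory.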
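
As a minimal sketch of how the state described above can be checked from the production kernel: the kernel exposes /sys/kernel/kexec_crash_loaded (1 when a capture kernel is loaded) and /sys/kernel/kexec_crash_size (bytes reserved for it). The function name kdump_status and its optional directory argument are illustrative assumptions, not part of any tool.

```shell
#!/bin/sh
# Report whether a capture kernel is loaded and how much memory is reserved.
# $1 (optional): directory holding the two sysfs files, default /sys/kernel.
# The argument exists only so the function can be tried on a test directory.
kdump_status() {
    sysdir="${1:-/sys/kernel}"
    loaded_file="$sysdir/kexec_crash_loaded"
    size_file="$sysdir/kexec_crash_size"

    if [ ! -r "$loaded_file" ]; then
        echo "kexec crash support not available"
        return 0
    fi

    loaded=$(cat "$loaded_file")
    size=0
    if [ -r "$size_file" ]; then
        size=$(cat "$size_file")
    fi

    if [ "$loaded" = "1" ]; then
        echo "capture kernel loaded, reserved bytes: $size"
    else
        echo "capture kernel not loaded, reserved bytes: $size"
    fi
}

kdump_status
```

A reserved size of 0 usually means that no crashkernel= reservation was made at boot.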

Introduction to crash

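When the Linux kernel crashes, the memory contents from just before the crash can be collected, with kdump for example, into a dump file called vmcore. Analyzing the vmcore makes it possible to diagnose the cause of the crash and improve the operating system code. crash is a widely used tool for analyzing kernel crash dumps, and knowing how to use it matters a great deal when tracking down problems.
Prerequisites for using crash: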

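  1. The kernel image vmlinux must have been compiled with -g, that is, with debugging information;
  2. A memory dump file (for example vmcore) is required, or live system memory accessible through /dev/mem or /dev/crash. If no dump file is given on the crash command line, crash uses live system memory by default;
  3. Processor platforms supported by crash include x86, x86_64, ia64, ppc64, arm, s390 and s390x (some crash versions also support Alpha and 32-bit PowerPC, but long-term maintenance for these two is not guaranteed);
  4. crash supports Linux kernels from 2.2.5-15 (inclusive) onwards. As the Linux kernel evolves, crash is updated to keep up with new kernels.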

Installation and usage on Ubuntu

Configuring kdump and crash on Ubuntu:
(1) Kernel configuration for kdump and kexec:
The following options must be enabled in the kernel:

CONFIG_KEXEC=y
CONFIG_SYSFS=y
CONFIG_DEBUG_INFO=y
CONFIG_CRASH_DUMP=y
CONFIG_PROC_VMCORE=y
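
One way to verify the options listed above is to search the kernel's config file for them. The sketch below assumes the config is available as a plain-text file (Ubuntu normally installs it as /boot/config-<release>); the function name check_kdump_config is an illustrative assumption.

```shell
#!/bin/sh
# Check that the kdump-related options are set to "y" in a kernel config file.
# $1: path to the config file. Returns 0 if all are enabled, 1 otherwise.
check_kdump_config() {
    config_file="$1"
    missing=0
    for opt in CONFIG_KEXEC CONFIG_SYSFS CONFIG_DEBUG_INFO \
               CONFIG_CRASH_DUMP CONFIG_PROC_VMCORE; do
        if grep -q "^${opt}=y\$" "$config_file"; then
            echo "ok       $opt"
        else
            echo "missing  $opt"
            missing=1
        fi
    done
    return "$missing"
}

# Run against the running kernel's config only if that file exists.
cfg="/boot/config-$(uname -r)"
if [ -r "$cfg" ]; then
    check_kdump_config "$cfg" || echo "some options are not enabled"
fi
```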

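(2) Install crash, kexec-tools and the related tools on Ubuntu:
First install them with: apt-get install linux-crashdump
After installation, raise the crashkernel reservation to 768M (the default, 192M, is too small).
Edit /etc/default/grub.d/kdump-tools.cfg and set crashkernel=512M-:768M (reserve 768M on machines with at least 512M of RAM).
Then regenerate grub.cfg: grub-mkconfig -o /boot/grub/grub.cfg
Reboot for the change to take effect. After the system restarts, kdump-config show displays the crash dump configuration and whether it is active.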
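
After the reboot, the reservation can also be confirmed from the kernel command line. The sketch below extracts the crashkernel= value from a command-line string; the function name crashkernel_param is an illustrative assumption. In the kernel's range syntax, 512M-:768M means "for 512M of RAM or more, reserve 768M".

```shell
#!/bin/sh
# Print the value of crashkernel= found in a kernel command line string.
# Prints nothing when the parameter is absent.
crashkernel_param() {
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^crashkernel=//p'
}

# Inspect the running kernel's command line when it is readable.
if [ -r /proc/cmdline ]; then
    value=$(crashkernel_param "$(cat /proc/cmdline)")
    echo "crashkernel=${value:-<not set>}"
fi
```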
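(3) Test kdump manually through the SysRq mechanism:
First trigger a panic with a SysRq command to check that the capture kernel boots correctly. This requires SysRq support and the panic-on-oops option to be enabled in the kernel configuration.
Then run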

echo c > /proc/sysrq-trigger

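This is equivalent to pressing Alt+SysRq+C and triggers a kernel panic. Older kernels force it by dereferencing a NULL pointer; on the kernel used here the panic message is "sysrq triggered crash".
Wait for the dump to finish. The system then reboots, and afterwards a dump file named dump.[time] is found under /var/crash/[time]/.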

Then run

crash /lib/modules/5.3.18+/build/vmlinuz-5.3.18+ dump.202201120843

to analyze the crash.
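
Because dumps are stored as /var/crash/[time]/dump.[time], the most recent one can be located by sorting the paths. This is a small sketch under that naming assumption; the function name latest_dump is illustrative.

```shell
#!/bin/sh
# Print the path of the newest dump.* file under a crash directory
# (default /var/crash), or nothing if there is none.
latest_dump() {
    crash_dir="${1:-/var/crash}"
    # Directory and file names are timestamps, so lexical order is time order.
    find "$crash_dir" -type f -name 'dump.*' 2>/dev/null | sort | tail -n 1
}

dump=$(latest_dump) || true
echo "${dump:-no dump found under /var/crash}"
```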
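(4) Using crash:
To debug a dump file, crash needs two arguments on the command line: the debug kernel and the dump file. The dump file is the name of the kernel dump file, and the debug kernel is the one installed by the kernel debug-info package.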

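crash command line: crash <system-map-file> <vmlinux-file> coredump
Common options:
-h: print help information
-d: set the debug level
-S: use /boot/System.map as the default map file
-s: do not print the version and initial debug information; go straight to the prompt
-i file: run the commands in file after startup, then accept user input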

crash commands

Basic usage of crash

root@root-PC:/var/crash/202201120843# crash /lib/modules/5.3.18+/build/vmlinuz-5.3.18+ dump.202201120843

      KERNEL: /data/linux-source-5.3.0/vmlinux  // debug kernel
    DUMPFILE: dump.202204091433  [PARTIAL DUMP] // dump file
        CPUS: 8  // number of CPUs
        DATE: Sat Apr  9 14:33:07 2022  // date of the dump
      UPTIME: 00:04:03  // how long the kernel had been running
LOAD AVERAGE: 2.31, 0.82, 0.29  // system load at the time of the crash
       TASKS: 343 // number of tasks in the system at the time of the crash
    NODENAME: root-PC // host name
     RELEASE: 5.3.18+ // kernel release
     VERSION: #2 SMP Sat Apr 9 14:26:36 CST 2022
     MACHINE: x86_64  (2494 Mhz)  // CPU architecture and clock speed
      MEMORY: 16 GB  // memory size of the crashed system
       PANIC: "Kernel panic - not syncing: sysrq triggered crash"  // panic message
         PID: 1341 // PID of the process that triggered the crash
     COMMAND: "bash"  // name (command) of that process
        TASK: ffff8ede003bdc00  [THREAD_INFO: ffff8ede003bdc00]  // address of that process's task_struct
         CPU: 6  // CPU the process was running on
       STATE: TASK_RUNNING (PANIC)  // state of the process at the time of the crash
       

Basic crash commands

help: list the debugging commands that crash provides
crash> help

*              extend         log            rd             task
alias          files          mach           repeat         timer
ascii          foreach        mod            runq           tree
bpf            fuser          mount          search         union
bt             gdb            net            set            vm
btop           help           p              sig            vtop
dev            ipcs           ps             struct         waitq
dis            irq            pte            swap           whatis
eval           kmem           ptob           sym            wr
exit           list           ptov           sys            q

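Commonly used commands:

alias: define command aliases
ascii: print the ASCII table, or translate a hexadecimal value into characters
bpf: show the eBPF programs and maps loaded in the kernel
eval: base conversion; evaluates an expression and prints the result in several number formats
btop: translate an address to its page number
mod: same function as lsmod
log: show the kernel log, similar to dmesg output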
crash> log
......
......
[   45.762046] wlan0: associated
[   45.784751] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
[   45.837528] wlan0: Limiting TX power to 30 (30 - 0) dBm as advertised by d4:68:ba:08:f8:7d
[   74.928113] sysrq: Trigger a crash
[   74.931668] Kernel panic - not syncing: sysrq triggered crash
[   74.937558] CPU: 6 PID: 1341 Comm: bash Kdump: loaded Tainted: P          IOE     5.3.18+ #2
[   74.946263] Hardware name: To be filled by O.E.M. To be filled by O.E.M./QM87, BIOS 4.6.5 11/10/2015
[   74.955630] Call Trace:
[   74.958167]  dump_stack+0x6d/0x95
[   74.961573]  panic+0xfe/0x2d4
[   74.964615]  sysrq_handle_crash+0x15/0x20
[   74.968753]  __handle_sysrq+0x93/0x150
[   74.972653]  write_sysrq_trigger+0x2f/0x40
[   74.976816]  proc_reg_write+0x3e/0x60
[   74.980504]  __vfs_write+0x1b/0x40
[   74.983943]  vfs_write+0xb1/0x1a0
[   74.987298]  ksys_write+0xa7/0xe0
[   74.990646]  __x64_sys_write+0x1a/0x20
[   74.994451]  do_syscall_64+0x5a/0x130
[   74.998151]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   75.003273] RIP: 0033:0x7f86862cb264
[   75.006886] Code: 89 02 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 8d 05 a1 06 2e 00 8b 00 85 c0 75 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 f3 c3 66 90 41 54 55 49 89 d4 53 48 89 f5
[   75.026016] RSP: 002b:00007ffebd5172d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[   75.033788] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f86862cb264
[   75.041084] RDX: 0000000000000002 RSI: 0000561b90df2590 RDI: 0000000000000001
[   75.048390] RBP: 0000561b90df2590 R08: 000000000000000a R09: 0000000000000001
[   75.055748] R10: 000000000000000a R11: 0000000000000246 R12: 00007f86865a7760
[   75.063097] R13: 0000000000000002 R14: 00007f86865a32a0 R15: 00007f86865a2760
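
When the log has been saved to a text file, the call chain can be pulled out of it with a short filter. The sketch below assumes the `[timestamp]  function+0xoff/0xsize` line layout shown in the log above; the function name call_trace_funcs is an illustrative assumption.

```shell
#!/bin/sh
# Print the function names of the Call Trace lines in a kernel log read
# from stdin, one per line, innermost frame first.
call_trace_funcs() {
    sed -n 's/^\[ *[0-9.]*\] \{1,\}\([A-Za-z_][A-Za-z0-9_.]*\)+0x[0-9a-f]*\/0x[0-9a-f]*$/\1/p'
}

# Example on a few lines taken from the log above; prints dump_stack, panic.
call_trace_funcs <<'EOF'
[   74.955630] Call Trace:
[   74.958167]  dump_stack+0x6d/0x95
[   74.961573]  panic+0xfe/0x2d4
[   75.003273] RIP: 0033:0x7f86862cb264
EOF
```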

bt: show the stack backtrace at the time of the failure

With -l, each frame also shows its source file and line number.

crash> bt -l
PID: 1341   TASK: ffff8ede003bdc00  CPU: 6   COMMAND: "bash"
 #0 [ffffb47b81953c58] machine_kexec at ffffffff8766b833
    /data/linux-source-5.3.0/arch/x86/kernel/machine_kexec_64.c: 441
 #1 [ffffb47b81953cb8] __crash_kexec at ffffffff8774bb82
    /data/linux-source-5.3.0/kernel/kexec_core.c: 957
 #2 [ffffb47b81953d88] panic at ffffffff8769778a
    /data/linux-source-5.3.0/./arch/x86/include/asm/smp.h: 72
 #3 [ffffb47b81953e10] sysrq_handle_crash at ffffffff87db3a85
    /data/linux-source-5.3.0/drivers/tty/sysrq.c: 140
 #4 [ffffb47b81953e20] __handle_sysrq at ffffffff87db3ec3
    /data/linux-source-5.3.0/drivers/tty/sysrq.c: 580
 #5 [ffffb47b81953e58] write_sysrq_trigger at ffffffff87db439f
    /data/linux-source-5.3.0/drivers/tty/sysrq.c: 1108
 #6 [ffffb47b81953e70] proc_reg_write at ffffffff879530fe
    /data/linux-source-5.3.0/fs/proc/inode.c: 238
 #7 [ffffb47b81953e90] __vfs_write at ffffffff878c86cb
    /data/linux-source-5.3.0/fs/read_write.c: 500
 #8 [ffffb47b81953ea0] vfs_write at ffffffff878c93b1
    /data/linux-source-5.3.0/fs/read_write.c: 584
 #9 [ffffb47b81953ed8] ksys_write at ffffffff878cb977
    /data/linux-source-5.3.0/fs/read_write.c: 638
#10 [ffffb47b81953f20] __x64_sys_write at ffffffff878cb9ca
    /data/linux-source-5.3.0/fs/read_write.c: 646
#11 [ffffb47b81953f30] do_syscall_64 at ffffffff876043ba
    /data/linux-source-5.3.0/arch/x86/entry/common.c: 296
#12 [ffffb47b81953f50] entry_SYSCALL_64_after_hwframe at ffffffff8840008c
    /data/linux-source-5.3.0/arch/x86/entry/entry_64.S: 184
    RIP: 00007f86862cb264  RSP: 00007ffebd5172d8  RFLAGS: 00000246
    RAX: ffffffffffffffda  RBX: 0000000000000002  RCX: 00007f86862cb264
    RDX: 0000000000000002  RSI: 0000561b90df2590  RDI: 0000000000000001
    RBP: 0000561b90df2590   R8: 000000000000000a   R9: 0000000000000001
    R10: 000000000000000a  R11: 0000000000000246  R12: 00007f86865a7760
    R13: 0000000000000002  R14: 00007f86865a32a0  R15: 00007f86865a2760
    ORIG_RAX: 0000000000000001  CS: 0033  SS: 002b
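dev: show device information

Everything is a file: dev lists the character and block devices with their file operations (fops), and its options can also show PCI device data and disk I/O statistics.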

crash> dev
CHRDEV    NAME                 CDEV        OPERATIONS
   1      mem            ffff9e944bd8a180  memory_fops
   2      pty            ffff9e944aca8380  tty_fops
   3      ttyp           ffff9e944aca8780  tty_fops
   4      /dev/vc/0      ffffffff97a01640  console_fops
   4      tty            ffff9e944bd8a780  tty_fops
   4      ttyS           ffff9e944aca8c80  tty_fops
   5      /dev/tty       ffffffff97a00540  tty_fops
   5      /dev/console   ffffffff97a004c0  console_fops
   5      /dev/ptmx      ffffffff97a006e0  ptmx_fops
   5      ttyprintk      ffff9e9448a9cb00  tty_fops
   7      vcs            ffff9e944bd8ad80  vcs_fops
  10      misc           ffff9e944b5ea680  misc_fops
  13      input               (none)
  21      sg             ffff9e944b4ebb00  sg_fops
  29      fb             ffff9e944b5ea600  fb_fops
  81      video4linux         (none)
  89      i2c            ffff9e944ad382e8  i2cdev_fops
 100      ts_config      ffff9e9448a9cb80  ts_config_fops
 116      alsa           ffff9e94485d3b00  snd_fops
 128      ptm            ffff9e944aca8100  tty_fops
 136      pts            ffff9e944aca8400  tty_fops
 166      ttyACM              (none)
 180      usb            ffff9e944b5aee80  usb_fops
 188      ttyUSB              (none)
 189      usb_device     ffffffff97a13060  usbdev_file_operations
 195      nvidia-frontend  ffff9e94485d3180  nv_frontend_fops
 226      drm            ffff9e944b27e080  drm_stub_fops
 238      nvidia-uvm     ffffffffc1bd3d40  uvm_fops
 239      nvidia-nvswitch  ffffffffc1523ec8  device_fops
 240      nvidia-nvlink  ffffffffc1523e28  nvlink_fops
 241      hidraw         ffffffff97a16f60  hidraw_ops
 242      ttyGS               (none)
 243      aux            ffff9e944b27ec00  auxdev_fops
 244      nvme                (none)
 245      ttynull        ffff9e9448a9c080  tty_fops
 246      bsg            ffffffff979f6be0  bsg_fops
 247      watchdog            (none)
 248      cec                 (none)
 249      rtc            ffff9e94474e9308  rtc_dev_fops
 250      dax                 (none)
 251      dimmctl        ffff9e944b5ae800  nvdimm_fops
 252      ndctl          ffff9e944b5aec00  nvdimm_bus_fops
 253      tpm                 (none)
 254      gpiochip            (none)

BLKDEV    NAME                GENDISK      OPERATIONS
 259      blkext              (none)
   7      loop           ffff9e9448488800  lo_fops
   8      sd             ffff9e944b73c800  sd_fops
  11      sr                  (none)
  65      sd                  (none)
  66      sd                  (none)
  67      sd                  (none)
dis: disassemble code

Given an address or symbol, dis shows the source code, the line number and the disassembled function.

crash> dis -s vfs_write+87
FILE: fs/read_write.c
LINE: 575

  570
  571           if (!(file->f_mode & FMODE_WRITE))
  572                   return -EBADF;
  573           if (!(file->f_mode & FMODE_CAN_WRITE))
  574                   return -EINVAL;
* 575           if (unlikely(!access_ok(buf, count)))
  576                   return -EFAULT;
  577
  578           ret = rw_verify_area(WRITE, file, pos, count);
  579           if (!ret) {
  580                   if (count > MAX_RW_COUNT)
  581                           count =  MAX_RW_COUNT;
  582                   file_start_write(file);
  583                   ret = __vfs_write(file, buf, count, pos);
  584                   if (ret > 0) {
  585                           fsnotify_modify(file);
  586                           add_wchar(current, ret);
  587                   }
  588                   inc_syscw(current);
  589                   file_end_write(file);
  590           }
  591
  592           return ret;
  593   }
  
  
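files: list the files opened by a process

For each file descriptor, the output gives the addresses of the file, dentry and inode structures along with the file type and path.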

crash> files 3406
PID: 3406   TASK: ffff9e9404475c00  CPU: 1   COMMAND: "bash"
ROOT: /    CWD: /home/system/root
 FD       FILE            DENTRY           INODE       TYPE PATH
  0 ffff9e943ff3fd00 ffff9e93ed9f3600 ffff9e943543f840 CHR  /dev/pts/0
  1 ffff9e9441486900 ffff9e943ab88900 ffff9e943ab7a268 REG  /proc/sysrq-trigger
  2 ffff9e943ff3fd00 ffff9e93ed9f3600 ffff9e943543f840 CHR  /dev/pts/0
 10 ffff9e943ff3fd00 ffff9e93ed9f3600 ffff9e943543f840 CHR  /dev/pts/0
255 ffff9e943ff3fd00 ffff9e93ed9f3600 ffff9e943543f840 CHR  /dev/pts/0
fuser: show the processes that have a given file open
crash> fuser /dev/nvidia-uvm
PID TASK COMM USAGE
3221 ffff88040af12340 "UltrasoundBack fd
3228 ffff88040b7c9780 "UltrasoundMain fd
3553 ffff88040a9c4680 "tmpdata" fd
3554 ffff8800d944bac0 "tmpdata" fd
3555 ffff8800d944e9c0 "tmpdata" fd
3556 ffff88040a9c0bc0 "tmpdata" fd
3576 ffff88040b7a4680 "tmpdata" fd
3577 ffff88040aa00000 "KVmSignal" fd
3578 ffff88040b7a3ac0 "KVmSignal" fd
3579 ffff88040b7a5240 "LogWriting" fd
3580 ffff88040aa02340 "LogWriting" fd
3581 ffff88040aa01780 "QXcbEventReade fd
3582 ffff88040b7a2f00 "QXcbEventReade fd
3585 ffff8800d9449780 "tmpdata" fd
3586 ffff880407e12340 "tmpdata" fd
3587 ffff8800d944af00 "tmpdata" fd
3588 ffff880407e10000 "tmpdata" fd
3589 ffff88040b0469c0 "tmpdata" fd
3590 ffff88040acd2340 "tmpdata" fd
3607 ffff88040aa069c0 "SonoIotDispatc fd
3608 ffff88040ab20000 "SonoIotClient" fd
3609 ffff88040af10000 "UltrasoundIO" fd
3610 ffff88040af12f00 "UltrasoundFile fd
3746 ffff880407e10bc0 "usb2int" fd
3748 ffff88040aa8c680 "KPeripheralMon fd
3874 ffff88040aa8de00 "screenbrightne fd
3877 ffff88040ab02340 "TouchscreenRec fd
3881 ffff88040ab00bc0 "PrintThread" fd
3889 ffff8803b626d240 "MonitorThread" fd
4088 ffff88040aa88000 "UltrasoundMain fd
4089 ffff88040aa8bac0 "UltrasoundMain fd
4090 ffff88040aa8e9c0 "KAutotestServe fd
4092 ffff88040aa8af00 "IoEventThread" fd
4093 ffff88040aa8a340 "httpthread" fd
4114 ffff8803b613af00 "KMwaThread" fd
4115 ffff8803b6278bc0 "IPostProcess2D fd
4116 ffff8803b6279780 "IPreProcess2D" fd
4117 ffff8803b627e9c0 "KPostProcess1D fd
4118 ffff8803b6278000 "KPreProcess1D" fd
4119 ffff8803b627a340 "KPostProcessWv fd
4120 ffff8803b627af00 "KImgParseThrea fd
4121 ffff8803b627bac0 "KImgSave" fd
4122 ffff8803b627c680 "KImgDisplayThr fd
4123 ffff8803b627d240 "KImgPick" fd
4124 ffff880392130000 "KPreProcess4D" fd
4125 ffff880392130bc0 "KPostProcess4D fd
4286 ffff880392131780 "FrontendHeader fd
4328 ffff8803b629d240 "KDicomServer" fd
4329 ffff8803b6299780 "KDcmCmdSender" fd
4330 ffff8803921c5e00 "UltrasoundMain fd
......

ipcs: show System V IPC usage (shared memory, semaphores and message queues)
crash> ipcs
SHMID_KERNEL     KEY      SHMID      UID   PERMS BYTES      NATTCH STATUS
ffff9e9449917400 00000000 2          0     600   56800      2      dest
ffff9e9449917c00 00000000 3          0     600   4019200    2      dest
ffff9e940051aa00 00000000 4          0     600   90000      2      dest
ffff9e940051ae00 00000000 5          0     600   56800      2      dest
ffff9e940051a300 00000000 6          0     600   4019200    2      dest
ffff9e940051bf00 00000000 7          0     600   8294400    2      dest
ffff9e940051b900 00000000 8          0     600   773568     2      dest

SEM_ARRAY        KEY      SEMID      UID   PERMS NSEMS
ffff9e94367be000 00010001 0          0     666   1
ffff9e94367be400 00060002 1          0     666   1

MSG_QUEUE        KEY      MSQID      UID   PERMS USED-BYTES   MESSAGES
ffff9e9448556d00 0000232a 0          0     666   0            0
ffff9e9448556100 00002330 1          0     666   0            0
ffff9e9447d86700 00002329 2          0     666   0            0
ffff9e9441484900 00002331 3          0     666   0            0

irq: show the mapping between IRQ numbers and interrupt resources
crash> irq
 IRQ   IRQ_DESC/_DATA      IRQACTION      NAME
  0   ffff9e944bd4d800  ffffffff9722d840  "timer"
  1   ffff9e944bd4e800  ffff9e944ad06d80  "i8042"
  2   ffff9e944bd4d400      (unused)
  3   ffff9e944bd4cc00      (unused)
  4   ffff9e944bd4c000  ffff9e944b237900  "ttyS0"
  5   ffff9e944bd4c400      (unused)
  6   ffff9e944bd4ec00      (unused)
  7   ffff9e944bd4e200  ffff9e9435037800  "ttyS2"
  8   ffff9e944bd4de00  ffff9e944b237480  "rtc0"
  9   ffff9e944bd4f400  ffff9e944b5eab80  "acpi"
 10   ffff9e944bd4f200      (unused)
 11   ffff9e944bd4e000      (unused)
 12   ffff9e944bd4c600  ffff9e944ad06580  "i8042"
 13   ffff9e944bd4c200      (unused)
 14   ffff9e944bd4ce00      (unused)
 15   ffff9e944bd4dc00      (unused)
 16   ffff9e944b14dc00  ffff9e944b27e580  "ehci_hcd:usb1"
 17   ffff9e9448857000  ffff9e944b3ff800  "snd_hda_intel:card1"
 18   ffff9e944b0e6c00  ffff9e944ad06100  "i801_smbus"
 19   ffff9e944842b800      (unused)
 20       (unused)          (unused)
 21       (unused)          (unused)
 22   ffff9e9448855e00      (unused)
 23   ffff9e94485d8400  ffff9e944b27e480  "ehci_hcd:usb2"
 24   ffff9e944b14d000  ffff9e944b5ead00  "PCIe PME"
 25   ffff9e944b14ee00  ffff9e944b5eaf80  "PCIe PME"
 26   ffff9e9448857600  ffff9e944b3fff00  "PCIe PME"
 27   ffff9e944842bc00  ffff9e9448b4e080  "ahci[0000:00:1f.2]"
 28   ffff9e94485dac00  ffff9e944b237180  "xhci_hcd"
 29   ffff9e9449d55400  ffff9e944b237200  "eth0"
 30   ffff9e9449d55800  ffff9e944b237f80  "eth1"
 31   ffff9e9448855400  ffff9e944b3ff280  "snd_hda_intel:card0"
 32   ffff9e9447ce3600  ffff9e943eeb9f00  "nvidia"

mod: show the loaded modules
crash> mod
     MODULE       NAME                       SIZE  OBJECT FILE
ffffffffc01a5380  x_tables                  40960  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc01b0280  ip_tables                 32768  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc01e4140  sch_fq_codel              20480  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc0200000  soundcore                 16384  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc0210340  snd                       81920  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc021e0c0  snd_timer                 36864  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc0224140  snd_seq_device            16384  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc0229140  ie31200_edac              16384  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc022e040  snd_seq_midi_event        16384  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc0233000  ledtrig_audio             16384  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc023a1c0  lpc_ich                   24576  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc02412c0  rtc_cmos                  24576  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc0247180  snd_seq_midi              20480  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc025a340  snd_seq                   69632  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc027b640  psmouse                  131072  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc0288240  snd_rawmidi               36864  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc02be000  e1000e                   253952  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc02dd580  snd_pcm                  102400  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc02f3040  snd_hda_core              90112  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc02feb80  intel_cstate              20480  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc0305040  snd_intel_dspcfg          20480  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc0310e40  snd_hda_intel             53248  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc03214c0  snd_hda_codec_hdmi        57344  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc032a080  snd_hwdep                 20480  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc15222c0  nvidia                 20459520  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc16b3040  irqbypass                 16384  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc16b9100  coretemp                  20480  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc16bf140  intel_powerclamp          20480  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc16d83c0  snd_hda_codec            131072  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc17d6880  nvidia_modeset          1114112  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc1803400  snd_hda_codec_generic     81920  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc181f500  snd_hda_codec_realtek    122880  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc1839000  glue_helper               16384  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc183f100  cryptd                    24576  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc1845040  crypto_simd               16384  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc184a440  ghash_clmulni_intel       16384  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc184f880  rapl                      20480  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc185c600  nvidia_drm                49152  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc18631c0  aes_x86_64                20480  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc18b4dc0  aesni_intel              372736  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc18c4240  crc32_pclmul              16384  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc18c9240  crct10dif_pclmul          16384  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc19450c0  kvm                      655360  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc1995f00  kvm_intel                204800  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc19ea180  x86_pkg_temp_thermal      20480  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc19f6540  intel_rapl_common         24576  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc1aad800  mac80211                 663552  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc1ada780  rt2x00lib                 61440  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc1afa600  rt2800lib                122880  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc1b02180  rt2x00usb                 20480  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc1b0a280  rt2800usb                 28672  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc1b27140  usb_sono                  24576  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc1bd3980  nvidia_uvm               942080  (not loaded)  [CONFIG_KALLSYMS]

mount: show the mounted filesystems
crash> mount
     MOUNT           SUPERBLK     TYPE   DEVNAME   DIRNAME
ffff9e944bdc6140 ffff9e944bc12800 rootfs rootfs    /
ffff9e944bdc6780 ffff9e9448667800 sysfs  sysfs     /sys
ffff9e944bdc7040 ffff9e944bc15800 proc   proc      /proc
ffff9e944bdc6000 ffff9e944b56a800 devtmpfs udev    /dev
ffff9e944bdc63c0 ffff9e9448660000 devpts devpts    /dev/pts
ffff9e944bdc6640 ffff9e9448664800 tmpfs  tmpfs     /run
ffff9e94483412c0 ffff9e944b7ad800 ext4   /dev/sda1 /
ffff9e944b00d2c0 ffff9e944b4c7000 securityfs securityfs /sys/kernel/security
ffff9e944b00da40 ffff9e9448fcc000 tmpfs  tmpfs     /dev/shm
ffff9e944b00de00 ffff9e9448fce000 tmpfs  tmpfs     /run/lock
ffff9e944b00cc80 ffff9e9448fcb800 tmpfs  tmpfs     /sys/fs/cgroup
ffff9e944b00dcc0 ffff9e9448fcb000 cgroup2 cgroup   /sys/fs/cgroup/unified
ffff9e944b00d680 ffff9e9448fcf800 cgroup cgroup    /sys/fs/cgroup/systemd
ffff9e944b00d7c0 ffff9e9448fc8000 pstore pstore    /sys/fs/pstore
ffff9e944b00c780 ffff9e9448fcc800 cgroup cgroup    /sys/fs/cgroup/freezer
ffff9e944b00d040 ffff9e944adb9800 cgroup cgroup    /sys/fs/cgroup/net_cls,net_prio
ffff9e944b00c000 ffff9e944adb9000 cgroup cgroup    /sys/fs/cgroup/rdma
ffff9e944b00c3c0 ffff9e944adbc000 cgroup cgroup    /sys/fs/cgroup/perf_event
ffff9e944b00c640 ffff9e944adbd800 cgroup cgroup    /sys/fs/cgroup/devices
ffff9e944b00c8c0 ffff9e944adbe000 cgroup cgroup    /sys/fs/cgroup/memory
ffff9e944b00d900 ffff9e944adbb800 cgroup cgroup    /sys/fs/cgroup/blkio
ffff9e944b00d540 ffff9e944adbb000 cgroup cgroup    /sys/fs/cgroup/pids
ffff9e944b00ca00 ffff9e944adbf800 cgroup cgroup    /sys/fs/cgroup/hugetlb
ffff9e944b00cf00 ffff9e944adb8000 cgroup cgroup    /sys/fs/cgroup/cpuset
ffff9e944b00c280 ffff9e944adbc800 cgroup cgroup    /sys/fs/cgroup/cpu,cpuacct
ffff9e944ac88500 ffff9e944848c800 hugetlbfs hugetlbfs /dev/hugepages
ffff9e94498a97c0 ffff9e944b4c2800 debugfs debugfs  /sys/kernel/debug
ffff9e944b0e2f00 ffff9e944b4c3000 rpc_pipefs sunrpc /run/rpc_pipefs
ffff9e94498a8780 ffff9e9448fc8800 mqueue mqueue    /dev/mqueue
ffff9e944ac88dc0 ffff9e944848d000 fusectl fusectl  /sys/fs/fuse/connections
ffff9e944ac89400 ffff9e944ac30000 configfs configfs /sys/kernel/config
ffff9e9448847180 ffff9e944ac36800 ramfs  none      /tmp
ffff9e944b0e2280 ffff9e944b454000 ext4   /dev/sda2 /var
ffff9e944bdc68c0 ffff9e9447cba800 ext4   /dev/sda3 /home
ffff9e944bdc7540 ffff9e9447cb9800 ext4   /dev/sda4 /data

net: show the network configuration
crash> net
   NET_DEVICE     NAME   IP ADDRESS(ES)
ffff9e944b3e2000  lo     127.0.0.1
ffff9e943f75c000  eth0   192.168.33.3
ffff9e943f524000  eth1   192.0.2.2
ffff9e94499f6000  wlan0  192.168.93.102

ps: show the state of the processes in the system
crash> ps
   PID    PPID  CPU       TASK        ST  %MEM     VSZ    RSS  COMM
>     0      0   0  ffffffff97213780  RU   0.0       0      0  [swapper/0]
      0      0   1  ffff9e944bea9700  RU   0.0       0      0  [swapper/1]
>     0      0   2  ffff9e944beaae00  RU   0.0       0      0  [swapper/2]
>     0      0   3  ffff9e944bff1700  RU   0.0       0      0  [swapper/3]
>     0      0   4  ffff9e944bff2e00  RU   0.0       0      0  [swapper/4]
>     0      0   5  ffff9e944bff0000  RU   0.0       0      0  [swapper/5]
>     0      0   6  ffff9e944bff4500  RU   0.0       0      0  [swapper/6]
>     0      0   7  ffff9e944bff5c00  RU   0.0       0      0  [swapper/7]
      1      0   7  ffff9e944be80000  IN   0.1   78732  10020  systemd
      2      0   5  ffff9e944be84500  IN   0.0       0      0  [kthreadd]
      3      2   0  ffff9e944be85c00  ID   0.0       0      0  [rcu_gp]
      4      2   0  ffff9e944be81700  ID   0.0       0      0  [rcu_par_gp]
      5      2   0  ffff9e944be82e00  ID   0.0       0      0  [kworker/0:0]
      6      2   0  ffff9e944bea4500  ID   0.0       0      0  [kworker/0:0H]
      7      2   0  ffff9e944bea5c00  ID   0.0       0      0  [kworker/0:1]
      8      2   7  ffff9e944bea1700  ID   0.0       0      0  [kworker/u16:0]
      9      2   0  ffff9e944bea2e00  ID   0.0       0      0  [mm_percpu_wq]

runq: show the run queue of each CPU
crash> runq
CPU 0 RUNQUEUE: ffff9e944da2a6c0
  CURRENT: PID: 0      TASK: ffffffff97213780  COMMAND: "swapper/0"
  RT PRIO_ARRAY: ffff9e944da2a980
     [no tasks queued]
  CFS RB_ROOT: ffff9e944da2a7f0
     [no tasks queued]

CPU 1 RUNQUEUE: ffff9e944da6a6c0
  CURRENT: PID: 3406   TASK: ffff9e9404475c00  COMMAND: "bash"
  RT PRIO_ARRAY: ffff9e944da6a980
     [no tasks queued]
  CFS RB_ROOT: ffff9e944da6a7f0
     [no tasks queued]

CPU 2 RUNQUEUE: ffff9e944daaa6c0
  CURRENT: PID: 0      TASK: ffff9e944beaae00  COMMAND: "swapper/2"
  RT PRIO_ARRAY: ffff9e944daaa980
     [no tasks queued]
  CFS RB_ROOT: ffff9e944daaa7f0
     [no tasks queued]

sym: look up symbol information
crash> sym usb
symbol not found: usb
possible alternatives:
  ffffffff95aa7cd0 (T) kill_pid_usb_asyncio
  ffffffff960d11c0 (t) quirk_gpu_usb_typec_ucsi
  ffffffff960d11e0 (t) quirk_gpu_usb
  ffffffff962f7430 (T) usb_ep_type_string
  ffffffff962f7460 (T) usb_otg_state_string
  ffffffff962f7490 (T) usb_speed_string
  ffffffff962f74c0 (T) usb_state_string
  ffffffff962f74f0 (T) usb_get_maximum_speed
  ffffffff962f7560 (T) usb_get_dr_mode
  ffffffff962f75d0 (T) usb_disabled
  ffffffff962f76a0 (T) usb_find_common_endpoints
  ffffffff962f7760 (T) usb_find_common_endpoints_reverse
  ffffffff962f7840 (T) usb_ifnum_to_if
  ffffffff962f78a0 (T) usb_altnum_to_altsetting
  ffffffff962f7920 (t) usb_dev_prepare
  ffffffff962f7930 (T) __usb_get_extra_descriptor
  ffffffff962f79b0 (T) usb_find_interface
  ffffffff962f7a30 (T) usb_put_dev
  ffffffff962f7a50 (T) usb_put_intf
  ffffffff962f7a70 (T) usb_for_each_dev
  ffffffff962f7ad0 (t) usb_dev_restore
  ffffffff962f7af0 (t) usb_dev_thaw
  ffffffff962f7b10 (t) usb_dev_resume
  ffffffff962f7b30 (t) usb_dev_poweroff

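sys: show system information; sys -c lists the system call table with the source file and line of each implementation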
crash> sys -c
NUM  SYSTEM CALL                FILE AND LINE NUMBER
  0  __x64_sys_read             ../fs/read_write.c: 621
  1  __x64_sys_write            ../fs/read_write.c: 646
  2  __x64_sys_open             ../fs/open.c: 1104
  3  __x64_sys_close            ../fs/open.c: 1185
  4  __x64_sys_newstat          ../fs/stat.c: 337
  5  __x64_sys_newfstat         ../fs/stat.c: 375
  6  __x64_sys_newlstat         ../fs/stat.c: 348
  7  __x64_sys_poll             ../fs/select.c: 1047
  8  __x64_sys_lseek            ../fs/read_write.c: 322
  9  __x64_sys_mmap             ../arch/x86/kernel/sys_x86_64.c: 91
 10  __x64_sys_mprotect         ../mm/mprotect.c: 613
 11  __x64_sys_munmap           ../mm/mmap.c: 2869
 12  __x64_sys_brk              ../mm/mmap.c: 187
 13  __x64_sys_rt_sigaction     ../kernel/signal.c: 4234
 14  __x64_sys_rt_sigprocmask   ../kernel/signal.c: 3019
 15  __ia32_sys_rt_sigreturn    ../arch/x86/kernel/signal.c: 641
 16  __x64_sys_ioctl            ../fs/ioctl.c: 718
 17  __x64_sys_pread64          ../fs/read_write.c: 672
 18  __x64_sys_pwrite64         ../fs/read_write.c: 698
 19  __x64_sys_readv            ../fs/read_write.c: 1148
 20  __x64_sys_writev           ../fs/read_write.c: 1154
 21  __x64_sys_access           ../fs/open.c: 452
 22  __x64_sys_pipe             ../fs/pipe.c: 876
 23  __x64_sys_select           ../fs/select.c: 722
 24  __ia32_sys_sched_yield     ../kernel/sched/core.c: 5459
 25  __x64_sys_mremap           ../mm/mremap.c: 595
 26  __x64_sys_msync            ../mm/msync.c: 32
 27  __x64_sys_mincore          ../mm/mincore.c: 253
 28  __x64_sys_madvise          ../mm/madvise.c: 813
 29  __x64_sys_shmget           ../ipc/shm.c: 745
 30  __x64_sys_shmat            ../ipc/shm.c: 1591
 31  __x64_sys_shmctl           ../ipc/shm.c: 1194
 32  __x64_sys_dup              ../fs/file.c: 979
 33  __x64_sys_dup2             ../fs/file.c: 949

task: show the fields of a task's task_struct
crash> task
PID: 3406   TASK: ffff9e9404475c00  CPU: 1   COMMAND: "bash"
struct task_struct {
  thread_info = {
    flags = 2147483648,
    status = 0
  },
  state = 0,
  stack = 0xffffb4a581d08000,
  usage = {
    refs = {
      counter = 2
    }
  },
  flags = 4194560,
  ptrace = 0,
  wake_entry = {
    next = 0x0
  },
  on_cpu = 1,
  cpu = 1,
  wakee_flips = 2,
  wakee_flip_decay_ts = 4294925859,
  last_wakee = 0xffff9e944b1a8000,
  recent_used_cpu = 1,
  wake_cpu = 1,
  on_rq = 1,
  prio = 120,
  static_prio = 120,
  normal_prio = 120,
  rt_priority = 0,
  sched_class = 0xffffffff96c3b000,
  se = {
    load = {
      weight = 1048576,
      inv_weight = 4194304
    },
    runnable_weight = 1048576,
    run_node = {
      __rb_parent_color = 1,
      rb_right = 0x0,
      rb_left = 0x0
    },
    group_node = {
      next = 0xffff9e944da6b190,
      prev = 0xffff9e944da6b190
    },

timer: show the state of the timers
crash> timer
JIFFIES
4294925895

TIMER_BASES[0][BASE_STD]: ffff9e944da1b600
  EXPIRES        TTE        TIMER_LIST     FUNCTION
  4294925938      43  ffffffff97399ac0  ffffffff95ab5ac0  <delayed_work_timer_fn>
  4294925955      60  ffff9e943f52d448  ffffffff95ab5ac0  <delayed_work_timer_fn>
  4294926012     117  ffff9e9449012448  ffffffff95ab5ac0  <delayed_work_timer_fn>
  4294926299     404  ffff9e943f52f448  ffffffff95ab5ac0  <delayed_work_timer_fn>
  4294967500   41605  ffff9e944da16c20  ffffffff95a4b710  <mce_timer_fn>
TIMER_BASES[0][BASE_DEF]: ffff9e944da1c880
  EXPIRES        TTE        TIMER_LIST     FUNCTION
  4294926000     105  ffffffff972e1740  ffffffff95ab5ac0  <delayed_work_timer_fn>
  4294926000     105  ffff9e944da279a0  ffffffff95ab5ac0  <delayed_work_timer_fn>
  4294967411   41516  ffff9e944da2a088  ffffffff95ab7180  <idle_worker_timeout>

TIMER_BASES[1][BASE_STD]: ffff9e944da5b600
  EXPIRES        TTE          TIMER_LIST     FUNCTION
  4294925997       102  ffff9e94389ca348  ffffffff95ab5ac0  <delayed_work_timer_fn>
  4294926115       220  ffff9e944b0e4850  ffffffff9648f030  <neigh_timer_handler>
  4294926224       329  ffff9e944b3e1448  ffffffff95ab5ac0  <delayed_work_timer_fn>
  4294967497     41602  ffff9e944da56c20  ffffffff95a4b710  <mce_timer_fn>
  4296714848   1788953  ffff9e9437ef2d58  ffffffff96517b70  <tcp_keepalive_timer>
TIMER_BASES[1][BASE_DEF]: ffff9e944da5c880
  EXPIRES        TTE        TIMER_LIST     FUNCTION
  4294929383    3488  ffffffff97440ea8  ffffffff95ab5ac0  <delayed_work_timer_fn>
  4294967435   41540  ffff9e944bc18c48  ffffffff95ab7180  <idle_worker_timeout>
  4294968970   43075  ffff9e944da6a088  ffffffff95ab7180  <idle_worker_timeout>

Typical scenarios

log + bt + dis

Use log to read the kernel log, bt to inspect the stack, and dis to see the source line and disassembly at the point of failure.
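
This sequence can be scripted with the -i option described earlier. The sketch below only generates the command file; the function name write_triage_cmds and the default symbol are illustrative assumptions, and the final crash invocation is left as a comment because it needs your own vmlinux and dump file.

```shell
#!/bin/sh
# Write a crash command file that runs the log + bt + dis sequence.
# $1: output file, $2: symbol to disassemble (default vfs_write, as an example).
write_triage_cmds() {
    out="$1"
    sym="${2:-vfs_write}"
    {
        echo "log"
        echo "bt -l"
        echo "dis -s $sym"
        echo "exit"
    } > "$out"
}

cmds=$(mktemp)
write_triage_cmds "$cmds" sysrq_handle_crash
cat "$cmds"
# Then, with VMLINUX and DUMP pointing at your own files:
#   crash -s -i "$cmds" "$VMLINUX" "$DUMP" > triage.txt
```

Ending the command file with exit makes crash quit after the last command, so the whole run can be redirected to a file.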

dev+irq+mod+mount+net+ps+runq+ipcs

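Review the system's device list, interrupt assignment, loaded modules, mounted filesystems, network configuration, process states, per-CPU run queues, and IPC (shared memory) usage.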
