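1. Overview

Saving a Linux kernel crash dump involves two parts. First, when the kernel hits a fatal exception, the system must reboot, and that reboot must be distinguishable from a normal (power-on) reboot. Second, after the reboot, the boot code checks the reboot reason; if it is the reboot mode set by the kernel, RAM is saved as an ELF-format file so that the crash analysis tool provided for Linux can be used on it.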


2. What happens after a kernel crash


Fig1

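Fig1 shows the flow after a kernel crash caused by a data access abort; other exceptions are handled in a similar way.

Depending on the current exception mode, do_DataAbort is called. It reads the fault status register to determine the specific data abort type and then calls __do_kernel_fault, which in turn calls die. die has two parts: __die, which prints the oops, and crash_kexec, which triggers the reset.

The content of the oops printout is controlled by the function __die.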
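As an illustration of how the fault status register selects the abort type, here is a minimal sketch in C. It assumes the ARMv7 short-descriptor DFSR layout (status bits [3:0] plus bit 10, write flag in bit 11); the helper names are chosen for this example only. The value 0x805 is the error code from the "Internal error: Oops: 805" line in the oops log in section 2.1.

```c
#include <stdint.h>
#include <stdio.h>

/* Bit layout of the data fault status register (short-descriptor format). */
#define FSR_FS3_0  0x0000000fu   /* fault status bits [3:0] */
#define FSR_FS4    (1u << 10)    /* fault status bit [4] */
#define FSR_WRITE  (1u << 11)    /* WnR: set when the aborted access was a write */

/* Combine FS[4] and FS[3:0] into a single 5-bit fault status index. */
static unsigned int fsr_fs(uint32_t fsr)
{
    return (fsr & FSR_FS3_0) | ((fsr & FSR_FS4) >> 6);
}

/* Return non-zero if the aborted access was a write. */
static int fsr_is_write(uint32_t fsr)
{
    return (fsr & FSR_WRITE) != 0;
}

/* Print a one-line summary of a fault status value. */
static void describe_fsr(uint32_t fsr)
{
    printf("fsr=0x%03x fs=0x%02x %s\n",
           (unsigned int)fsr, fsr_fs(fsr),
           fsr_is_write(fsr) ? "write" : "read");
}
```

Calling describe_fsr(0x805) reports a fault status index of 0x05 on a write access. In the short-descriptor format, index 0x05 is a section translation fault, which matches a store through a NULL pointer.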


2.1  oops

The messages commonly seen after a crash are the oops. It contains the following:

[ 86.456972:1] Unable to handle kernel NULL pointer dereference at virtual address 00000000


show_pte:

[ 86.465203:1] pgd = ecad0000

[ 86.468073:1] [00000000] *pgd=00000000

[ 86.471825:1] Internal error: Oops: 805 [#1] PREEMPT SMP ARM

[ 86.477461:1] Modules linked in:

show_regs

[ 86.480691:1] CPU: 1 Not tainted (3.4.0-gc37fe8c #648)

[ 86.486163:1] PC is at sysrq_handle_crash+0x38/0x48

[ 86.491027:1] LR is at _raw_spin_unlock_irqrestore+0x20/0x40

[ 86.496666:1] pc : [<c01ed158>] lr : [<c0569e0c>] psr: 60000093

[ 86.496672:1] sp : e0011ec8 ip : e0011e98 fp : e0011ed4

[ 86.508454:1] r10: e0011f70 r9 : e0010000 r8 : 00000000

[ 86.513832:1] r7 : 60000013 r6 : 00000063 r5 : 00000004 r4 : c079fc74

[ 86.520506:1] r3 : 00000000 r2 : 00000001 r1 : 20000093 r0 : 00000001

[ 86.527181:1] Flags: nZCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user

[ 86.534548:1] Control: 10c53c7d Table: aead004a DAC: 00000015


[ 86.540446:1] PC: 0xc01ed0d8:


[ 86.545053:1] d0d8 e28800e4 e1a01007 e3a02008 ebff1b75 e1a0000a eb0de7f5 e3a00000 e89daff8

[ 86.553483:1] d0f8 e1a0c00d e92dd800 e24cb004 e59f3010 e5932000 e3520000 13a00001 05d30004


[ 86.612492:1] LR: 0xc0569d8c:

[ 86.617100:1] 9d8c e92dd818 e24cb004 e1a0400e ebf16ad3 e3a00001 ebeb8b6c e1a00004 ebeb0b9c

[ 86.625527:1] 9dac e89da818 e1a0c00d e92dd800 e24cb004 ebf16aca f1080080 e3a00001 ebeb8b62


[ 86.684534:1] SP: 0xe0011e48:


[ 86.689141:1] 1e48 342e3638 39323335 5d313a36 e0010020 e0011e8c e0011e68 c01ed158 60000093

[ 86.697572:1] 1e68 ffffffff e0011eb4 e0011ed4 e0011e80 c000dbd8 c000839c 00000001 20000093

dump_mem(stack)

[ 87.116875:1] Process sh (pid: 2320, stack limit = 0xe00102f0)

[ 87.122684:1] Stack: (0xe0011ec8 to 0xe0012000)

[ 87.127199:1] 1ec0: e0011efc e0011ed8 c01ed838 c01ed12c e0011f70 00000002

[ 87.135519:1] 1ee0: c01ed8e4 d9a5a968 00000002 b750cd24 e0011f14 e0011f00 c01ed914 c01ed798

[ 87.143839:1] 1f00: e0011f70 ed98b5e0 e0011f3c e0011f18 c00f7948 c01ed8f0 00000002 d9a5a968

dump_backtrace

[ 87.210392:1] Backtrace:

[ 87.213023:1] [<c01ed120>] (sysrq_handle_crash+0x0/0x48) from [<c01ed838>] (__handle_sysrq+0xac/0x158)

[ 87.222295:1] [<c01ed78c>] (__handle_sysrq+0x0/0x158) from [<c01ed914>] (write_sysrq_trigger+0x30/0x38)

[ 87.231648:1] r8:b750cd24 r7:00000002 r6:d9a5a968 r5:c01ed8e4 r4:00000002

[ 87.238375:1] r3:e0011f70

[ 87.241191:1] [<c01ed8e4>] (write_sysrq_trigger+0x0/0x38) from [<c00f7948>] (proc_reg_write+0x88/0x9c)

[ 87.250458:1] r4:ed98b5e0 r3:e0011f70

[ 87.254224:1] [<c00f78c0>] (proc_reg_write+0x0/0x9c) from [<c00b3384>] (vfs_write+0xb8/0x144)

[ 87.262718:1] [<c00b32cc>] (vfs_write+0x0/0x144) from [<c00b34d4>] (sys_write+0x44/0x70)

[ 87.270773:1] r8:00000002 r7:00000000 r6:00000000 r5:b750cd24 r4:d9a5a968

[ 87.277694:1] [<c00b3490>] (sys_write+0x0/0x70) from [<c000e040>] (ret_fast_syscall+0x0/0x30)

[ 87.286183:1] r8:c000e1e8 r7:00000004 r6:00000001 r5:00000002 r4:00000003


dump_instr

[ 87.293099:1] Code: 0a000000 e12fff33 e3a03000 e3a02001 (e5c32000)


2.2 crash_kexec

This function calls the platform-specific reset function machine_crash_swreset. The log is as follows:

[ 87.299341:1] Enter crash kexec !!

[ 87.302745:0] CPU 0 will stop doing anything useful since another CPU has crashed

[ 87.310903:1] Loading crashdump kernel...

[ 87.314899:1] Software reset on panic!
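The log ends with a software reset, and the overview requires that this reset be distinguishable from a normal one. The sketch below shows one common way to do that: store a magic value in a location that survives a warm reset, then read and clear it in the boot loader. Everything here is hypothetical and for illustration only. The magic values, the reboot_reason_word variable (a stand-in for a platform register or reserved RAM word), and both function names are assumptions, not the actual behavior of machine_crash_swreset.

```c
#include <stdint.h>

/* Hypothetical magic values; a real platform defines its own. */
#define REBOOT_MAGIC_NORMAL 0x00000000u
#define REBOOT_MAGIC_CRASH  0x43525348u   /* ASCII "CRSH" */

/* Stand-in for a register or reserved RAM word that survives a warm reset. */
static volatile uint32_t reboot_reason_word;

/* Kernel side: called on the crash path just before the software reset. */
static void mark_crash_reboot(void)
{
    reboot_reason_word = REBOOT_MAGIC_CRASH;
}

/*
 * Boot loader side: return 1 if the previous reset was tagged as a crash.
 * The word is cleared so that the next boot is not misread as a crash.
 */
static int take_crash_reboot_flag(void)
{
    uint32_t value = reboot_reason_word;

    reboot_reason_word = REBOOT_MAGIC_NORMAL;
    return value == REBOOT_MAGIC_CRASH;
}
```

Clearing the word on read is a deliberate choice in this sketch: if the dump itself fails and the board is power-cycled, the next boot does not try to save a dump again.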


3. Saving the kernel crash dump in u-boot


Fig2

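Fig2 shows the part of the flow after reboot that relates to the kernel crash dump. The boot loader checks the current reset reason to decide whether the reset was the one set on a kernel crash; if so, it enters the crash dump saving flow. Saving the crash dump consists of roughly three steps:

1) Get bootargs and extract the kernel's start address and size from it;

2) Generate the ELF header from the kernel's start address and size;

3) Using the generated ELF header, save the RAM used by the kernel as an ELF-format file.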
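Step 2 can be sketched as follows, using the standard types from <elf.h>. This is a minimal sketch, not the original u-boot code. The struct core_hdr layout, the build_core_hdr function, and the choice of one PT_NOTE plus one PT_LOAD segment are assumptions; the source only shows that the resulting file has two program headers. The start address and size are passed in as parameters because the bootargs format is not given.

```c
#include <elf.h>
#include <stdint.h>
#include <string.h>

/* Header area at the start of the file: ELF header plus two program headers. */
struct core_hdr {
    Elf32_Ehdr ehdr;
    Elf32_Phdr phdr[2];
};

/* Fill in an ELF32 little-endian ARM core header for one RAM region. */
static void build_core_hdr(struct core_hdr *h, uint32_t ram_start,
                           uint32_t ram_size, uint32_t note_size)
{
    memset(h, 0, sizeof(*h));

    /* Identification bytes: magic, 32-bit class, little endian, version 1. */
    memcpy(h->ehdr.e_ident, ELFMAG, SELFMAG);
    h->ehdr.e_ident[EI_CLASS]   = ELFCLASS32;
    h->ehdr.e_ident[EI_DATA]    = ELFDATA2LSB;
    h->ehdr.e_ident[EI_VERSION] = EV_CURRENT;

    h->ehdr.e_type      = ET_CORE;
    h->ehdr.e_machine   = EM_ARM;
    h->ehdr.e_version   = EV_CURRENT;
    h->ehdr.e_phoff     = sizeof(Elf32_Ehdr);   /* program headers follow directly */
    h->ehdr.e_ehsize    = sizeof(Elf32_Ehdr);
    h->ehdr.e_phentsize = sizeof(Elf32_Phdr);
    h->ehdr.e_phnum     = 2;

    /* Assumed segment 0: note data placed right after the headers. */
    h->phdr[0].p_type   = PT_NOTE;
    h->phdr[0].p_offset = sizeof(*h);
    h->phdr[0].p_filesz = note_size;

    /* Assumed segment 1: the RAM image; p_vaddr is left 0 in this sketch. */
    h->phdr[1].p_type   = PT_LOAD;
    h->phdr[1].p_offset = sizeof(*h) + note_size;
    h->phdr[1].p_paddr  = ram_start;
    h->phdr[1].p_filesz = ram_size;
    h->phdr[1].p_memsz  = ram_size;
    h->phdr[1].p_flags  = PF_R | PF_W | PF_X;
}
```

With these types the ELF header is 52 bytes and each program header is 32 bytes, so the program headers start at offset 52, consistent with the readelf output for cdump_0.elf. Step 3 would then write this header area, the note data, and ram_size bytes starting at ram_start to the file, in that order.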


readelf -h cdump_0.elf

ELF Header:

Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00

Class: ELF32

Data: 2's complement, little endian

Version: 1 (current)

OS/ABI: UNIX - System V

ABI Version: 0

Type: CORE (Core file)

Machine: ARM

Version: 0x1

Entry point address: 0x0

Start of program headers: 52 (bytes into file)

Start of section headers: 0 (bytes into file)

Flags: 0x0

Size of this header: 52 (bytes)

Size of program headers: 32 (bytes)

Number of program headers: 2

Size of section headers: 0 (bytes)

Number of section headers: 0

Section header string table index: 0
