A faulty kernel module

Kernel module source file oops.c

#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/init.h>

static void create_oops(void)
{
	*(int *)0 = 0;	/* deliberate write through a NULL pointer to trigger the Oops */
}
static int __init my_oops_init(void)
{
	printk("oops from the module\n");
	create_oops();
	return (0);
}
static void __exit my_oops_exit(void)
{
	printk("Goodbye world\n");
}
module_init(my_oops_init);
module_exit(my_oops_exit);
MODULE_LICENSE("GPL");

Makefile for building the module

EXTRA_CFLAGS  += -g
obj-m += oops.o

all:
	$(MAKE) -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules
clean:
	$(MAKE) -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean
	rm -f modules.order Module.symvers Module.markers

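Note:

        EXTRA_CFLAGS += -g must be present in the Makefile so that the module is built with debug information; the gdb and objdump steps later in this article depend on it. Newer kernels prefer the name ccflags-y for the same purpose.

Build the module and trigger the Oops

Load the module with insmod oops.ko. The insmod process is reported as "Killed"; the fault is not severe enough to hang or reboot the system.

dmesg then shows the following Oops report: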

[  933.023020] oops: loading out-of-tree module taints kernel.
[  933.023046] oops: module verification failed: signature and/or required key missing - tainting kernel
[  933.023611] oops from the module
[  933.023615] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[  933.023621] IP: my_oops_init+0x15/0x1000 [oops]
[  933.023622] PGD 0 P4D 0 
[  933.023624] Oops: 0002 [#1] SMP PTI
[  933.023626] Modules linked in: oops(OE+) vmw_vsock_vmci_transport vsock crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc snd_ens1371 snd_ac97_codec gameport aesni_intel ac97_bus snd_pcm aes_x86_64 crypto_simd snd_seq_midi glue_helper snd_seq_midi_event joydev cryptd snd_rawmidi vmw_balloon input_leds serio_raw snd_seq snd_seq_device intel_rapl_perf shpchp snd_timer mac_hid snd i2c_piix4 soundcore vmw_vmci parport_pc ppdev lp parport autofs4 hid_generic usbhid hid psmouse vmwgfx ttm drm_kms_helper ahci libahci syscopyarea sysfillrect mptspi sysimgblt fb_sys_fops pcnet32 mii drm mptscsih mptbase scsi_transport_spi pata_acpi
[  933.023642] CPU: 1 PID: 4198 Comm: insmod Tainted: G           OE    4.15.0-123-generic #126~16.04.1-Ubuntu
[  933.023643] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/29/2019
[  933.023645] RIP: 0010:my_oops_init+0x15/0x1000 [oops]
[  933.023646] RSP: 0018:ffffa90044b67c68 EFLAGS: 00010286
[  933.023647] RAX: 0000000000000014 RBX: ffffffffc0645000 RCX: 0000000000000006
[  933.023648] RDX: 0000000000000000 RSI: 0000000000000096 RDI: ffff8ed839656490
[  933.023649] RBP: ffffa90044b67c68 R08: 0000000000021d54 R09: ffff8ed83fec8000
[  933.023650] R10: ffffe45282d98e40 R11: 000000000000062d R12: ffffffffc0648000
[  933.023651] R13: 0000000000000000 R14: 0000000000000001 R15: ffff8ed7a4ce3b40
[  933.023652] FS:  00007fc14dfbe700(0000) GS:ffff8ed839640000(0000) knlGS:0000000000000000
[  933.023653] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  933.023654] CR2: 0000000000000000 CR3: 00000000a4d48001 CR4: 00000000003606e0
[  933.023674] Call Trace:
[  933.023680]  do_one_initcall+0x55/0x1ac
[  933.023683]  ? _cond_resched+0x1a/0x50
[  933.023686]  ? kmem_cache_alloc_trace+0x165/0x1c0
[  933.023690]  do_init_module+0x5f/0x223
[  933.023692]  load_module+0x188c/0x1ea0
[  933.023696]  ? ima_post_read_file+0x83/0xa0
[  933.023698]  SYSC_finit_module+0xe5/0x120
[  933.023700]  ? SYSC_finit_module+0xe5/0x120
[  933.023702]  SyS_finit_module+0xe/0x10
[  933.023703]  do_syscall_64+0x73/0x130
[  933.023706]  entry_SYSCALL_64_after_hwframe+0x41/0xa6
[  933.023707] RIP: 0033:0x7fc14daec599
[  933.023708] RSP: 002b:00007ffc96df4ce8 EFLAGS: 00000206 ORIG_RAX: 0000000000000139
[  933.023709] RAX: ffffffffffffffda RBX: 0000563624a281f0 RCX: 00007fc14daec599
[  933.023710] RDX: 0000000000000000 RSI: 0000563622b0426b RDI: 0000000000000003
[  933.023711] RBP: 0000563622b0426b R08: 0000000000000000 R09: 00007fc14ddb1ea0
[  933.023712] R10: 0000000000000003 R11: 0000000000000206 R12: 0000000000000000
[  933.023712] R13: 0000563624a27130 R14: 0000000000000000 R15: 0000000000000000
[  933.023714] Code: <c7> 04 25 00 00 00 00 00 00 00 00 31 c0 5d c3 00 00 00 00 00 00 00 
[  933.023722] RIP: my_oops_init+0x15/0x1000 [oops] RSP: ffffa90044b67c68
[  933.023723] CR2: 0000000000000000
[  933.023725] ---[ end trace b763de4574e1f5e6 ]---

Analysis

Reading the Oops report

[  933.023624] Oops: 0002 [#1] SMP PTI

0002 is the page-fault error code, printed in hex. Each bit has its own meaning:

  • bit 0: 0 means no page found, 1 means a protection fault
  • bit 1: 0 means read, 1 means write
  • bit 2: 0 means kernel mode, 1 means user mode

Here 0002 therefore means a write, in kernel mode, to an address with no page mapped.

[#1] is the number of times the Oops has occurred. Several Oopses can be triggered as a cascading effect of the first one.

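
As a quick check of the bit meanings listed above, the following minimal sketch decodes the low three bits of the error code. The function name decode_oops_code is hypothetical and only used for illustration; it takes the hex digits exactly as they appear after "Oops:".

```shell
# Decode the low three bits of an x86 page-fault error code.
# decode_oops_code is a hypothetical helper, not a kernel or distro tool.
decode_oops_code() {
    code=$(( 0x$1 ))    # "0002" -> 2
    if [ $(( code & 1 )) -ne 0 ]; then cause="protection fault"; else cause="no page found"; fi
    if [ $(( code & 2 )) -ne 0 ]; then access="write"; else access="read"; fi
    if [ $(( code & 4 )) -ne 0 ]; then mode="user-mode"; else mode="kernel"; fi
    echo "$cause, $access, $mode"
}

decode_oops_code 0002    # prints: no page found, write, kernel
```
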
[  933.023642] CPU: 1 PID: 4198 Comm: insmod Tainted: G           OE    4.15.0-123-generic #126~16.04.1-Ubuntu

This line shows that the Oops occurred on CPU 1 while process 4198 (insmod) was running.

The key line is the following one: the fault occurred inside my_oops_init, at offset 0x15.

[  933.023621] IP: my_oops_init+0x15/0x1000 [oops]

Locating the fault

From the analysis above, the most important lines are:

[  933.023621] IP: my_oops_init+0x15/0x1000 [oops]
[  933.023722] RIP: my_oops_init+0x15/0x1000 [oops] RSP: ffffa90044b67c68

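The kernel faulted while executing the instruction at my_oops_init+0x15/0x1000, so all that remains is to find the source code that corresponds to this address.

The format is symbol+offset/length:
my_oops_init is the function in which the fault occurred
0x15 is the offset of the faulting instruction from the start of that function
0x1000 is the size the kernel reports for my_oops_init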
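
The symbol+offset/length format can be taken apart with plain shell parameter expansion. This is only a sketch; parse_oops_location is a hypothetical helper name, and the offset and length are printed in decimal for convenience.

```shell
# Split "symbol+offset/length" from an Oops IP/RIP line into its parts.
# parse_oops_location is a hypothetical helper used for illustration.
parse_oops_location() {
    loc=$1                  # e.g. my_oops_init+0x15/0x1000
    func=${loc%%+*}         # text before the first '+'  -> my_oops_init
    rest=${loc#*+}          # text after the first '+'   -> 0x15/0x1000
    offset=${rest%%/*}      # text before the '/'        -> 0x15
    length=${rest#*/}       # text after the '/'         -> 0x1000
    echo "function=$func offset=$(( $offset )) length=$(( $length ))"
}

parse_oops_location 'my_oops_init+0x15/0x1000'
# prints: function=my_oops_init offset=21 length=4096
```
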

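Method 1: list the faulting code directly with gdb

Because the fault is in a driver, run gdb directly on the driver's .ko file. If the Oops came from the kernel itself, run gdb on vmlinux (in the root of the kernel build tree) instead.

gdb oops.ko (or gdb oops.o)
l *(my_oops_init+0x15)

Format:
    l *(function name + offset) or l *(function entry address + offset)

Note:

       The .ko file must be the same build as the one that produced the Oops. If the code is built into the kernel, the corresponding .o file can be used instead.

gdb reports line 7 of oops.c as the faulting line, which is the NULL-pointer write in create_oops. The same method works for an Oops in the kernel itself: use vmlinux in place of the .ko file.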
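
The interactive steps of this method can also be run in one shot with gdb's -batch and -ex options. The sketch below only builds and prints that command line from an Oops location string, so it can be inspected before running; gdb_list_cmd is a hypothetical helper name, and the object file passed to it is assumed to be the build that produced the Oops.

```shell
# Build a one-shot gdb command from "symbol+offset/length" and an object file.
# gdb_list_cmd is a hypothetical helper; it prints the command, it does not run gdb.
gdb_list_cmd() {
    loc=${1%%/*}    # drop the "/length" part -> my_oops_init+0x15
    obj=$2          # .ko, .o or vmlinux built with debug information
    printf "gdb -batch -ex 'list *(%s)' %s\n" "$loc" "$obj"
}

gdb_list_cmd 'my_oops_init+0x15/0x1000' oops.ko
# prints: gdb -batch -ex 'list *(my_oops_init+0x15)' oops.ko
```
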

Method 2: disassemble in gdb and map the address back to source

For a driver, the load address of the init code can be read from /sys/module/<driver name>/sections/.init.text

cat /sys/module/oops/sections/.init.text
0xffffffffc0648000
gdb oops.ko
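# load the symbol file into the debugger at that address
add-symbol-file oops.o 0xffffffffc0648000
# disassemble my_oops_init to see its virtual addresses
disassemble my_oops_init
# list the source code for the faulting address
l *(function name + offset)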

Method 3: disassemble with objdump

objdump -S oops.o

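The faulting machine code can be read from the Code line of the log, where the byte in angle brackets is the one at the faulting instruction pointer. Here the bytes c7 04 25 00 00 00 00 00 00 00 00 encode movl $0x0,0x0, the write to address 0. Finding the same bytes in the objdump output leads to the matching source line: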

[  933.023714] Code: <c7> 04 25 00 00 00 00 00 00 00 00 31 c0 5d c3 00 00 00 00 00 00 00 
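
To search the disassembly for that instruction, the first few bytes starting at the <..> marker can be extracted from the Code line and used as a search pattern. This is a minimal sketch: faulting_bytes is a hypothetical helper name, and the default of three bytes is an arbitrary choice that happens to cover the opcode here.

```shell
# Print the first N bytes (default 3) of an Oops "Code:" line,
# starting at the byte marked with <..>, i.e. the byte at the faulting RIP.
# faulting_bytes is a hypothetical helper used for illustration.
faulting_bytes() {
    n=${2:-3}
    printf '%s\n' "$1" |
        sed -e 's/.*<\([0-9a-f][0-9a-f]\)>/\1/' |   # drop everything before the marker and the brackets
        cut -d' ' -f1-"$n"                          # keep the first N space-separated bytes
}

faulting_bytes '[  933.023714] Code: <c7> 04 25 00 00 00 00 00 00 00 00 31 c0 5d c3 00 00 00 00 00 00 00'
# prints: c7 04 25
```

The result can then be used as a search string, for example objdump -S oops.o | grep 'c7 04 25', assuming oops.o is the same build that produced the Oops.
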

References

https://cloud.tencent.com/developer/article/1463579

https://kernel.blog.csdn.net/article/details/73715860
