Linux – How to debug a kernel module in which a NULL pointer appears

assemblycdebuggingkernel-moduleslinux

I have a custom kernel module that I compiled from this patch that adds support for the logitech G19 keyboard among other G series devices. I compiled it just fine against Ubuntu's maverick kernel's master branch (2.6.35).

I can boot and load the module, but I'm running into a really strange situation. As soon as I load the module (either on boot or through modprobe), I get a black screen and my console locks up.

The weird part is that it doesn't lock my system up, it's just the current console session. I can SSH into my box, and it gives me a terminal and a session. And I can type, and I can even run a command and it gives me the output. It then draws my next prompt and immediately locks up.

I see in dmesg that there's a null pointer, and I get the following stacktrace:

[  956.215836] input: Logitech G19 Gaming Keyboard as /devices/pci0000:00/0000:00:1d.7/usb1/1-2/1-2.1/1-2.1.2/1-2.1.2:1.1/input/input5
[  956.216023] hid-g19 0003:046D:C229.0004: input,hiddev97,hidraw3: USB HID v1.11 Keypad [Logitech G19 Gaming Keyboard] on usb-0000:00:1d.7-2.1.2/input1
[  956.216065] input: Logitech G19 as /devices/pci0000:00/0000:00:1d.7/usb1/1-2/1-2.1/1-2.1.2/1-2.1.2:1.1/input/input6
[  956.216128] Registered led device: g19_97:orange:m1
[  956.216146] Registered led device: g19_97:orange:m2
[  956.216178] Registered led device: g19_97:orange:m3
[  956.216198] Registered led device: g19_97:red:mr
[  956.216216] Registered led device: g19_97:red:bl
[  956.216235] Registered led device: g19_97:green:bl
[  956.216259] Registered led device: g19_97:blue:bl
[  956.216872] Console: switching to colour frame buffer device 40x30
[  956.216899] BUG: unable to handle kernel NULL pointer dereference at 000000000000001c
[  956.216903] IP: [<ffffffffa040b21b>] sys_imageblit+0x21b/0x4ec [sysimgblt]
[  956.216911] PGD 273554067 PUD 2726ca067 PMD 0 
[  956.216914] Oops: 0000 [#1] SMP 
[  956.216917] last sysfs file: /sys/devices/pci0000:00/0000:00:1d.7/usb1/1-2/1-2.1/1-2.1.2/1-2.1.2:1.1/usb/hiddev1/uevent
[  956.216921] CPU 5 
[  956.216922] Modules linked in: hid_g19(+) led_class hid_gfb fb_sys_fops sysimgblt sysfillrect syscopyarea btrfs zlib_deflate crc32c libcrc32c ufs qnx4 hfsplus hfs minix ntfs vfat msdos fat jfs xfs exportfs reiserfs snd_hda_codec_atihdmi snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device ioatdma snd i5000_edac soundcore snd_page_alloc psmouse edac_core i5k_amb shpchp serio_raw dca ppdev parport_pc lp parport usbhid hid floppy e1000e
[  956.216953] 
[  956.216956] Pid: 3147, comm: modprobe Not tainted 2.6.35-26-generic #46 DSBF-DE/System Product Name
[  956.216959] RIP: 0010:[<ffffffffa040b21b>]  [<ffffffffa040b21b>] sys_imageblit+0x21b/0x4ec [sysimgblt]
[  956.216963] RSP: 0018:ffff8802766db738  EFLAGS: 00010246
[  956.216965] RAX: 0000000000000000 RBX: ffff880273e71000 RCX: ffff880272e93b40
[  956.216968] RDX: 0000000000000007 RSI: 0000000000000010 RDI: ffff880272e93b40
[  956.216970] RBP: ffff8802766db7d8 R08: 0000000000000000 R09: ffff880272e93b98
[  956.216972] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
[  956.216974] R13: 0000000000000010 R14: 0000000000000008 R15: ffff8802766db8c8
[  956.216977] FS:  00007fcae7725700(0000) GS:ffff880001f40000(0000) knlGS:0000000000000000
[  956.216979] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  956.216981] CR2: 000000000000001c CR3: 000000026ba26000 CR4: 00000000000006e0
[  956.216983] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  956.216986] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  956.216988] Process modprobe (pid: 3147, threadinfo ffff8802766da000, task ffff8802696a16e0)
[  956.216990] Stack:
[  956.216991]  ffff8802766db778 ffffffff810746ae ffff8802766db700 ffff88026b2cadc0
[  956.216994] <0> ffff8802766db778 ffffffff812beef9 ffff8802f66db947 ffff8802766db94f
[  956.216998] <0> ffff8802766db848 00000000812bf22e ffff880272e93b40 ffffffff812feb40
[  956.217001] Call Trace:
[  956.217011]  [<ffffffff810746ae>] ? send_signal+0x3e/0x90
[  956.217018]  [<ffffffff812beef9>] ? put_dec+0x59/0x60
[  956.217023]  [<ffffffff812feb40>] ? fbcon_resize+0xd0/0x230
[  956.217027]  [<ffffffffa04175da>] gfb_fb_imageblit+0x1a/0x30 [hid_gfb]
[  956.217031]  [<ffffffff813051b9>] soft_cursor+0x1c9/0x270
[  956.217034]  [<ffffffff81304e8b>] bit_cursor+0x65b/0x6c0
[  956.217037]  [<ffffffff812c1796>] ? vsnprintf+0x316/0x5a0
[  956.217043]  [<ffffffff81061045>] ? try_acquire_console_sem+0x15/0x60
[  956.217046]  [<ffffffff81300ca8>] fbcon_cursor+0x1d8/0x340
[  956.217049]  [<ffffffff81304830>] ? bit_cursor+0x0/0x6c0
[  956.217054]  [<ffffffff81368139>] hide_cursor+0x29/0x90
[  956.217057]  [<ffffffff8136b078>] redraw_screen+0x148/0x240
[  956.217060]  [<ffffffff8136b42e>] bind_con_driver+0x2be/0x3b0
[  956.217063]  [<ffffffff8136b569>] take_over_console+0x49/0x70
[  956.217066]  [<ffffffff812ff7fb>] fbcon_takeover+0x5b/0xb0
[  956.217069]  [<ffffffff81303ca5>] fbcon_event_notify+0x5c5/0x650
[  956.217076]  [<ffffffff8158e7f6>] notifier_call_chain+0x56/0x80
[  956.217080]  [<ffffffff8108510a>] __blocking_notifier_call_chain+0x5a/0x80
[  956.217084]  [<ffffffff81085146>] blocking_notifier_call_chain+0x16/0x20
[  956.217089]  [<ffffffff812f366b>] fb_notifier_call_chain+0x1b/0x20
[  956.217092]  [<ffffffff812f4c8c>] register_framebuffer+0x1ec/0x2e0
[  956.217098]  [<ffffffff814084f8>] ? usb_init_urb+0x28/0x40
[  956.217101]  [<ffffffffa041790f>] gfb_probe+0x21f/0x4f0 [hid_gfb]
[  956.217107]  [<ffffffffa0425778>] g19_probe+0x558/0xedc [hid_g19]
[  956.217115]  [<ffffffff811c059c>] ? sysfs_do_create_link+0xec/0x210
[  956.217128]  [<ffffffffa00330c7>] hid_device_probe+0x77/0xf0 [hid]
[  956.217131]  [<ffffffff81388aa2>] ? driver_sysfs_add+0x62/0x90
[  956.217134]  [<ffffffff81388bc8>] really_probe+0x68/0x190
[  956.217138]  [<ffffffff81388d35>] driver_probe_device+0x45/0x70
[  956.217140]  [<ffffffff81388dfb>] __driver_attach+0x9b/0xa0
[  956.217143]  [<ffffffff81388d60>] ? __driver_attach+0x0/0xa0
[  956.217146]  [<ffffffff81388008>] bus_for_each_dev+0x68/0x90
[  956.217149]  [<ffffffff81388a3e>] driver_attach+0x1e/0x20
[  956.217151]  [<ffffffff813882fe>] bus_add_driver+0xde/0x280
[  956.217154]  [<ffffffff81389140>] driver_register+0x80/0x150
[  956.217157]  [<ffffffff8158e7f6>] ? notifier_call_chain+0x56/0x80
[  956.217161]  [<ffffffffa042a000>] ? g19_init+0x0/0x20 [hid_g19]
[  956.217166]  [<ffffffffa0032913>] __hid_register_driver+0x53/0x90 [hid]
[  956.217169]  [<ffffffff81085115>] ? __blocking_notifier_call_chain+0x65/0x80
[  956.217173]  [<ffffffffa042a01e>] g19_init+0x1e/0x20 [hid_g19]
[  956.217178]  [<ffffffff8100204c>] do_one_initcall+0x3c/0x1a0
[  956.217184]  [<ffffffff8109bd9b>] sys_init_module+0xbb/0x200
[  956.217192]  [<ffffffff8100a0f2>] system_call_fastpath+0x16/0x1b
[  956.217195] Code: 83 e1 fc 48 89 4d c8 eb d3 8b 83 14 01 00 00 83 f8 04 74 09 83 f8 02 0f 85 7b 01 00 00 48 8b 4d b0 48 8b 83 00 04 00 00 8b 51 10 <44> 8b 04 90 8b 51 14 8b 3c 90 44 8b 4d ac 45 85 c9 75 16 41 b9 
[  956.217218] RIP  [<ffffffffa040b21b>] sys_imageblit+0x21b/0x4ec [sysimgblt]
[  956.217221]  RSP <ffff8802766db738>
[  956.217223] CR2: 000000000000001c
[  956.217227] ---[ end trace 95d6c6d6913ccc79 ]---

Can anyone point me in the right direction as to how to go about debugging this?

The stacktrace leads me to believe that it's not the hid-g15 driver but the hid-gfb driver, which creates a frame buffer for the LCD on the keyboard. This makes sense since it's locking up my display/console but digging into the kernel code isn't really going anywhere. So much of it is assembly and macro functions.

The last function on the stacktrace that involves my new code is gfb_fb_imageblit. The entirety of that function is

   struct gfb_data *par = info->par;
   sys_imageblit(info, image);
   gfb_fb_update(par);

Am I reading the stacktrace wrong? Am I missing something? Any tips on how to debug this?

Best Answer

First things first, debug the module? Just see if you can load it up in gdb it might point you straight at a line that uses the relevant variable(or close to it).

oh, and you might find this article useful

Related Question