[Click] Help with kernel OOPS

Vivek raghunathan vivek.raghunathan at gmail.com
Wed Sep 27 15:04:06 EDT 2006


Eddie,

The following script also generates an oops on uninstall.

AddressInfo(MyEther 00:11:25:2D:7D:33, RemoteEther 00:11:25:47:EA:7B,
           MyIP 10.1.1.2/8, RemoteIP 10.1.1.1/8, BroadcastAddr 10.255.255.255);

splsrc::InfiniteSource(DATA \<aa bb cc dd ee ff>, LIMIT -1, STOP true);
splsrc -> ipenc::IPEncap(222, MyIP, BroadcastAddr);
ipenc -> ethenc::EtherEncap(0x0800, ff:ff:ff:ff:ff:ff, MyEther);
ethenc -> q2::Queue;
q2 -> ToDevice(eth0);

Vivek


On 9/27/06, Vivek raghunathan <vivek.raghunathan at gmail.com> wrote:
> Eddie,
>
> Here's a script without FromHost that generates an oops. This oops
> doesn't hang the machine though ...
>
> AddressInfo(MyEther 00:11:25:2D:7D:33, RemoteEther 00:11:25:47:EA:7B,
>           MyIP 10.1.1.2/8, RemoteIP 10.1.1.1/8, BroadcastAddr 10.255.255.255);
>
> FromDevice(eth0) -> SetPacketType(HOST) -> ToHost(eth0);
>
> splsrc::InfiniteSource(
> DATA \<aa bb cc dd ee ff>, LIMIT -1, STOP true);
> splsrc -> ipenc::IPEncap(222, MyIP, BroadcastAddr);
> ipenc -> ethenc::EtherEncap(0x0800, ff:ff:ff:ff:ff:ff, MyEther);
> ethenc -> q2::Queue;
> q2 -> ToDevice(eth0);
>
> -Vivek
>
>
>
> Sep 27 13:46:28 localhost kernel: [4294861.518000] Unable to handle
> kernel NULL pointer dereference at virtual address 00000000
> Sep 27 13:46:28 localhost kernel: [4294861.518000]  printing eip:
> Sep 27 13:46:28 localhost kernel: [4294861.518000] d124f881
> Sep 27 13:46:28 localhost kernel: [4294861.518000] *pde = 00000000
> Sep 27 13:46:28 localhost kernel: [4294861.518000] Oops: 0000 [#1]
> Sep 27 13:46:28 localhost kernel: [4294861.518000] PREEMPT
> Sep 27 13:46:28 localhost kernel: [4294861.518000] Modules linked in:
> click proclikefs rfcomm l2cap bluetooth nvram uinput ppdev radeon drm
> speedstep_centrino cpufreq_userspace cpufreq_stats freq_table
> cpufreq_powersave cpufreq_ondemand cpufreq_conservative video ibm_acpi
> container button battery ac ipv6 dm_mod md_mod lp af_packet airo_cs
> airo pcmcia joydev tsdev e100 ipw2200 mii ide_cd cdrom ieee80211
> ieee80211_crypt yenta_socket rsrc_nonstatic pcmcia_core snd_intel8x0
> snd_ac97_codec snd_ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm
> snd_timer hw_random psmouse snd soundcore parport_pc parport ehci_hcd
> uhci_hcd shpchp pci_hotplug usbcore serio_raw pcspkr floppy
> snd_page_alloc rtc intel_agp agpgart evdev ext3 jbd mbcache ide_disk
> ide_generic via82cxxx trm290 triflex slc90e66 sis5513 siimage
> serverworks sc1200 rz1000 piix pdc202xx_old pdc202xx_new opti621
> ns87415 it821x hpt366 hpt34x generic cy82c693 cs5535 cs5530 cs5520
> cmd64x atiixp amd74xx alim15x3 aec62xx thermal processor fan
> Sep 27 13:46:28 localhost kernel: [4294861.518000] CPU:    0
> Sep 27 13:46:28 localhost kernel: [4294861.518000] EIP:
> 0060:[pg0+267880577/1053979648]    Not tainted VLI
> Sep 27 13:46:28 localhost kernel: [4294861.518000] EFLAGS: 00010282
> (2.6.16.13 #6)
> Sep 27 13:46:28 localhost kernel: [4294861.518000] EIP is at
> _ZN7Element4pushEiP6Packet+0x1d/0x3c [click]
> Sep 27 13:46:28 localhost kernel: [4294861.518000] eax: c6c21a94
> ebx: c6c21a80   ecx: d124f864   edx: 00000000
> Sep 27 13:46:28 localhost kernel: [4294861.518000] esi: cb93ed40
> edi: 00000000   ebp: 00000001   esp: cfd63f70
> Sep 27 13:46:28 localhost kernel: [4294861.518000] ds: 007b   es: 007b
>   ss: 0068
> Sep 27 13:46:28 localhost kernel: [4294861.518000] Process kclick
> (pid: 4057, threadinfo=cfd62000 task=c596f560)
> Sep 27 13:46:28 localhost kernel: [4294861.518000] Stack: <0>cb93ed40
> c6c21980 d12a4f82 c6c21a80 00000000 cb93ed40 cb93ed4c cc20ed80
> Sep 27 13:46:28 localhost kernel: [4294861.518000]        00000001
> 00000080 cf6704c0 0003e504 d12646c9 c6c21980 c6c219ec cf670c5c
> Sep 27 13:46:28 localhost kernel: [4294861.518000]        00000010
> 00000020 d12c3061 00000010 ccbd7e00 c596f560 cf6704c0 cfd62000
> Sep 27 13:46:28 localhost kernel: [4294861.518000] Call Trace:
> Sep 27 13:46:28 localhost kernel: [4294861.518000]
> [pg0+268230530/1053979648]
> _ZN14InfiniteSource8run_taskEP4Task+0xb6/0x12c [click]
> Sep 27 13:46:28 localhost kernel: [4294861.518000]
> [pg0+267966153/1053979648] _ZN12RouterThread6driverEv+0x12d/0x2a0
> [click]
> Sep 27 13:46:28 localhost kernel: [4294861.518000]
> [pg0+268353633/1053979648] _ZN6VectorIiE7reserveEi+0x2d/0x8c [click]
> Sep 27 13:46:28 localhost kernel: [4294861.518000]
> [pg0+268314274/1053979648] _Z11click_schedPv+0x8e/0x164 [click]
> Sep 27 13:46:28 localhost kernel: [4294861.518000]
> [pg0+268314132/1053979648] _Z11click_schedPv+0x0/0x164 [click]
> Sep 27 13:46:28 localhost kernel: [4294861.518000]
> [kernel_thread_helper+5/12] kernel_thread_helper+0x5/0xc
> Sep 27 13:46:28 localhost kernel: [4294861.518000] Code: c0 5b c3 8d
> 76 00 b8 ff ff ff ff 5b c3 90 56 53 8b 5c 24 0c 8b 03 ff 74 24 14 53
> ff 50 10 89 c6 58 5a 85 f6 74 20 8b 43 08 8b 10 <8b> 0a 89 74 24 14 8b
> 40 04 89 44 24 10 89 54 24 0c 8b 49 08 5b
> Sep 27 13:46:33 localhost kernel: [4294861.518000]  <1>click: current
> router threads refuse to die!
> Sep 27 13:46:33 localhost kernel: [4294866.502000] click: Following
> threads still active, expect a crash:
>
>
>
> On 9/27/06, Eddie Kohler <kohler at cs.ucla.edu> wrote:
> > Another question: Can you make the oops happen in a configuration without
> > FromHost?
> >
> > FromHost installs a new networking device in the kernel.  When FromHost is
> > cleaned up, this networking device is unregistered.  It looks like Linux wants
> > to schedule() during the process of unregistering the network device.  Click
> > does not want Linux to schedule().  This is the "scheduling while atomic" message.
> >
> > The thing that's weird is that ToDevice should already have been removed from
> > the scheduling list, even before the "scheduling while atomic" message.
> >
> > Eddie
> >
> >
> > Vivek raghunathan wrote:
> > > All,
> > >
> > > The bug I reported is not specific to my code, and is probably a
> > > ToDevice race condition that I was inadvertently triggering. Using the
> > > following configuration generates the same kernel oops with EIP at
> > > ToDevice::run_task() in interrupt context. (My Ethernet NIC is an Intel
> > > Pro/100 using the e100 driver.)
> > >
> > >   AddressInfo(MyEther 00:11:25:2D:7D:33, RemoteEther 00:11:25:47:EA:7B,
> > >             MyIP 10.1.1.2/8, RemoteIP 10.1.1.1/8, BroadcastAddr 10.255.255.255);
> > >
> > >   FromHost(fak0, MyIP, ETHER MyEther) -> q1::Queue
> > >   q1 -> [0]prio::PrioSched -> Print(test_tx, 100) -> ToDevice(eth0);
> > >   FromDevice(eth0) -> Print(test_rx, 100) -> SetPacketType(HOST) ->
> > > ToHost(fak0);
> > >
> > >   splsrc::InfiniteSource(
> > >   DATA \<aa bb cc dd ee ff>, LIMIT -1, STOP true);
> > >   splsrc -> ipenc::IPEncap(222, MyIP, BroadcastAddr);
> > >   ipenc -> ethenc::EtherEncap(0x0800, ff:ff:ff:ff:ff:ff, MyEther);
> > >   ethenc -> q2::Queue;
> > >   q2 -> [1]prio;
> > >
> > > -Vivek
> > >
> > >
> > >
> > > On 9/19/06, Vivek raghunathan <vivek.raghunathan at gmail.com> wrote:
> > >> Hi all.
> > >>
> > >> I am currently implementing a Click-based opportunistic packet
> > >> combination engine for use on top of IEEE 802.11. I've unit tested my
> > >> implementation fairly extensively in user space, and partly
> > >> unit-tested in kernel space, and haven't had any issues so far. I
> > >> recently moved to doing integration testing, and the code seems to run
> > >> okay in-kernel without any problems, except that every so often (maybe
> > >> 6 out of 10 times), click-uninstall causes a kernel panic in interrupt
> > >> context on cleanup.
> > >>
> > >> The panic seems to be related to my code on the tx output path, since
> > >> it only appears for a few particular configurations, and only when
> > >> some of my elements are introduced. The configuration I am using that
> > >> triggers the panic is attached.  I've also manually copied the
> > >> oops-trace from the screen, and attached it with this email. A
> > >> register dump using sysrq does not produce any additional useful info,
> > >> so I have excluded it. It seems like the panic is triggered somewhere
> > >> in ToDevice::run_task. I realize that some brain-dead bug in my code
> > >> is probably at fault, and am currently double-checking everything I've
> > >> written. I am posting here mainly because I am not sure if this is a
> > >> ToDevice bug that I am inadvertently triggering.
> > >>
> > >> Additionally, I'm having trouble getting ksymoops to run with click.
> > >> Any ideas on how I go about it? (I've also tried using kexec/kdump,
> > >> but it seems like these are very twitchy about what kernel config is
> > >> used, and have issues with the one I am using).
> > >>
> > >> Vivek
> > >>
> > >
> > >
> >
>


-- 

---

*************************************
Vivek Raghunathan,
PhD student,
University of Illinois, Urbana-Champaign

Contact Details:
1012 W. Clark St #31,
Urbana IL 61801

ph: 217-766-1868 (cell)
    217-333-7541 (off)

