[Click] Help with kernel OOPS

Vivek raghunathan vivek.raghunathan at gmail.com
Tue Oct 3 18:54:02 EDT 2006


Eddie,

It seems that the patch you added still doesn't fix the kernel panics
when CONFIG_PREEMPT is enabled. I'll poke around further over the next
couple of days and keep you posted.
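
For anyone following along, the preemption model a kernel was built with can
be read from its config file. Below is a minimal sketch in shell. The function
name preempt_model and the default path /boot/config-$(uname -r) are
assumptions for illustration (not every distribution installs the config
there), and it only looks at the three CONFIG_PREEMPT* options discussed in
this thread.

```shell
# Report which preemption model a kernel config file selects.
# Usage: preempt_model [config-file]
# The default path is an assumption; adjust it for your distribution.
preempt_model() {
    cfg="${1:-/boot/config-$(uname -r)}"
    if [ ! -r "$cfg" ]; then
        # No readable config file, so nothing can be concluded.
        echo "unknown"
        return 1
    fi
    # "=y" must follow the option name directly, so CONFIG_PREEMPT=y
    # does not match CONFIG_PREEMPT_VOLUNTARY=y.
    if grep -q '^CONFIG_PREEMPT=y' "$cfg"; then
        echo "preempt"
    elif grep -q '^CONFIG_PREEMPT_VOLUNTARY=y' "$cfg"; then
        echo "voluntary"
    elif grep -q '^CONFIG_PREEMPT_NONE=y' "$cfg"; then
        echo "none"
    else
        echo "unknown"
    fi
}
```

Calling preempt_model with no argument checks the running kernel's config, if
it is installed at the assumed path. A result of "preempt" corresponds to the
setup that still panics here; "voluntary" corresponds to the setup reported
further down as not panicking.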

Vivek


On 9/28/06, Eddie Kohler <kohler at cs.ucla.edu> wrote:
> Hi Vivek,
>
> This is good to hear.  I've updated the INSTALL instructions to advise against
> CONFIG_PREEMPT.  I've also added a small patch that might-- MIGHT-- solve your
> problem if CONFIG_PREEMPT is enabled.  If you need CONFIG_PREEMPT, then we'd
> welcome more bug reports and patches.
>
> Eddie
>
>
> Vivek raghunathan wrote:
> > Eddie,
> >
> > With CONFIG_PREEMPT_VOLUNTARY enabled, I don't see any panics either,
> > with or without FromHost. I'll test further and keep you posted if I
> > generate any oops.
> >
> > Thanks for the troubleshooting help.
> >
> > Vivek
> >
> >
> > On 9/27/06, Vivek raghunathan <vivek.raghunathan at gmail.com> wrote:
> >> Eddie,
> >>
> >> To answer your previous questions, Master::kill_router is calling
> >> RouterThread::unschedule_router_tasks.  I'll recompile without
> >> CONFIG_PREEMPT and see if the oopses disappear.
> >>
> >> Vivek
> >>
> >>
> >> On 9/27/06, Eddie Kohler <kohler at cs.ucla.edu> wrote:
> >> > Your oops reports PREEMPT.  I assume this means your kernel has
> >> > CONFIG_PREEMPT.  Can you try recompiling your kernel without
> >> > CONFIG_PREEMPT?  My kernel has CONFIG_PREEMPT_VOLUNTARY, but not
> >> > CONFIG_PREEMPT.  I have run the following bash script, which is like
> >> > your config minus ToHost, with no crash.
> >> >
> >> >
> >> > x="splsrc::InfiniteSource(DATA This_is_a_test_by_Eddie_Kohler, LIMIT -1, STOP true);
> >> > splsrc -> ipenc::IPEncap(222, 131.179.33.137, 131.179.232.51);
> >> > ipenc -> ethenc::EtherEncap(0x0800, ff:ff:ff:ff:ff:ff, ath0);
> >> > ethenc -> q2::Queue;
> >> > q2 -> ToDevice(ath0);"
> >> >
> >> > times=0
> >> > while (($times < 100)); do
> >> >         click-install -e "$x"
> >> >         click-uninstall
> >> >         times=$(($times + 1))
> >> > done
> >> >
> >> >
> >> > Eddie
> >> >
> >> >
> >> > Vivek raghunathan wrote:
> >> > > Eddie,
> >> > >
> >> > > Here's a script without FromHost that generates an oops. This oops
> >> > > doesn't hang the machine though ...
> >> > >
> >> > > AddressInfo(MyEther 00:11:25:2D:7D:33, RemoteEther 00:11:25:47:EA:7B,
> >> > >          MyIP 10.1.1.2/8, RemoteIP 10.1.1.1/8,
> >> > >          BroadcastAddr 10.255.255.255);
> >> > >
> >> > > FromDevice(eth0) -> SetPacketType(HOST) -> ToHost(eth0);
> >> > >
> >> > > splsrc::InfiniteSource(
> >> > > DATA \<aa bb cc dd ee ff>, LIMIT -1, STOP true);
> >> > > splsrc -> ipenc::IPEncap(222, MyIP, BroadcastAddr);
> >> > > ipenc -> ethenc::EtherEncap(0x0800, ff:ff:ff:ff:ff:ff, MyEther);
> >> > > ethenc -> q2::Queue;
> >> > > q2 -> ToDevice(eth0);
> >> > >
> >> > > -Vivek
> >> > >
> >> > >
> >> > >
> >> > > Sep 27 13:46:28 localhost kernel: [4294861.518000] Unable to handle
> >> > > kernel NULL pointer dereference at virtual address 00000000
> >> > > Sep 27 13:46:28 localhost kernel: [4294861.518000]  printing eip:
> >> > > Sep 27 13:46:28 localhost kernel: [4294861.518000] d124f881
> >> > > Sep 27 13:46:28 localhost kernel: [4294861.518000] *pde = 00000000
> >> > > Sep 27 13:46:28 localhost kernel: [4294861.518000] Oops: 0000 [#1]
> >> > > Sep 27 13:46:28 localhost kernel: [4294861.518000] PREEMPT
> >> > > Sep 27 13:46:28 localhost kernel: [4294861.518000] Modules linked in:
> >> > > click proclikefs rfcomm l2cap bluetooth nvram uinput ppdev radeon drm
> >> > > speedstep_centrino cpufreq_userspace cpufreq_stats freq_table
> >> > > cpufreq_powersave cpufreq_ondemand cpufreq_conservative video ibm_acpi
> >> > > container button battery ac ipv6 dm_mod md_mod lp af_packet airo_cs
> >> > > airo pcmcia joydev tsdev e100 ipw2200 mii ide_cd cdrom ieee80211
> >> > > ieee80211_crypt yenta_socket rsrc_nonstatic pcmcia_core snd_intel8x0
> >> > > snd_ac97_codec snd_ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm
> >> > > snd_timer hw_random psmouse snd soundcore parport_pc parport ehci_hcd
> >> > > uhci_hcd shpchp pci_hotplug usbcore serio_raw pcspkr floppy
> >> > > snd_page_alloc rtc intel_agp agpgart evdev ext3 jbd mbcache ide_disk
> >> > > ide_generic via82cxxx trm290 triflex slc90e66 sis5513 siimage
> >> > > serverworks sc1200 rz1000 piix pdc202xx_old pdc202xx_new opti621
> >> > > ns87415 it821x hpt366 hpt34x generic cy82c693 cs5535 cs5530 cs5520
> >> > > cmd64x atiixp amd74xx alim15x3 aec62xx thermal processor fan
> >> > > Sep 27 13:46:28 localhost kernel: [4294861.518000] CPU:    0
> >> > > Sep 27 13:46:28 localhost kernel: [4294861.518000] EIP:
> >> > > 0060:[pg0+267880577/1053979648]    Not tainted VLI
> >> > > Sep 27 13:46:28 localhost kernel: [4294861.518000] EFLAGS: 00010282
> >> > > (2.6.16.13 #6)
> >> > > Sep 27 13:46:28 localhost kernel: [4294861.518000] EIP is at
> >> > > _ZN7Element4pushEiP6Packet+0x1d/0x3c [click]
> >> > > Sep 27 13:46:28 localhost kernel: [4294861.518000] eax: c6c21a94
> >> > > ebx: c6c21a80   ecx: d124f864   edx: 00000000
> >> > > Sep 27 13:46:28 localhost kernel: [4294861.518000] esi: cb93ed40
> >> > > edi: 00000000   ebp: 00000001   esp: cfd63f70
> >> > > Sep 27 13:46:28 localhost kernel: [4294861.518000] ds: 007b   es:
> >> > > 007b   ss: 0068
> >> > > Sep 27 13:46:28 localhost kernel: [4294861.518000] Process kclick
> >> > > (pid: 4057, threadinfo=cfd62000 task=c596f560)
> >> > > Sep 27 13:46:28 localhost kernel: [4294861.518000] Stack: <0>cb93ed40
> >> > > c6c21980 d12a4f82 c6c21a80 00000000 cb93ed40 cb93ed4c cc20ed80
> >> > > Sep 27 13:46:28 localhost kernel: [4294861.518000]        00000001
> >> > > 00000080 cf6704c0 0003e504 d12646c9 c6c21980 c6c219ec cf670c5c
> >> > > Sep 27 13:46:28 localhost kernel: [4294861.518000]        00000010
> >> > > 00000020 d12c3061 00000010 ccbd7e00 c596f560 cf6704c0 cfd62000
> >> > > Sep 27 13:46:28 localhost kernel: [4294861.518000] Call Trace:
> >> > > Sep 27 13:46:28 localhost kernel: [4294861.518000]
> >> > > [pg0+268230530/1053979648]
> >> > > _ZN14InfiniteSource8run_taskEP4Task+0xb6/0x12c [click]
> >> > > Sep 27 13:46:28 localhost kernel: [4294861.518000]
> >> > > [pg0+267966153/1053979648] _ZN12RouterThread6driverEv+0x12d/0x2a0
> >> > > [click]
> >> > > Sep 27 13:46:28 localhost kernel: [4294861.518000]
> >> > > [pg0+268353633/1053979648] _ZN6VectorIiE7reserveEi+0x2d/0x8c [click]
> >> > > Sep 27 13:46:28 localhost kernel: [4294861.518000]
> >> > > [pg0+268314274/1053979648] _Z11click_schedPv+0x8e/0x164 [click]
> >> > > Sep 27 13:46:28 localhost kernel: [4294861.518000]
> >> > > [pg0+268314132/1053979648] _Z11click_schedPv+0x0/0x164 [click]
> >> > > Sep 27 13:46:28 localhost kernel: [4294861.518000]
> >> > > [kernel_thread_helper+5/12] kernel_thread_helper+0x5/0xc
> >> > > Sep 27 13:46:28 localhost kernel: [4294861.518000] Code: c0 5b c3 8d
> >> > > 76 00 b8 ff ff ff ff 5b c3 90 56 53 8b 5c 24 0c 8b 03 ff 74 24 14 53
> >> > > ff 50 10 89 c6 58 5a 85 f6 74 20 8b 43 08 8b 10 <8b> 0a 89 74 24 14 8b
> >> > > 40 04 89 44 24 10 89 54 24 0c 8b 49 08 5b
> >> > > Sep 27 13:46:33 localhost kernel: [4294861.518000]  <1>click: current
> >> > > router threads refuse to die!
> >> > > Sep 27 13:46:33 localhost kernel: [4294866.502000] click: Following
> >> > > threads still active, expect a crash:
> >> > >
> >> > >
> >> > >
> >> > > On 9/27/06, Eddie Kohler <kohler at cs.ucla.edu> wrote:
> >> > >> Another question: Can you make the oops happen in a configuration
> >> > >> without FromHost?
> >> > >>
> >> > >> FromHost installs a new networking device in the kernel.  When
> >> > >> FromHost is cleaned up, this networking device is unregistered.  It
> >> > >> looks like Linux wants to schedule() during the process of
> >> > >> unregistering the network device.  Click does not want Linux to
> >> > >> schedule().  This is the "scheduling while atomic" message.
> >> > >>
> >> > >> The thing that's weird is that ToDevice should already have been
> >> > >> removed from the scheduling list, even before the "scheduling while
> >> > >> atomic" message.
> >> > >>
> >> > >> Eddie
> >> > >>
> >> > >>
> >> > >> Vivek raghunathan wrote:
> >> > >> > All,
> >> > >> >
> >> > >> > The bug I reported is not specific to my code, and is probably a
> >> > >> > ToDevice race condition that I was inadvertently triggering. Using
> >> > >> > the following configuration generates the same kernel oops with EIP
> >> > >> > at ToDevice::run_task() in interrupt context. (My Ethernet NIC is an
> >> > >> > Intel Pro/100 using the e100 driver.)
> >> > >> >
> >> > >> >   AddressInfo(MyEther 00:11:25:2D:7D:33, RemoteEther 00:11:25:47:EA:7B,
> >> > >> >             MyIP 10.1.1.2/8, RemoteIP 10.1.1.1/8,
> >> > >> >             BroadcastAddr 10.255.255.255);
> >> > >> >
> >> > >> >   FromHost(fak0, MyIP, ETHER MyEther) -> q1::Queue
> >> > >> >   q1 -> [0]prio::PrioSched -> Print(test_tx, 100) -> ToDevice(eth0);
> >> > >> >   FromDevice(eth0) -> Print(test_rx, 100) -> SetPacketType(HOST) -> ToHost(fak0);
> >> > >> >
> >> > >> >   splsrc::InfiniteSource(
> >> > >> >   DATA \<aa bb cc dd ee ff>, LIMIT -1, STOP true);
> >> > >> >   splsrc -> ipenc::IPEncap(222, MyIP, BroadcastAddr);
> >> > >> >   ipenc -> ethenc::EtherEncap(0x0800, ff:ff:ff:ff:ff:ff, MyEther);
> >> > >> >   ethenc -> q2::Queue;
> >> > >> >   q2 -> [1]prio;
> >> > >> >
> >> > >> > -Vivek
> >> > >> >
> >> > >> >
> >> > >> >
> >> > >> > On 9/19/06, Vivek raghunathan <vivek.raghunathan at gmail.com> wrote:
> >> > >> >> Hi all.
> >> > >> >>
> >> > >> >> I am currently implementing a Click-based opportunistic packet
> >> > >> >> combination engine for use on top of IEEE 802.11. I've unit-tested
> >> > >> >> my implementation fairly extensively in user space, and partly
> >> > >> >> unit-tested it in kernel space, and haven't had any issues so far.
> >> > >> >> I recently moved to integration testing, and the code seems to run
> >> > >> >> okay in-kernel, except that every so often (maybe 6 out of 10
> >> > >> >> times), click-uninstall causes a kernel panic in interrupt context
> >> > >> >> on cleanup.
> >> > >> >>
> >> > >> >> The panic seems to be related to my code on the tx output path,
> >> > >> >> since it only appears for a few particular configurations, and only
> >> > >> >> when some of my elements are introduced. The configuration I am
> >> > >> >> using that triggers the panic is attached.  I've also manually
> >> > >> >> copied the oops trace from the screen and attached it to this
> >> > >> >> email. A register dump using sysrq does not produce any additional
> >> > >> >> useful info, so I have excluded it. It seems like the panic is
> >> > >> >> triggered somewhere in ToDevice::run_task. I realize that some
> >> > >> >> brain-dead bug in my code is probably at fault, and am currently
> >> > >> >> double-checking everything I've written. I am posting here mainly
> >> > >> >> because I am not sure if this is a ToDevice bug that I am
> >> > >> >> inadvertently triggering.
> >> > >> >>
> >> > >> >> Additionally, I'm having trouble getting ksymoops to run with
> >> > >> >> Click. Any ideas on how I go about it? (I've also tried using
> >> > >> >> kexec/kdump, but these seem to be very twitchy about what kernel
> >> > >> >> config is used, and have issues with the one I am using.)
> >> > >> >>
> >> > >> >> Vivek
> >> > >> >>
> >> > >> >>
> >> > >> >> --
> >> > >> >>
> >> > >> >> ---
> >> > >> >>
> >> > >> >> *************************************
> >> > >> >> Vivek Raghunathan,
> >> > >> >> PhD student,
> >> > >> >> University of Illinois, Urbana-Champaign
> >> > >> >>
> >> > >> >> Contact Details:
> >> > >> >> 1012 W. Clark St #31,
> >> > >> >> Urbana IL 61801
> >> > >> >>
> >> > >> >> ph: 217-766-1868 (cell)
> >> > >> >>     217-333-7541 (off)
> >> > >> >>
> >> > >> >>
> >> > >> >>
> >> > >> >
> >> > >> >
> >> > >>
> >> > >
> >> > >
> >> >
> >>
> >>
> >>
> >
> >
>



