[Click] Causing Kernel Panics Via Click

dmoore7@nd.edu dmoore7 at nd.edu
Fri Sep 14 09:35:57 EDT 2007


Thanks Eddie,
I will do that.  I assume it will be necessary to repatch and recompile the
kernel?
Should I re-download the vanilla sources from kernel.org and patch from those,
or is it possible to patch the already-patched ones I have now?
Thanks,
 - David

Quoting Eddie Kohler <kohler at cs.ucla.edu>:

> It would certainly be helpful if you could upgrade filius to current CVS.
> Much spinlock stuff has happened there recently.
>
> E
>
>
> dmoore7 at nd.edu wrote:
> > Hello Everyone,
> > I have encountered a slightly unreliable method of bringing down one of my
> > click machines.  I shall present an example below of how it happens.
> >
> >> Log in to a freshly booted machine
> >> Successfully load a locally stored click file
> >> Ping another machine via the click router
> >> Use w3c's webbot to successfully download a webpage from another host
> >> Do so again, again with proper results
> >> Execute click-uninstall
> >> Reload the same kernel config I had loaded earlier
> >
> >> Meanwhile, in another ssh window to the webserver, I have tcpdump running
> > and have been watching the traffic coming in and out.
> >
> >> I execute the same webbot command as previously, with the following output
> > (copy-pasted from the uninstall onward)
> >
> > [root@filius ~]# click-uninstall
> > [root@filius ~]# click-install ./Filius-1-50.click
> > [root@filius ~]# webbot -I spoof_eth2_0 -q -n -saveimg http://192.168.10.4/index3.html
> > Message from syslogd@filius at Thu Sep 13 12:59:52 2007 ...
> > filius kernel: BUG: spinlock wrong CPU on CPU#2, click-uninstall/4890
> >
> > Message from syslogd@filius at Thu Sep 13 12:59:52 2007 ...
> > filius kernel:  lock: f8c79c44, .magic: dead4ead, .owner: click-uninstall/4890, .owner_cpu: 3
> >
> > <after about 5 minutes>
> > Read from remote host filius.cse.nd.edu: Connection timed out
> > Connection to filius.cse.nd.edu closed.
> >
> >> During this I was watching tcpdump on the http server; traffic proceeds
> > normally until the server suddenly stops receiving ACKs.
> >> Thus it appears the click module began malfunctioning mid-use, and not
> > simply upon loading.
> >
> > Background information:
> >  - There are 3 machines with click loaded on them; the config files I used
> > can be found here:
> > Filius (the one that crashed): http://cse.nd.edu/~dmoore7/Filius-1-50.click
> > Sybill (the forwarder): http://cse.nd.edu/~dmoore7/Sybill-1-50.click
> > Hagrid (the http host): http://cse.nd.edu/~dmoore7/Hagrid-1-50.click
> >  - For a general idea of what these router configs do, see this diagram:
> > http://cse.nd.edu/~dmoore7/myrouter.jpg
> >  - The webbot I used is a slightly hacked version of w3c's webbot (modified
> > to allow forcing of a particular device).  It exhibits no failures without
> > click.
> >  - Filius is running a version of click downloaded about 3 weeks ago from
> > CVS.  The others are running 1.5.0, downloaded from the click website in
> > the middle of this past summer.  I can update filius if that may be helpful.
> >  - The systems in question have 4 processors, I believe technically 2 Core
> > Duos each.
> >  - Kernel is a vanilla 2.6.16.13 from kernel.org w/ click patch applied
> >  - This appears to be related to an earlier problem I reported here, which
> > at first appeared to be resolved:
> > https://pdos.csail.mit.edu/pipermail/click/2007-August/006206.html
> >
> > Any input is appreciated.  I will try to get crash dumps and such when I
> > regain access to the machine; as it is not local, I must wait for someone
> > else to reboot it for me.
> > Thanks,
> >  - David
>
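
For reference, the two kernel lines quoted in the report can be cross-checked
mechanically.  The following is only a rough sketch in Python: the names
summarize, BUG_RE and LOCK_RE are made up for illustration, and the regular
expressions assume the message layout is exactly as it appears in the quoted
log (with the wrapped line rejoined).  It pulls out the CPU that reported the
problem and the CPU recorded as the lock's owner so they can be compared.

```python
import re

# The two kernel lines from the quoted report, with the mail-wrapped
# second line rejoined.  Nothing here is new data.
LOG = """\
filius kernel: BUG: spinlock wrong CPU on CPU#2, click-uninstall/4890
filius kernel:  lock: f8c79c44, .magic: dead4ead, .owner: click-uninstall/4890, .owner_cpu: 3
"""

# Hypothetical patterns; they assume the layout seen in the log above.
BUG_RE = re.compile(
    r"BUG: spinlock (?P<reason>.+?) on CPU#(?P<cpu>\d+), "
    r"(?P<task>\S+)/(?P<pid>\d+)"
)
LOCK_RE = re.compile(
    r"lock: (?P<addr>[0-9a-f]+), \.magic: (?P<magic>[0-9a-f]+), "
    r"\.owner: (?P<owner>\S+)/(?P<owner_pid>\d+), "
    r"\.owner_cpu: (?P<owner_cpu>-?\d+)"
)


def summarize(log):
    """Return the fields of interest, or None if either line is missing."""
    bug = BUG_RE.search(log)
    lock = LOCK_RE.search(log)
    if bug is None or lock is None:
        return None
    return {
        "reason": bug.group("reason"),
        "reporting_cpu": int(bug.group("cpu")),
        "owner_cpu": int(lock.group("owner_cpu")),
        "magic": lock.group("magic"),
        # True when the task that hit the check is also the recorded owner.
        "same_task": (bug.group("task"), bug.group("pid"))
        == (lock.group("owner"), lock.group("owner_pid")),
    }


summary = summarize(LOG)
print(summary)
```

Read this way, the log says that the task reporting the problem
(click-uninstall/4890) is also the recorded owner of the lock, but it is
running on CPU 2 while the lock records CPU 3 as the owner.  That might be
consistent with the task having changed CPUs while holding the lock, though
the log alone does not establish the cause.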






