[Click] Causing Kernel Panics Via Click

dmoore7 at nd.edu
Fri Sep 14 12:03:25 EDT 2007


There were no other lines. Copy-pasted verbatim from the session:

[root at filius ~]# click-install ./Filius-1-50.click
Segmentation fault
[root at filius ~]#
Message from syslogd at filius at Fri Sep 14 10:39:03 2007 ...
filius kernel: Oops: 0002 [#1]

Message from syslogd at filius at Fri Sep 14 10:39:03 2007 ...
filius kernel: SMP

Message from syslogd at filius at Fri Sep 14 10:39:03 2007 ...
filius kernel: CPU:    3
...

I will configure Click with the '--enable-kassert' option after I reboot and
report back with my results.  Recompiling the kernel will take a little time.
Thanks for all your help and input, folks.
 - David Moore
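
For readers following along, the "Oops: 0002" line in the session above can be read as a bit field. The sketch below is a minimal illustration, assuming the oops was raised by the i386 page-fault handler (the thread does not confirm this) and decoding only the three low bits; the function names are hypothetical and not part of Click or the kernel. It does not recover the missing line asked about in the quoted messages.

```python
import re

# Matches the four hex digits printed after "Oops:" in a syslog line.
OOPS_RE = re.compile(r"Oops:\s*([0-9a-fA-F]{4})")


def decode_page_fault_code(code):
    """Decode the low three bits of an x86 page-fault error code.

    Assumption: the oops came from a page fault. Other trap types
    use the error code differently, so treat the result as a hint.
    """
    return {
        "cause": "protection violation" if code & 0x1 else "page not present",
        "access": "write" if code & 0x2 else "read",
        "mode": "user" if code & 0x4 else "kernel",
    }


def decode_oops_line(line):
    """Return the decoded fields for an 'Oops: NNNN' line, or None."""
    match = OOPS_RE.search(line)
    if match is None:
        return None
    return decode_page_fault_code(int(match.group(1), 16))


if __name__ == "__main__":
    print(decode_oops_line("filius kernel: Oops: 0002 [#1]"))
```

Under that assumption, 0002 reads as a kernel-mode write to a page that is not present. That reading is consistent with the register dump quoted below, where ecx is 00000000 and the marked instruction bytes (<f0> ff 41 4c) appear to be a locked increment at offset 0x4c from ecx.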

Quoting Joonwoo Park <joonwpark81 at gmail.com>:

> Resending with correct CC.
>
> 2007/9/15, Joonwoo Park <joonwpark81 at gmail.com>:
> > Yes.
> > The most important line is missing.
> > David, did you configure Click with '--enable-kassert'?
> > The disassembler shows me the segfault occurred around 'lock_tasks()' in
> > Task::strong_reschedule().
> > If you didn't use '--enable-kassert', configuring with it may help
> > solve the problem.
> >
> > Joonwoo Park.
> >
> > 2007/9/15, Eddie Kohler <kohler at cs.ucla.edu>:
> > > There was probably a line just before the "Oops:" line... do you know what it was?
> > >
> > >
> > > dmoore7 at nd.edu wrote:
> > > > I grabbed the latest CVS and simply recompiled/installed it without any kernel
> > > > modifications, but click crashed (without panicking the kernel).  I think this
> > > > means I need to recompile the kernel, although I am unsure if I need to
> > > > download vanilla sources and patch from there, or do it some other way.
> > > >
> > > > I'll probably start fresh from a vanilla kernel after lunch, unless someone
> > > > offers advice/admonition otherwise.
> > > > Thanks
> > > >  - David Moore
> > > >
> > > > The crash:
> > > > [root at filius ~]# click-install ./Filius-1-50.click
> > > > Segmentation fault
> > > > Message from syslogd at filius at Fri Sep 14 10:39:03 2007 ... <This line preceded all further lines, edited out for clarity>
> > > > filius kernel: Oops: 0002 [#1]
> > > > filius kernel: SMP
> > > > filius kernel: CPU:    3
> > > > filius kernel: EIP is at _ZN4Task17strong_rescheduleEv+0x12/0x1a0 [click]
> > > > filius kernel: eax: f5cced14   ebx: 00000001   ecx: 00000000   edx: f7d36000
> > > > filius kernel: esi: f5cced14   edi: f6333000   ebp: 00000000   esp: f6333cb8
> > > > filius kernel: ds: 007b   es: 007b   ss: 0068
> > > > filius kernel: Process click-install (pid: 3957, threadinfo=f6333000 task=f7d36000)
> > > > filius kernel: Stack: <0>f8affbaa 00000001 f5ccc800 f6333cd4 f8b005c7 f6333cd4 00ffffff f5c21a20
> > > > filius kernel:        00000001 00000004 f5ccc800 f8c79ac8 f5ccc800 00000001 c02e2946 f5ccc800
> > > > filius kernel:        00000000 00001002 c028a70d f5ccc800 00001043 c028b96a f6333d2c 00000001
> > > > filius kernel: Call Trace:
> > > > filius kernel:  [<f8affbaa>] _ZN8ToDevice13change_deviceEP10net_device+0x2a/0x60 [click]
> > > > filius kernel:  [<f8b005c7>] device_notifier_hook+0x77/0x90 [click]
> > > > filius kernel:  [<c02e2946>] notifier_call_chain+0x17/0x2e
> > > > filius kernel:  [<c028a70d>] dev_open+0x66/0x6d
> > > > filius kernel:  [<c028b96a>] dev_change_flags+0x48/0xed
> > > > filius kernel:  [<c028c19f>] dev_ioctl+0x309/0x3e2
> > > > filius kernel:  [<f8afc8a2>] _Z10dev_updownP10net_deviceiP12ErrorHandler+0x72/0x110 [click]
> > > > filius kernel:  [<f8afc78c>] _ZN8FromHost20set_device_addressesEP12ErrorHandler+0xac/0x150 [click]
> > > > filius kernel:  [<f8afca11>] _ZN8FromHost10initializeEP12ErrorHandler+0xd1/0xf0 [click]
> > > > filius kernel:  [<f8aaa5e6>] _Znaj+0x16/0x20 [click]
> > > > filius kernel:  [<f8aca942>] _ZN6Router10initializeEP12ErrorHandler+0x682/0x780 [click]
> > > > filius kernel:  [<c013ec21>] __alloc_pages+0x59/0x273
> > > > filius kernel:  [<f8b20111>] _Z12write_configRK6StringP7ElementPvP12ErrorHandler+0x121/0x1e0 [click]
> > > > filius kernel:  [<f8ac456f>] _ZNK7Handler10call_writeERK6StringP7ElementbP12ErrorHandler+0x13f/0x210 [click]
> > > > filius kernel:  [<c01533b6>] cache_grow+0x128/0x14a
> > > > filius kernel:  [<f8b23789>] handler_flush+0x499/0x590 [click]
> > > > filius kernel:  [<f8aaa5e6>] _Znaj+0x16/0x20 [click]
> > > > filius kernel:  [<c0155724>] filp_close+0x31/0x52
> > > > filius kernel:  [<c01031ab>] sysenter_past_esp+0x54/0x75
> > > > filius kernel: Code: ff eb d7 c7 04 24 b5 26 b3 f8 e8 0a f4 ff ff e9 60 ff ff ff 90 8d 74 26 00 57 bf 00 f0 ff ff 56 89 c6 53 83 ec 04 21 e7 8b 4e 1c <f0> ff 41 4c 8d 59 44 8b 47 10 ba 01 00 00 00 39 43 04 74 2a 8d
> > > >
> > > > Quoting Joonwoo Park <joonwpark81 at gmail.com>:
> > > >
> > > >> Hi David,
> > > >>
> > > >> If you are using Linux 2.6.16.13, just updating your Click source to
> > > >> the latest revision from CVS or git may solve the problem (without
> > > >> updating and recompiling the kernel).
> > > >>
> > > >> Joonwoo Park
> > > >>
> > > >> 2007/9/14, dmoore7 at nd.edu <dmoore7 at nd.edu>:
> > > >>> Thanks Eddie,
> > > >>> I will do that.  I assume it will be necessary to repatch and recompile the
> > > >>> kernel?
> > > >>> Should I re-download the vanilla sources from kernel.org and patch from those,
> > > >>> or is it possible to patch the already patched ones I have now?
> > > >>> Thanks,
> > > >>>  - David
> > > >>>
> > > >>> Quoting Eddie Kohler <kohler at cs.ucla.edu>:
> > > >>>
> > > >>>> It would certainly be helpful if you could upgrade filius to current CVS.
> > > >>>> Much spinlock stuff has happened there recently.
> > > >>>>
> > > >>>> E
> > > >>>>
> > > >>>>
> > > >>>> dmoore7 at nd.edu wrote:
> > > >>>>> Hello Everyone,
> > > >>>>> I have encountered a slightly unreliable method of bringing down one of my
> > > >>>>> click machines.  I shall present an example below of how it happens.
> > > >>>>>
> > > >>>>>> Log in to a freshly booted machine
> > > >>>>>> Successfully load locally stored click file
> > > >>>>>> Ping another machine via the click router
> > > >>>>>> Use w3c's webbot to successfully download a webpage from another host
> > > >>>>>> Do so again, again with proper results
> > > >>>>>> Execute click-uninstall
> > > >>>>>> Reload the same kernel config I had loaded earlier
> > > >>>>>> Meanwhile, in another ssh window to the webserver I have tcpdump running, and
> > > >>>>> have been watching the traffic coming in and out.
> > > >>>>>
> > > >>>>>> I execute the same webbot command as previously, with the following output
> > > >>>>> (copy-pasted from the uninstall onward)
> > > >>>>>
> > > >>>>> [root at filius ~]# click-uninstall
> > > >>>>> [root at filius ~]# click-install ./Filius-1-50.click
> > > >>>>> [root at filius ~]# webbot -I spoof_eth2_0 -q -n -saveimg http://192.168.10.4/index3.html
> > > >>>>> Message from syslogd at filius at Thu Sep 13 12:59:52 2007 ...
> > > >>>>> filius kernel: BUG: spinlock wrong CPU on CPU#2, click-uninstall/4890
> > > >>>>>
> > > >>>>> Message from syslogd at filius at Thu Sep 13 12:59:52 2007 ...
> > > >>>>> filius kernel:  lock: f8c79c44, .magic: dead4ead, .owner: click-uninstall/4890, .owner_cpu: 3
> > > >>>>>
> > > >>>>> <after about 5 minutes>
> > > >>>>> Read from remote host filius.cse.nd.edu: Connection timed out
> > > >>>>> Connection to filius.cse.nd.edu closed.
> > > >>>>>
> > > >>>>>> During this I have been watching tcpdump on the http server, and it is
> > > >>>>> proceeding until suddenly it stops receiving ACKs.
> > > >>>>>> Thus it appears the click module began malfunctioning mid-use, and not simply
> > > >>>>> upon its loading.
> > > >>>>>
> > > >>>>> Background information:
> > > >>>>>  - There are 3 machines with click loaded on them, the config files I used can
> > > >>>>> be found here:
> > > >>>>> Filius (the one that crashed): http://cse.nd.edu/~dmoore7/Filius-1-50.click
> > > >>>>> Sybill (the forwarder): http://cse.nd.edu/~dmoore7/Sybill-1-50.click
> > > >>>>> Hagrid (the http host): http://cse.nd.edu/~dmoore7/Hagrid-1-50.click
> > > >>>>>  - For a general idea of what these router configs do, see this diagram:
> > > >>>>> http://cse.nd.edu/~dmoore7/myrouter.jpg
> > > >>>>>  - The webbot I used is a slightly hacked version of w3c's webbot (modified to
> > > >>>>> allow forcing of a particular device).  It exhibits no failures without click.
> > > >>>>>  - Filius is running a version of click downloaded about 3 weeks ago from the
> > > >>>>> cvs.  The others are running 1.5.0 downloaded from the click website during the
> > > >>>>> middle of this past summer.  I can update filius if that may be helpful.
> > > >>>>>  - The systems in question have 4 processors, I believe technically 2 core duo's
> > > >>>>> each.
> > > >>>>>  - Kernel is a vanilla 2.6.16.13 from kernel.org w/ click patch applied
> > > >>>>>  - This appears to be related to an earlier problem I reported here, which at
> > > >>>>> first appeared to be resolved:
> > > >>>>> https://pdos.csail.mit.edu/pipermail/click/2007-August/006206.html
> > > >>>>>
> > > >>>>> Any input is appreciated, I will try to get crash dumps and such when I regain
> > > >>>>> access to the machine, as it is not local I must wait for someone else to
> > > >>>>> reboot it for me.
> > > >>>>> Thanks,
> > > >>>>>  - David
> > > >>>>>
> > > >>>>>
> > > >>>>> _______________________________________________
> > > >>>>> click mailing list
> > > >>>>> click at amsterdam.lcs.mit.edu
> > > >>>>> https://amsterdam.lcs.mit.edu/mailman/listinfo/click
>
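
The call trace quoted above prints Click's C++ functions under their mangled names (for example `_ZN4Task17strong_rescheduleEv`). The standard tool for reading these is `c++filt` from binutils. As a rough illustration only, the sketch below pulls the symbol, offset, and function size out of a trace line and recovers the qualified name for simple cases. It is not a full demangler: it ignores argument types and leaves operators such as `_Znaj` unchanged. The function names `simple_demangle` and `parse_trace_line` are hypothetical.

```python
import re

# symbol+0xOFFSET/0xSIZE, as printed in the EIP and Call Trace lines.
TRACE_RE = re.compile(
    r"(?P<symbol>[A-Za-z_][A-Za-z0-9_]*)"
    r"\+0x(?P<offset>[0-9a-fA-F]+)/0x(?P<size>[0-9a-fA-F]+)"
)
LENGTH_RE = re.compile(r"\d+")


def simple_demangle(symbol):
    """Recover 'Class::method' from a simple Itanium-ABI mangled name.

    Handles only length-prefixed identifiers (optionally nested with
    'N...E' and cv-qualifiers). Anything else is returned unchanged.
    """
    if not symbol.startswith("_Z"):
        return symbol  # plain C symbol, e.g. dev_open
    pos = 2
    nested = symbol.startswith("N", pos)
    if nested:
        pos += 1
        # Skip cv-qualifiers on the member function (r, V, K).
        while pos < len(symbol) and symbol[pos] in "rVK":
            pos += 1
    parts = []
    while True:
        match = LENGTH_RE.match(symbol, pos)
        if match is None:
            break
        length = int(match.group())
        start = match.end()
        parts.append(symbol[start:start + length])
        pos = start + length
        if not nested:
            break  # an unnested name has exactly one component
    return "::".join(parts) if parts else symbol


def parse_trace_line(line):
    """Return (name, offset, size) for a trace line, or None."""
    match = TRACE_RE.search(line)
    if match is None:
        return None
    return (
        simple_demangle(match.group("symbol")),
        int(match.group("offset"), 16),
        int(match.group("size"), 16),
    )


if __name__ == "__main__":
    line = "filius kernel: EIP is at _ZN4Task17strong_rescheduleEv+0x12/0x1a0"
    print(parse_trace_line(line))
```

Applied to the EIP line, this gives `Task::strong_reschedule` at offset 0x12 of a 0x1a0-byte function, which matches the function named earlier in the thread.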