[Click] click on 2.6 kernel stability
Paine, Thomas Asa
PAINETA at uwec.edu
Wed Mar 22 10:59:42 EST 2006
The only other thing I can think of that might come into play is the
descriptors or other parameters. If this is worth anything, here are the
parameters I use to load the NIC's kernel module, etc. (if you only
have one NIC, you would use just # rather than #,#):
/sbin/modprobe e1000 FlowControl=0,0 RxIntDelay=256,256 \
    TxIntDelay=256,256 RxDescriptors=256,256 TxDescriptors=256,256
/sbin/ifconfig eth1 up promisc txqueuelen 1000
/sbin/ifconfig eth2 up promisc txqueuelen 1000
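
As a rough illustration of the "# versus #,#" convention, here is a
small shell sketch that builds the modprobe line for a given number of
NICs. The helper names repeat_csv and e1000_modprobe_cmd are
hypothetical, and only the parameter values from this message are used.
It prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: repeat a value once per NIC, comma-separated.
# Example: repeat_csv 256 2 prints 256,256
repeat_csv() {
    value=$1
    count=$2
    out=$value
    i=1
    while [ "$i" -lt "$count" ]; do
        out="$out,$value"
        i=$((i + 1))
    done
    printf '%s\n' "$out"
}

# Hypothetical helper: print (do not execute) the e1000 modprobe line
# for the given number of NICs, using the values quoted in this message.
e1000_modprobe_cmd() {
    n=$1
    printf '/sbin/modprobe e1000 FlowControl=%s RxIntDelay=%s TxIntDelay=%s RxDescriptors=%s TxDescriptors=%s\n' \
        "$(repeat_csv 0 "$n")" \
        "$(repeat_csv 256 "$n")" \
        "$(repeat_csv 256 "$n")" \
        "$(repeat_csv 256 "$n")" \
        "$(repeat_csv 256 "$n")"
}

# Two NICs, as in the setup described above.
e1000_modprobe_cmd 2
```

Calling e1000_modprobe_cmd 1 instead should give the single-value form
for a one-NIC box.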
Thanks,
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Thomas Paine (paineta at uwec.edu)
University of Wisconsin - Eau Claire
garbage foo(garbage g){return(g);}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
________________________________
From: Beyers Cronje [mailto:bcronje at gmail.com]
Sent: Wednesday, March 22, 2006 9:43 AM
To: Paine, Thomas Asa
Cc: Click
Subject: Re: [Click] click on 2.6 kernel stability
Hi Thomas,
It's very easy to duplicate. It happens at very low packet rates, less
than 100 pps. Basically, on the Click box I have one console session
pinging the gateway, and a couple of Firefox HTTP sessions. The first
thing I notice is that I get "ping sendmsg: No buffer space available",
which to me indicates that packets are not being sent from the socket
transmit buffer. Confirming this, soon afterwards I receive the
"transmit timed out" message from the kernel on eth0.
I don't think the problem is related to the switch or link, because
when I use FromDevice instead of PollDevice all works 100%.
Will keep you posted if I pick up anything else.
Beyers
On 3/22/06, Paine, Thomas Asa <PAINETA at uwec.edu> wrote:
Beyers,
I'm using the latest source from the CVS (within 2 months). I am using
the 5x driver and polling. All the hardware I've used thus far with
click has been on Dell. I do have an appliance coming this week. If
that works I will post my results to the list as well.
What I was seeing with the other cards I mentioned was that if I
initially slammed a card with, say 50Kpps, out of the gate the nic
would freak out and basically stop servicing packets. If I ramped up
the packet rate over a few seconds, it tended to work (just not
trusted, and that's the worst feeling in production).
Are you only seeing the problem at certain packet rates or data rates,
or when the card isn't getting enough CPU time (is there loss), or
anything like that? I guess what I'm asking is, can it be a controlled
failure? What kind of switching hardware is in place here? Switchport
settings perhaps? Flow control, duplex, etc... Just food for thought.
Thanks,
________________________________
From: Beyers Cronje [mailto:bcronje at gmail.com]
Sent: Wednesday, March 22, 2006 8:10 AM
To: Paine, Thomas Asa
Cc: Click
Subject: Re: [Click] click on 2.6 kernel stability
Hi Thomas,
Thanks for the reply. I am running a pure Intel 82545GM card connected
to a 100Mb switch. I used this same card on my old MB running 2.4.26 in
polling mode with the same e1000-5 Click driver with no problems.
Unfortunately I had to replace my MB and the new SIS661 chipset is not
supported on the 2.4 kernel.
What version or date of Click source are you using? Are you running the
e1000-5x driver?
I've come across one possible bug in the e1000-5x driver: in the event
of a TX timeout, the driver's tx timeout routine is called, where
interrupts are enabled again even though Click polling is still
enabled/active. But I'm struggling to find out why the tx timeout
happens in the first place.
Using FromDevice works well though, so I'm looking into the polling
side of things for now.
Thanks
Beyers
On 3/22/06, Paine, Thomas Asa <PAINETA at uwec.edu> wrote:
Beyers,
I'm running production boxes on 2.6.13.2, patched, with no problem (I
have run over 500Kpps through them). I can tell you I've seen this kind
of problem when I attempt to use a "so called" e1000 card. Whenever I
attempted to use a non-Intel (OEM) branded Intel 1000, that kind of
behavior was almost guaranteed at even moderate packet rates. I have
had NO issues like that when running true Intel cards; specifically, I
have used 82543 and 82546 chip based cards.
One thing I have not done, however, is link at less than 1Gb with these
cards, and I see you were connected at 100Mb. I'm not sure if that
could introduce any issues. I would suspect not though.
Thanks,
-----Original Message-----
From: click-bounces at pdos.csail.mit.edu
[mailto:click-bounces at pdos.csail.mit.edu] On Behalf Of Beyers Cronje
Sent: Tuesday, March 21, 2006 7:46 PM
To: Click
Subject: [Click] click on 2.6 kernel stability
Hi everyone,
Is anyone running a stable click kernel implementation on a 2.6 kernel?
Using current cvs code with e1000-5.x polling driver I managed to
compile and run on 2.6.13.2 but the system is very unstable. I'm
running a basic config for testing:
PollDevice(eth0) -> ToHost;
Idle -> ToDevice(eth0);
Input and output seem to hang every now and again, with the odd
complete system hang. The only error messages I get are loads of the
following:
NETDEV WATCHDOG: eth0: transmit timed out
e1000: eth0: e1000_watchdog_1: NIC Link is Up 100 Mbps Full Duplex
NETDEV WATCHDOG: eth0: transmit timed out
e1000: eth0: e1000_watchdog_1: NIC Link is Up 100 Mbps Full Duplex
This only occurs when the click module is installed; when I unload the
click module everything works fine. ethtool indicates the link is
always up. Watchdog never actually reports that the link went down, so
could this indicate an irq conflict or race condition of some sort?
Any ideas on where to begin troubleshooting this?
Thanks
Beyers