[Click] click on 2.6 kernel stability
Paine, Thomas Asa
PAINETA at uwec.edu
Wed Mar 22 10:31:13 EST 2006
Beyers,
I'm using the latest source from CVS (checked out within the last 2 months). I
am using the e1000-5.x driver with polling. All the hardware I've used thus far
with Click has been Dell. I do have an appliance coming this week.
If that works I will post my results to the list as well.
What I was seeing with the other cards I mentioned was that if I
initially slammed a card with, say, 50Kpps right out of the gate, the NIC would
freak out and basically stop servicing packets. If I ramped up the
packet rate over a few seconds, it tended to work (I just didn't trust it, and
that's the worst feeling in production).
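To compare the burst case against the ramp-up case in a repeatable way, a linear ramp schedule can be sketched in Python as below. The helper name `ramp_schedule`, its parameters, and the step granularity are assumptions made for illustration; it only computes target rates and is not part of Click or the e1000 driver. Sending the packets is left to whatever traffic generator is in use.

```python
def ramp_schedule(target_pps, ramp_seconds, steps_per_second=10):
    """Return (time_offset_s, rate_pps) pairs rising linearly to target_pps.

    Hypothetical helper: it computes the schedule only and sends no traffic.
    """
    if target_pps <= 0 or ramp_seconds <= 0 or steps_per_second <= 0:
        raise ValueError("all arguments must be positive")
    # Always produce at least one step, even for very short ramps.
    total_steps = max(1, int(ramp_seconds * steps_per_second))
    schedule = []
    for step in range(1, total_steps + 1):
        offset = (step - 1) / steps_per_second  # when this rate takes effect
        rate = target_pps * step // total_steps  # integer packets per second
        schedule.append((offset, rate))
    return schedule


if __name__ == "__main__":
    # Ramp to 50Kpps over 3 seconds, changing the rate twice per second.
    for offset, rate in ramp_schedule(50_000, ramp_seconds=3, steps_per_second=2):
        print(f"t={offset:.1f}s rate={rate} pps")
```

Running the same test once with this schedule and once starting directly at the target rate would show whether the failure depends on how abruptly the load arrives.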
Are you only seeing the problem at certain packet rates or data
rates, or when the card isn't getting enough CPU time (is there loss),
or anything like that? I guess what I'm asking is, can it be a
controlled failure? What kind of switching hardware is in place here?
Switchport settings perhaps? Flow control, duplex, etc... Just food
for thought.
Thanks,
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Thomas Paine (paineta at uwec.edu)
University of Wisconsin - Eau Claire
garbage foo(garbage g){return(g);}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
________________________________
From: Beyers Cronje [mailto:bcronje at gmail.com]
Sent: Wednesday, March 22, 2006 8:10 AM
To: Paine, Thomas Asa
Cc: Click
Subject: Re: [Click] click on 2.6 kernel stability
Hi Thomas,
Thanks for the reply. I am running a pure Intel 82545GM card connected
to a 100Mb switch. I used this same card on my old MB running 2.4.26 in
polling mode with the same e1000-5 click driver with no problems.
Unfortunately I had to replace my MB and the new SIS661 chipset is not
supported on the 2.4 kernel.
What version or date of Click source are you using? Are you running the
E1000-5x driver?
I've come across one possible bug in the e1000-5x driver: in the event
of a TX timeout, the driver's TX timeout routine is called and
re-enables interrupts, even though Click polling is still
enabled/active. But I'm struggling to find out why the TX timeout
happens in the first place.
Using FromDevice works well though, so I'm looking into the polling side
of things for now.
Thanks
Beyers
On 3/22/06, Paine, Thomas Asa <PAINETA at uwec.edu> wrote:
Beyers,
I'm running production boxes on 2.6.13.2, patched, with no problem
(I have run over 500Kpps through them). I can tell you I've seen this
kind of problem when I attempt to use a "so-called" e1000 card.
Whenever I attempted to use a non-Intel (OEM) branded Intel 1000, that
kind of behavior was almost guaranteed at even moderate packet rates. I
have had NO issues like that when running true Intel cards; specifically,
I have used 82543 and 82546 chip-based cards.
One thing I have not done, however, is link at less than 1Gb
with these cards, and I see you were connected at 100Mb. I'm not sure
if that could introduce any issues. I would suspect not though.
Thanks,
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Thomas Paine (paineta at uwec.edu)
University of Wisconsin - Eau Claire
garbage foo(garbage g){return(g);}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-----Original Message-----
From: click-bounces at pdos.csail.mit.edu
[mailto:click-bounces at pdos.csail.mit.edu] On Behalf Of Beyers Cronje
Sent: Tuesday, March 21, 2006 7:46 PM
To: Click
Subject: [Click] click on 2.6 kernel stability
Hi everyone,
Is anyone running a stable Click kernel implementation on a 2.6 kernel?
Using current CVS code with the e1000-5.x polling driver I managed to
compile and run on 2.6.13.2 but the system is very unstable. I'm running
a basic config for testing:
PollDevice(eth0) -> ToHost;
Idle -> ToDevice(eth0);
Input and output seem to hang every now and again, with the odd complete
system hang. The only error messages I get are loads of the following:
NETDEV WATCHDOG: eth0: transmit timed out
e1000: eth0: e1000_watchdog_1: NIC Link is Up 100 Mbps Full Duplex
NETDEV WATCHDOG: eth0: transmit timed out
e1000: eth0: e1000_watchdog_1: NIC Link is Up 100 Mbps Full Duplex
This only occurs when the Click module is installed; when I unload the
Click module everything works fine. ethtool indicates the link is always up.
The watchdog never actually reports that the link went down, so could
this indicate an IRQ conflict or race condition of some sort? Any ideas
on where to begin troubleshooting this?
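As a starting point for correlating the hangs with load, the watchdog messages quoted above can be counted per interface. The following is a minimal Python sketch; the function name `count_tx_timeouts` is hypothetical, and it assumes the kernel log text has already been captured to a string (for example from dmesg output) in the same format as the lines above.

```python
import re
from collections import Counter

# Matches lines such as "NETDEV WATCHDOG: eth0: transmit timed out".
TIMEOUT_RE = re.compile(r"NETDEV WATCHDOG: ([^\s:]+): transmit timed out")


def count_tx_timeouts(log_text):
    """Count watchdog transmit timeouts per interface in kernel log text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = TIMEOUT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Sample input copied from the messages reported in this thread.
    sample = (
        "NETDEV WATCHDOG: eth0: transmit timed out\n"
        "e1000: eth0: e1000_watchdog_1: NIC Link is Up 100 Mbps Full Duplex\n"
        "NETDEV WATCHDOG: eth0: transmit timed out\n"
        "e1000: eth0: e1000_watchdog_1: NIC Link is Up 100 Mbps Full Duplex\n"
    )
    print(count_tx_timeouts(sample))
```

Comparing the counts across runs at different packet rates, or with FromDevice instead of PollDevice, would help narrow down whether the timeouts are tied to polling.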
Thanks
Beyers