[Click] tx timeout fix

Srivas Chennu chennu at hhi.fhg.de
Wed Sep 6 11:07:52 EDT 2006


Hello Adam,

On combining your patch for Max's 6.1.16.2.DB driver with the changes in
the Click code as suggested by Jason Park, I'm seeing relatively
acceptable performance with respect to the driver timeout problem. The
test machines I used run a click-patched 2.6.16.13 SMP kernel, and are
equipped with a combination of an onboard 82547GI Gigabit Ethernet
Controller and an add-on network card based on the 82546GB Gigabit
Ethernet Controller. For my tests, I set up 100% bidirectional
utilization of both interfaces, involving a non-trivial Click router
configuration. Though I did encounter timeout errors like the ones
below, they were relatively infrequent and were quickly recovered from.

NETDEV WATCHDOG: eth0: transmit timed out

OR

eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
TDH                  <5>
TDT                  <5>
next_to_use          <5>
next_to_clean        <1>
buffer_info[next_to_clean]
dma                  <1547f040>
time_stamp           <4a1be6>
next_to_watch        <0>
jiffies              <4a20af>
next_to_watch.status <0>


I'm not sure whether there is a relevant difference between these
errors. Perhaps someone could throw some light on that? Also, I'm
guessing your fix will get into any new polling-enabled e1000 driver
that is added to the Click distribution?
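
For anyone reading the hang report above, the time_stamp and jiffies
fields can be compared directly: 0x4a20af - 0x4a1be6 = 0x4c9, or 1225
ticks. The following is a minimal userspace sketch of how an age-based
hang test of this kind behaves. It is not driver code: ticks_after() and
tx_looks_hung() are hypothetical names, and the timeout value passed in
is an assumption, since the thread states neither the kernel's HZ nor
the driver's threshold.

```c
#include <stdbool.h>

/*
 * Wrap-safe "a is later than b" comparison on tick counters, written in
 * the spirit of the kernel's time_after() macro. This is a userspace
 * stand-in with a hypothetical name, not the kernel macro itself.
 */
static bool ticks_after(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;
}

/*
 * Hypothetical hang test: a descriptor that is still pending is treated
 * as hung once more than timeout_ticks have elapsed since time_stamp.
 * The real driver's condition and threshold may differ.
 */
static bool tx_looks_hung(unsigned long now, unsigned long time_stamp,
                          unsigned long timeout_ticks)
{
    return ticks_after(now, time_stamp + timeout_ticks);
}
```

With the values from the report and an assumed timeout of 1000 ticks,
tx_looks_hung(0x4a20af, 0x4a1be6, 1000) is true, because 1225 ticks have
elapsed. A time_stamp that was never set behaves the same way:
tx_looks_hung(0x4a20af, 0, 1000) is also true, since a stamp of 0 looks
arbitrarily old. That matches the explanation quoted further down that
an unset time_stamp was what triggered the error.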

Thanks in advance,
Srivas.

On Sep 06, 2006 02:32 PM, Beyers Cronje wrote:

>Adam,
>
>Let me know when the new driver is released and I'll have a crack at
>porting Max's polling driver.
>
>Beyers
>
>On 9/6/06, Adam Greenhalgh wrote:
>>
>>Another thing that would be useful to know is what chipset the e1000
>>that has problems is using, since it is almost a weekly occurrence that
>>a TX hang for one card or another gets reported to the netdev and
>>e1000 lists. From these lists it would seem that many of the hangs
>>have been fixed by the folks at Intel. A new release of the 7 series
>>driver is occurring soon, so perhaps it is time to upgrade the Click
>>driver.
>>
>>Adam
>>
>>On 9/6/06, Jason Park wrote:
>>>AS NOTED BEFORE, I suggested that he turn off the packet split
>>>function, not that he use skb_copy.
>>>skb_copy is what I use in my environment, and I mentioned it only for
>>>reference.
>>>
>>>Jason.
>>>-----Original Message-----
>>>From: todd lewis [mailto:tgl2 at yahoo.com]
>>>Sent: Wednesday, September 06, 2006 12:37 AM
>>>To: Jason Park; 'Srivas Chennu'
>>>Cc: click at pdos.csail.mit.edu
>>>Subject: Re: [Click] tx timeout fix
>>>
>>>As noted before, replacing skb_clone with skb_copy amounts to fixing a
>>>broken door by burning the house down.
>>>
>>>Does anyone have success under real bidirectional load without copying
>>>every packet? If so, with what configuration?
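
The clone-versus-copy trade-off debated here can be pictured with a
small userspace model. In the kernel, skb_clone() shares the packet data
between the original and the clone, while skb_copy() duplicates the data
as well as the header. The sketch below uses simplified stand-in types
(struct pkt and struct pkt_data are hypothetical, not the kernel's
struct sk_buff) to show why a copy costs a full payload copy per packet
while a clone costs only a reference-count increment.

```c
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for packet payload; NOT the kernel's sk_buff data. */
struct pkt_data {
    int refcount;             /* number of headers sharing this payload */
    size_t len;
    unsigned char bytes[64];
};

/* Simplified stand-in for a packet header pointing at its payload. */
struct pkt {
    struct pkt_data *data;
};

/* Clone: new header, shared payload, reference count incremented. */
static struct pkt pkt_clone(const struct pkt *p)
{
    struct pkt c;
    c.data = p->data;
    c.data->refcount++;
    return c;
}

/* Copy: new header plus a private duplicate of the whole payload. */
static struct pkt pkt_copy(const struct pkt *p)
{
    struct pkt c;
    c.data = malloc(sizeof(struct pkt_data));
    if (c.data != NULL) {
        memcpy(c.data, p->data, sizeof(struct pkt_data));
        c.data->refcount = 1;
    }
    return c;
}

/* Drop one reference; free the payload when the last holder lets go. */
static void pkt_put(struct pkt *p)
{
    if (p->data != NULL && --p->data->refcount == 0)
        free(p->data);
    p->data = NULL;
}
```

In this model, a write through a clone is visible through the original
because both point at the same payload, whereas a write through a copy
is not. That isolation is what copying buys, at the price of one
allocation and one memcpy per packet.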
>>>
>>>--- Jason Park wrote:
>>>
>>>>Hi Srivas
>>>>
>>>>What e1000 device are you using?
>>>>If your device supports PACKET_SPLIT and you have it turned on, you
>>>>should turn it off.
>>>>Make sure CONFIG_E1000_PACKET_SPLIT is undefined for the 6.1.16.62.DB
>>>>driver.
>>>>It works well for me with packet split disabled and skb_clone replaced
>>>>by skb_copy.
>>>>
>>>>Jason
>>>>-----Original Message-----
>>>>From: click-bounces at pdos.csail.mit.edu
>>>>[mailto:click-bounces at pdos.csail.mit.edu] On Behalf Of Srivas Chennu
>>>>Sent: Tuesday, September 05, 2006 5:37 PM
>>>>To: Click
>>>>Subject: Re: [Click] tx timeout fix
>>>>
>>>>Hello Adam and Beyers,
>>>>
>>>>I've been lately testing the timeout fixes you had posted on a
>>>>click-patched 2.6.16.13 kernel. The results therefrom still seem to
>>>>show stability problems. Though the timeouts don't occur predictably
>>>>or as often as before, I still encounter them and kernel panics
>>>>randomly, and with a higher probability when testing with high loads.
>>>>Notably, I see these problems well pronounced with bidirectional
>>>>(full-duplex) operation of the driver, both with FromDevice and
>>>>PollDevice. I've tested the 5.x driver patch as well as the
>>>>6.1.16.62.DB version patch, and seen similar results.
>>>>
>>>>Please do let me know of the details of a stable 2.6.x configuration
>>>>that you were able to set up using these patches. Further, any idea
>>>>if and when the fixes will eventually get in to the main source tree
>>>>of the e1000 driver on sourceforge?
>>>>
>>>>Thanks a bunch in advance,
>>>>Srivas.
>>>>
>>>>On Jul 26, 2006 09:06 PM, Adam Greenhalgh wrote:
>>>>
>>>>>hi
>>>>>
>>>>>beyers and i have been hacking and we have fixed the tx timeout bug.
>>>>>basically the time_stamp is not being set in the buffer and when
>>>>>Linux sends packets too, it encounters a buffer with a time stamp of 0
>>>>>and throws an error. Attached are two patches, one against cvs,
>>>>>driver-5.x-e1000_main.patch, and one against Max's 6.1.16.2.DB
>>>>>driver, driver-6.1.16.2.DB-e1000_main.patch. Neither patch has been
>>>>>very heavily tested yet, but neither does anything special.
>>>>>
>>>>>enjoy
>>>>>
>>>>>adam
>>>>
>>>>--
>>>>Visit us at
>>>>IFA Berlin, 01.-06. September 2006
>>>>and
>>>>IBC Amsterdam, NL, 08.-12.September 2006
>>>>_______________________________________________
>>>>click mailing list
>>>>click at amsterdam.lcs.mit.edu
>>>>https://amsterdam.lcs.mit.edu/mailman/listinfo/click


