[Click] tx timeout fix

Eddie Kohler kohler at cs.ucla.edu
Fri Sep 22 20:57:27 EDT 2006


Yay!

Thanks for the note; the driver is updated.

Eddie


Srivas Chennu wrote:
> Hello all,
> 
> I can confirm that with the latest click CVS sources I am able to set up
> a click configuration with 100% utilized bidirectional links. Using
> PollDevice with the e1000-6.x driver for the 82547GI and 82546GB
> controllers, I don't see the timeout errors anymore, and my router
> appears to be quite stable.
> 
> One point though: The e1000_main.c file in the new e1000-6.x driver
> doesn't seem to contain the Click kernel patch for the file. I had to
> manually change 'netif_receive_skb(skb)' to 'netif_receive_skb(skb,
> skb->protocol, 0)' near lines 3662 and 3802, in order to get the driver
> to compile. Perhaps this change could be made in the checked in version
> of the file?
> 
> Many thanks and regards,
> Srivas.
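
A minimal user-space sketch of the call-site change described above. It assumes, based only on the call quoted in this message, that the Click-patched kernel declares netif_receive_skb() with three arguments. The parameter names `protocol` and `from_polling`, the mock `struct sk_buff`, and the helper `driver_rx()` are hypothetical stand-ins for illustration, not kernel or driver code.

```c
#include <assert.h>

/* Mock packet buffer: only the one field this sketch needs. */
struct sk_buff {
    unsigned short protocol;
};

/* Record what the mock receive function was called with. */
unsigned short last_protocol = 0;
int last_from_polling = -1;

/* Stand-in for the three-argument form (argument names are assumed). */
int netif_receive_skb(struct sk_buff *skb, unsigned short protocol,
                      int from_polling)
{
    (void)skb;
    last_protocol = protocol;
    last_from_polling = from_polling;
    return 0;
}

/* Hypothetical driver receive path after the manual edit. */
int driver_rx(struct sk_buff *skb)
{
    /* The one-argument call netif_receive_skb(skb) would be rejected by
     * the compiler against the three-argument definition above. */
    return netif_receive_skb(skb, skb->protocol, 0);
}
```

With the one-argument call left in place, the compiler reports too few arguments, which matches the build failure described above.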
> 
> On Sep 19, 2006 04:33 AM, Jason Park wrote:
> 
>> Dear Eddie.
>>
>> It's my pleasure.
>> Yes, skb_copy might not be needed anymore.
>> After patching, I have been testing it for some days with skb_clone and
>> it's working nicely.
>> Before patching, the test did not work unless I replaced skb_clone with
>> skb_copy.
>>
>> Jason.
>>
>> -----Original Message-----
>> From: Eddie Kohler [mailto:kohler at cs.ucla.edu]
>> Sent: Tuesday, September 19, 2006 11:04 AM
>> To: Jason Park
>> Cc: 'Adam Greenhalgh'; click at amsterdam.lcs.mit.edu
>> Subject: Re: [Click] tx timeout fix
>>
>> Jason,
>>
>> Thanks very much for this patch! This looks good. I've applied a version
>> of it to our code (and produced a new Linux patch to include your patch to
>> skbuff.c). Adam/Beyers, does this take care of the remaining timeouts?
>> Jason, does this patch mean that you no longer need to use skb_copy()
>> instead of skb_clone()?
>>
>> Eddie
>>
>>
>> Jason Park wrote:
>>> Um.
>>> Sorry, I missed something.
>>> Here I am re-attaching the patch files.
>>> packet.cc.patch was modified trivially, and skb_recycle in skbuff.c was
>>> fixed for PollDevice.
>>>
>>> Jason.
>>> -----Original Message-----
>>> From: click-bounces at pdos.csail.mit.edu
>>> [mailto:click-bounces at pdos.csail.mit.edu] On Behalf Of Jason Park
>>> Sent: Friday, September 15, 2006 9:10 PM
>>> To: 'Adam Greenhalgh'
>>> Cc: click at pdos.csail.mit.edu
>>> Subject: Re: [Click] tx timeout fix
>>>
>>> Dear click guys.
>>>
>>> I have been debugging the Click TX timeout and found something.
>>> Currently the expensive_uniqueify() function in lib/packet.cc does not
>>> initialize all of the skb_shinfo fields related to TCP segmentation
>>> offload (TSO).
>>> As a result, e1000 devices that support TSO hit TX timeouts.
>>> (hw.mac_type >= e1000_82544 && hw.mac_type != 82547)
>>> I think this has caused confusion among Click users about the stability
>>> of Click on e1000.
>>> I have attached a patch. If anyone tests it, please let me know the
>>> result.
>>> (It should work with skb_clone(), without replacing it with skb_copy().)
>>>
>>> Thanks in advance.
>>>
>>> Jason.
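
A minimal user-space sketch of the failure mode described above, not the actual patch. The mock struct borrows the field names `nr_frags`, `tso_size`, and `tso_segs` from the 2.6.16-era `skb_shared_info` as an assumption. The rule that a driver treats a nonzero `tso_size` as a segmentation request is likewise assumed for illustration, and every function name here is hypothetical.

```c
#include <assert.h>

/* Mock of the shared-info block that follows an skb's data area.
 * Field names are modeled on the kernel struct; this is not kernel code. */
struct mock_shinfo {
    unsigned short nr_frags;
    unsigned short tso_size;
    unsigned short tso_segs;
};

/* Incomplete initialization: the TSO fields keep whatever was in memory. */
void init_shinfo_partial(struct mock_shinfo *s)
{
    s->nr_frags = 0;
}

/* Complete initialization: every field is set explicitly. */
void init_shinfo_full(struct mock_shinfo *s)
{
    s->nr_frags = 0;
    s->tso_size = 0;
    s->tso_segs = 0;
}

/* Assumed driver rule: a nonzero tso_size requests segmentation offload. */
int driver_would_segment(const struct mock_shinfo *s)
{
    return s->tso_size != 0;
}
```

Starting from stale contents such as `{ 7, 1448, 3 }`, the partial initializer leaves the packet looking like a TSO request, while the full initializer does not.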
>>> -----Original Message-----
>>> From: adam.greenhalgh at gmail.com [mailto:adam.greenhalgh at gmail.com] On
>> Behalf
>>> Of Adam Greenhalgh
>>> Sent: Wednesday, September 06, 2006 5:12 PM
>>> To: Jason Park
>>> Cc: todd lewis; click at pdos.csail.mit.edu
>>> Subject: Re: [Click] tx timeout fix
>>>
>>> Another thing that would be useful to know is which chipset the
>>> problematic e1000 uses, since it is almost a weekly occurrence that
>>> a TX hang for one card or another gets reported to the netdev and
>>> e1000 lists. From these lists it would seem that many of the hangs
>>> have been fixed by the folks at Intel. A new release of the 7 series
>>> driver is coming soon, so perhaps it is time to upgrade the Click
>>> driver.
>>>
>>> Adam
>>>
>>> On 9/6/06, Jason Park wrote:
>>>> AS NOTED BEFORE, I suggested that he turn off the packet split function,
>>>> not that he switch to skb_copy.
>>>> skb_copy is what my environment uses, and I mentioned it only for reference.
>>>>
>>>> Jason.
>>>> -----Original Message-----
>>>> From: todd lewis [mailto:tgl2 at yahoo.com]
>>>> Sent: Wednesday, September 06, 2006 12:37 AM
>>>> To: Jason Park; 'Srivas Chennu'
>>>> Cc: click at pdos.csail.mit.edu
>>>> Subject: Re: [Click] tx timeout fix
>>>>
>>>> As noted before, replacing skb_clone with skb_copy amounts to fixing a
>>>> broken door by burning the house down.
>>>>
>>>> Does anyone have success under real bidirectional load without copying
>>>> every packet? If so, with what configuration?
>>>>
>>>> --- Jason Park wrote:
>>>>
>>>>> Hi Srivas
>>>>>
>>>>> What e1000 device are you using?
>>>>> If your device supports PACKET_SPLIT and you have turned it on, you
>>>>> should turn it off.
>>>>> Make sure to undefine CONFIG_E1000_PACKET_SPLIT in the 6.1.16.62.DB
>>>>> driver.
>>>>> It works well for me with packet split disabled and skb_clone replaced
>>>>> by skb_copy.
>>>>>
>>>>> Jason
>>>>> -----Original Message-----
>>>>> From: click-bounces at pdos.csail.mit.edu
>>>>> [mailto:click-bounces at pdos.csail.mit.edu] On Behalf Of Srivas Chennu
>>>>> Sent: Tuesday, September 05, 2006 5:37 PM
>>>>> To: Click
>>>>> Subject: Re: [Click] tx timeout fix
>>>>>
>>>>> Hello Adam and Beyers,
>>>>>
>>>>> I've been lately testing the timeout fixes you had posted on a
>>>>> click-patched 2.6.16.13 kernel. The results therefrom still seem to
>>>>> show stability problems. Though the timeouts don't occur predictably
>>>>> or as often as before, I still encounter them and kernel panics
>>>>> randomly, and with a higher probability when testing with high loads.
>>>>> Notably, I see these problems well pronounced with bidirectional
>>>>> (full-duplex) operation of the driver, both with FromDevice and
>>>>> PollDevice. I've tested the 5.x driver patch as well as the
>>>>> 6.1.16.62.DB version patch, and seen similar results.
>>>>>
>>>>> Please do let me know of the details of a stable 2.6.x configuration
>>>>> that you were able to set up using these patches. Further, any idea
>>>>> if and when the fixes will eventually get in to the main source tree
>>>>> of the e1000 driver on sourceforge?
>>>>>
>>>>> Thanks a bunch in advance,
>>>>> Srivas.
>>>>>
>>>>> On Jul 26, 2006 09:06 PM, Adam Greenhalgh wrote:
>>>>>
>>>>>> hi
>>>>>>
>>>>>> Beyers and I have been hacking and we have fixed the tx timeout bug.
>>>>>> Basically, the time_stamp is not being set in the buffer, and when
>>>>>> Linux sends packets too, it encounters a buffer with a time stamp of 0
>>>>>> and throws an error. Attached are two patches: one against CVS,
>>>>>> driver-5.x-e1000_main.patch, and one against Max's 6.1.16.2.DB
>>>>>> driver, driver-6.1.16.2.DB-e1000_main.patch. Neither patch has been
>>>>>> very heavily tested yet, but neither does anything special.
>>>>>>
>>>>>> enjoy
>>>>>>
>>>>>> adam
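
A minimal user-space sketch of how a zero time stamp can look like a hung transmit, under stated assumptions. The hang check compares a buffer's age against a timeout, which is a simplification of whatever the driver actually does. The struct, the tick values, and all function names are hypothetical and are not taken from the attached patches.

```c
#include <assert.h>

/* Mock transmit buffer descriptor. */
struct mock_tx_buffer {
    unsigned long time_stamp; /* tick count when queued; 0 if never set */
    int in_use;               /* nonzero while a packet is pending */
};

/* Assumed hang check: a pending buffer older than `timeout` ticks is
 * reported as hung. */
int tx_looks_hung(const struct mock_tx_buffer *b, unsigned long now,
                  unsigned long timeout)
{
    return b->in_use && (now - b->time_stamp) > timeout;
}

/* Queue path that forgets the time stamp, leaving it at 0. */
void queue_without_stamp(struct mock_tx_buffer *b)
{
    b->in_use = 1;
}

/* Queue path that records when the buffer was handed to the device. */
void queue_with_stamp(struct mock_tx_buffer *b, unsigned long now)
{
    b->in_use = 1;
    b->time_stamp = now;
}
```

A buffer queued without a stamp appears to be as old as the tick counter itself, so the check fires even though the packet was queued a moment ago.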
>>>>> _______________________________________________
>>>>> click mailing list
>>>>> click at amsterdam.lcs.mit.edu
>>>>> https://amsterdam.lcs.mit.edu/mailman/listinfo/click

