[Click] Packet loss even at low sending rate

Beyers Cronje bcronje at gmail.com
Mon Dec 5 15:04:30 EST 2011


Hi Bingyang,

Personally I would use ThreadSafeQueue, as it implements the full and empty
notifiers, which CPUQueue does not (if I remember correctly). This should
help somewhat at lower packet rates.
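
For example, the swap in your config would look something like this (untested
sketch; same capacity argument, everything else unchanged):

q1 :: ThreadSafeQueue(10000) -> EtherEncap(0x0800, router1-w1, user1-w1) -> td2 :: ToDevice(eth2);

With the notifiers in place, a ToDevice can go idle while its queue is empty
and be woken on the first enqueue, instead of repeatedly pulling from an
empty queue.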

Beyers

On Mon, Dec 5, 2011 at 8:57 PM, Bingyang LIU <bjornliu at gmail.com> wrote:

> Hi Beyers,
>
> Please check the script below, and thanks very much!
>
> // The packet flow in the router is as follows:
> //
> // pd2::PollDevice(eth2) -> ...processing... -> td1::ToDevice(eth1)
> // pd1::PollDevice(eth1) -> ...processing... -> td2::ToDevice(eth2)
> // pd0::PollDevice(eth0) -> ...processing... -> td0::ToDevice(eth0)
> // pd3::PollDevice(eth3) -> ...processing... -> td3::ToDevice(eth3)
> //
> // Note that it is not a standard IP router; elements such as DecIPTTL are omitted.
> //
>
> AddressInfo(router1-w1       10.0.1.1        00:15:17:57:bd:c6,   //eth2
>             router1-w2       10.0.2.1        00:15:17:57:bd:c5,   //eth1
>             router1-w3       10.0.3.1        00:15:17:57:bd:c4,   //eth0
>             router1-w4       10.0.4.1        00:15:17:57:bd:c7,   //eth3
>             user1-w1         10.0.1.2        00:15:17:57:c7:4e,   //on eth2
>             user2-w2         10.0.2.2        00:15:17:57:c4:86,   //on eth1
>             user3-w3         10.0.3.2        00:15:17:57:c6:ca,   //on eth0
>             user4-w4         10.0.4.2        00:15:17:57:c4:3a);  //on eth3
>
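> // Each classifier splits traffic into: [0] ARP requests (12/0806 20/0001),
> // [1] ARP replies (12/0806 20/0002), [2] IP packets (12/0800), [3] the rest.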
> c1 :: Classifier(12/0806 20/0001,
>                  12/0806 20/0002,
>                  12/0800,
>                  -);
> c2 :: Classifier(12/0806 20/0001,
>                  12/0806 20/0002,
>                  12/0800,
>                  -);
> c3 :: Classifier(12/0806 20/0001,
>                  12/0806 20/0002,
>                  12/0800,
>                  -);
> c4 :: Classifier(12/0806 20/0001,
>                  12/0806 20/0002,
>                  12/0800,
>                  -);
>
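> // All ARP requests and replies end up in q0 and are discarded, not answered.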
> q0 :: Discard; //ToHost;
> q1 :: CPUQueue(10000) -> EtherEncap(0x0800, router1-w1, user1-w1) -> td2 :: ToDevice(eth2);
> q2 :: CPUQueue(10000) -> EtherEncap(0x0800, router1-w2, user2-w2) -> td1 :: ToDevice(eth1);
> q3 :: CPUQueue(10000) -> EtherEncap(0x0800, router1-w3, user3-w3) -> td0 :: ToDevice(eth0);
> q4 :: CPUQueue(10000) -> EtherEncap(0x0800, router1-w4, user4-w4) -> td3 :: ToDevice(eth3);
>
> rt :: LookupIPRouteMP(10.0.1.0/32 0, 10.0.1.1/32 0, 10.0.1.255/32 0,
>                       10.0.2.0/32 0, 10.0.2.1/32 0, 10.0.2.255/32 0,
>                       10.0.3.0/32 0, 10.0.3.1/32 0, 10.0.3.255/32 0,
>                       10.0.4.0/32 0, 10.0.4.1/32 0, 10.0.4.255/32 0,
>                       10.0.1.0/24 1, 10.0.2.0/24 2, 10.0.3.0/24 3,
>                       10.0.4.0/24 4, 0.0.0.0/0 0);
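> // Output 0 collects the router's own and broadcast /32 addresses plus the
> // default route; outputs 1-4 correspond to the four attached /24 subnets.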
> rt[0] -> Discard;
> rt[1] -> q1;
> rt[2] -> k4 :: Counter -> q2;
> rt[3] -> q3;
> rt[4] -> q4;
>
> pd2 :: PollDevice(eth2) -> c1;
> c1[0] -> q0;
> c1[1] -> q0;
> c1[2] -> Strip(14) -> CheckIPHeader() -> rt;
> c1[3] -> Discard;
> pd1 :: PollDevice(eth1) -> c2;
> c2[0] -> q0;
> c2[1] -> q0;
> c2[2] -> Strip(14) -> CheckIPHeader() -> rt;
> c2[3] -> Discard;
> pd0 :: PollDevice(eth0) -> c3;
> c3[0] -> q0;
> c3[1] -> q0;
> c3[2] -> Strip(14) -> CheckIPHeader() -> rt;
> c3[3] -> Discard;
> pd3 :: PollDevice(eth3) -> c4;
> c4[0] -> q0;
> c4[1] -> q0;
> c4[2] -> Strip(14) -> CheckIPHeader() -> rt;
> c4[3] -> Discard;
>
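> // Pin tasks to cores: each thread polls one device and transmits on a
> // different one, so packets cross threads at the queues (hence CPUQueue).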
> StaticThreadSched(pd2 0, td1 0, pd1 1, td2 1, pd0 2, td3 2, pd3 3, td0 3);
>
>
> Bingyang
>
> On Mon, Dec 5, 2011 at 3:01 AM, Beyers Cronje <bcronje at gmail.com> wrote:
>
>> For interest's sake, can you post the config you are using?
>>
>> On Mon, Dec 5, 2011 at 4:37 AM, Bingyang LIU <bjornliu at gmail.com> wrote:
>>
>> > Hi all,
>> >
>> > I need some help with PollDevice. I found that PollDevice caused some
>> > packet loss (less than 1%) even at a low input rate (50 kpps).
>> >
>> > To be precise, the switch's statistics showed that the number of packets
>> > sent out of the switch port to the machine's interface was 20000000,
>> > while the "count" handler of the PollDevice element reported 19960660.
>> >
>> > I increased the driver's RX ring buffer with "ethtool -G eth0 rx 2096"
>> > (the default is 256), but nothing improved.
>> >
>> > Could anyone help me with this?
>> >
>> > Thanks very much.
>> > Bingyang
>> >
>> > On Sun, Dec 4, 2011 at 3:38 PM, Bingyang LIU <bjornliu at gmail.com>
>> wrote:
>> >
>> > > Hi~
>> > >
>> > > I used CPUQueue and found that it didn't drop packets any more. So the
>> > > only problem is that PollDevice drops packets; actually, I think it
>> > > couldn't poll all the ready packets from the devices.
>> > >
>> > > Does anyone have a similar problem with PollDevice, and is there any
>> > > solution or best practices?
>> > >
>> > > best
>> > > Bingyang
>> > >
>> > >
>> > > On Sun, Dec 4, 2011 at 2:10 PM, Bingyang LIU <bjornliu at gmail.com>
>> wrote:
>> > >
>> > >> Hi Cliff,
>> > >>
>> > >> I couldn't use multi-threading with FromDevice: when I used
>> > >> multi-threading and FromDevice together, the system crashed. So I had
>> > >> to use a single thread with FromDevice, and four threads with
>> > >> PollDevice.
>> > >>
>> > >> The router I tested has four gigabit interfaces, each connected to a
>> > >> host. All hosts sent packets to each other at a given rate.
>> > >> At a sending rate of 50 kpps per host (200 kpps in total), FromDevice
>> > >> gave an output ratio of 99.74%, while PollDevice gave 99.94%.
>> > >> At a sending rate of 200 kpps (800 kpps in total), FromDevice only
>> > >> gave an output ratio of 62.63%, while PollDevice gave 99.29%.
>> > >>
>> > >> That's why I think PollDevice works much better than FromDevice.
>> > >> Actually, both of them cause some packet loss at low input rates.
>> > >>
>> > >> And I think Click 1.8 is also mainline source code. But you are right,
>> > >> I should try 2.0. However, I'm not sure whether the same thing will
>> > >> happen with 2.0.
>> > >>
>> > >> best
>> > >> Bingyang
>> > >>
>> > >>
>> > >>
>> > >>
>> > >> On Sun, Dec 4, 2011 at 2:29 AM, Cliff Frey <cliff at meraki.com> wrote:
>> > >>
>> > >>> What performance numbers did you see when using FromDevice instead
>> of
>> > >>> PollDevice?
>> > >>>
>> > >>> Have you tried mainline click?
>> > >>>
>> > >>>
>> > >>> On Sat, Dec 3, 2011 at 10:57 PM, Bingyang Liu <bjornliu at gmail.com
>> > >wrote:
>> > >>>
>> > >>>> Thanks Cliff. Yes, I have tried FromDevice, and it gave worse
>> > >>>> performance.
>> > >>>>
>> > >>>> I think Queue should be a very mature element, and there should not
>> > >>>> be a bug there. But the experiment results told me that something
>> > >>>> went wrong. Should I use a thread-safe queue instead of Queue when I
>> > >>>> use multiple threads?
>> > >>>>
>> > >>>> Thanks
>> > >>>> Bingyang
>> > >>>>
>> > >>>> Sent from my iPhone
>> > >>>>
>> > >>>> On Dec 4, 2011, at 12:31 AM, Cliff Frey <cliff at meraki.com> wrote:
>> > >>>>
>> > >>>> You could try FromDevice instead of PollDevice. I'd expect that it
>> > >>>> would work fine. If it is not high-performance enough, it would be
>> > >>>> great if you could share your performance numbers, just to have
>> > >>>> another data point.
>> > >>>>
>> > >>>> I doubt that Queue has a bug; you could try the latest Click
>> > >>>> sources, though, just in case. As for finding/fixing any PollDevice
>> > >>>> issues, I don't have anything to help you with there...
>> > >>>>
>> > >>>> Cliff
>> > >>>>
>> > >>>> On Sat, Dec 3, 2011 at 8:49 PM, Bingyang LIU <bjornliu at gmail.com
>> > >wrote:
>> > >>>>
>> > >>>>> Hi Cliff,
>> > >>>>>
>> > >>>>> Thank you very much for your help. I followed your suggestion and
>> > >>>>> got some results.
>> > >>>>>
>> > >>>>> 1. It turned out that "PollDevice" failed to get all the packets
>> > >>>>> from the NIC, even when the packet sending rate was only 200 kpps
>> > >>>>> with a packet size of 64 B.
>> > >>>>> 2. I used "grep . /click/.e/*/drops"; all of them reported 0 drops.
>> > >>>>> 3. I put a counter between every two connected elements to
>> > >>>>> determine which element dropped packets (sketched below). Finally I
>> > >>>>> found a queue that dropped packets, because the downstream counter
>> > >>>>> reported a smaller "count" than the upstream one. However,
>> > >>>>> strangely, this queue still reported 0 drops. I think there might
>> > >>>>> be a bug in the element, or I misused it.
>> > >>>>>
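>> > >>>>> The sandwich looked roughly like this (illustrative sketch; cin and
>> > >>>>> cout are hypothetical names, and Counter is agnostic, so it works on
>> > >>>>> both the push and the pull side of the queue):
>> > >>>>>
>> > >>>>> rt[2] -> cin :: Counter -> q2 :: CPUQueue(10000) -> cout :: Counter
>> > >>>>>   -> EtherEncap(0x0800, router1-w2, user2-w2) -> td1 :: ToDevice(eth1);
>> > >>>>>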
>> > >>>>> So I have two questions. First, how can I make PollDevice work
>> > >>>>> better, so that it won't drop packets at low rates? (Should I use
>> > >>>>> the stride scheduler?) Second, is there a bug in Queue in Click
>> > >>>>> 1.8.0 that makes it drop packets without reporting the drops?
>> > >>>>>
>> > >>>>> My experiment environment and configuration:
>> > >>>>> * Hardware: Intel Xeon X3210 CPU (quad core at 2.13 GHz), 4 GB RAM
>> > >>>>> (a server on DETERlab)
>> > >>>>> * Software: Ubuntu 8.04 + Click 1.8, with PollDevice and
>> > >>>>> multi-threading enabled
>> > >>>>> * Configuration: ./configure --with-linux=/usr/src/linux-2.6.24.7
>> > >>>>> --enable-ipsec --enable-warp9 --enable-multithread=4
>> > >>>>> * Installation: sudo click-install --thread=4 site7_router1.click
>> > >>>>>
>> > >>>>> Thanks!
>> > >>>>> best
>> > >>>>> Bingyang
>> > >>>>>
>> > >>>>> On Sat, Dec 3, 2011 at 12:42 PM, Cliff Frey <cliff at meraki.com>
>> > wrote:
>> > >>>>>
>> > >>>>
>> > >>>
>> > >>
>> > >>
>> > >> --
>> > >> Bingyang Liu
>> > >> Network Architecture Lab, Network Center,Tsinghua Univ.
>> > >> Beijing, China
>> > >> Home Page: http://netarchlab.tsinghua.edu.cn/~liuby
>> > >>
>> > >
>> > >
>> > >
>> > > --
>> > > Bingyang Liu
>> > > Network Architecture Lab, Network Center,Tsinghua Univ.
>> > > Beijing, China
>> > > Home Page: http://netarchlab.tsinghua.edu.cn/~liuby
>> > >
>> >
>> >
>> >
>> > --
>> > Bingyang Liu
>> > Network Architecture Lab, Network Center,Tsinghua Univ.
>> > Beijing, China
>> > Home Page: http://netarchlab.tsinghua.edu.cn/~liuby
>> > _______________________________________________
>> > click mailing list
>> > click at amsterdam.lcs.mit.edu
>> > https://amsterdam.lcs.mit.edu/mailman/listinfo/click
>> >
>> _______________________________________________
>> click mailing list
>> click at amsterdam.lcs.mit.edu
>> https://amsterdam.lcs.mit.edu/mailman/listinfo/click
>>
>
>
>
> --
> Bingyang Liu
> Network Architecture Lab, Network Center,Tsinghua Univ.
> Beijing, China
> Home Page: http://netarchlab.tsinghua.edu.cn/~liuby
>

