[Click] More on high latency...

Marko Zec zec at icir.org
Sun Apr 3 10:24:21 EDT 2005


On Sunday 03 April 2005 06:13, José María González wrote:
> Hi,
>
> I ran into the same problem as Nick and Marko. I wrote a simple
> BPF-based program that just reads packets from 2 BPF devices. I tried
> to listen to both of them at the same time by using select(), poll(),
> and kqueue's kevent(). The first 2 worked perfectly, but with kqueue
> I have the same problem as Marko: kevent() returns only after 2
> packets have been received on a BPF descriptor.
>
> I googled for a while, and I found that kqueue seems to have problems
> with BPF when using BIOCIMMEDIATE and/or a BIOCSRTIMEOUT (which Click
> does):
>
> http://www.freebsd.org/cgi/query-pr.cgi?pr=64178
>
> [Note: we're using 4.10-RELEASE, and the patched sys/net/bpf.c is not
> there]
>
>
> Unless I'm misunderstanding something (which is likely), I find it
> arguable whether Click benefits from the use of kqueue instead of
> the traditional select()/poll() mechanisms.
>
> For starters, kqueue is a new (2000, FreeBSD >= 4.1 or FreeBSD >=
> 5.0) event notification mechanism that replaces select() and
> poll(). The 2 main differences between kqueue and the previous
> mechanisms are:
>
> - kqueue is designed to scale with the number of descriptors a
> process is listening to. In Lemon's original paper, the
> performance of poll() versus kevent() starts to differ
> significantly when listening to hundreds of descriptors. This should
> not be a problem for Click, which captures its traffic at BPF
> instead of opening a per-connection socket. Unless the user has
> *lots* of FromDevices (hundreds), it's hard to see how she will
> benefit from kqueue.
>
> - kqueue is designed to be more efficient, by storing per-process
> info in the kernel instead of passing the full descriptor info every
> time a process wants to select()/poll() on them. From Lemon's
> original paper, it seems the cost of a call to kevent() (the kqueue
> equivalent to select()/poll() ) is significantly lower when listening
> to 100 descriptors. It may be the case that the cost is also
> different when listening to a handful of descriptors, but I doubt it
> (if only because the author would have reported it ;)
>
> Again, I don't think this will be a problem for Click users (I use a
> pretty complicated click setup in my project, and still only use 10
> BPF descriptors).
>
> [Note: Lemon's original paper is available at
>
> http://people.freebsd.org/~jlemon/papers/kqueue.ps ]
>
>
> The solution should be pretty straightforward: disable kqueue use
> unless explicitly requested by the user. Unless the user explicitly
> requests it through a CLI/compilation flag, change line 46 of
> lib/master.cc as follows:
>
> -    _kqueue = kqueue();
> +    _kqueue = -1;
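
[A minimal C sketch of the "off unless explicitly requested" idea above. This is not Click's actual code: the helper name open_event_queue() and the CLICK_USE_KQUEUE environment variable mentioned below are hypothetical, made up for illustration. Only kqueue() itself is a real system call, and it is compiled in only on BSD-like systems.]

```c
#include <string.h>

#if defined(__FreeBSD__) || defined(__NetBSD__) || \
    defined(__OpenBSD__) || defined(__APPLE__)
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#define SKETCH_HAVE_KQUEUE 1
#endif

/*
 * Hypothetical helper: return a kqueue descriptor only when the user
 * opted in by passing the string "1". In every other case return -1,
 * which the caller treats as "stay on select()/poll()".
 */
int open_event_queue(const char *opt_in)
{
    if (opt_in == NULL || strcmp(opt_in, "1") != 0)
        return -1;          /* default: kqueue path disabled */
#ifdef SKETCH_HAVE_KQUEUE
    return kqueue();        /* may itself return -1 on failure */
#else
    return -1;              /* no kqueue on this platform */
#endif
}
```

[A caller could write something like open_event_queue(getenv("CLICK_USE_KQUEUE")), with the variable name again being an assumption. A compile-time flag would work just as well; the point is only that the default value is -1.]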


Hi Chema,

I can confirm that both the patch from the FreeBSD PR you linked
above and a similar one I sent to Nick a few days ago fix the
issue in 4.10 and 4.11 kernels.  Nevertheless, userspace Click at
least should work out of the box without kernel patches, so I'm
adding my vote for temporarily disabling the kqueue path in Click
until the bpf+kqueue kernel-level problem gets fixed in the official
FreeBSD trees.
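
For reference, the select()/poll() style of waiting that reportedly works fine looks roughly like the sketch below. It is a generic illustration, not taken from Click or from Chema's test program: wait_readable() is a hypothetical name, and any two readable descriptors (BPF devices, or pipes for a quick test) can be passed in.

```c
#include <poll.h>
#include <unistd.h>

/*
 * Wait until at least one of two descriptors is readable.
 * Returns the index (0 or 1) of the first readable descriptor,
 * or -1 on timeout or error.
 */
int wait_readable(int fd0, int fd1, int timeout_ms)
{
    struct pollfd fds[2];

    fds[0].fd = fd0;
    fds[0].events = POLLIN;
    fds[0].revents = 0;
    fds[1].fd = fd1;
    fds[1].events = POLLIN;
    fds[1].revents = 0;

    /* poll() returns as soon as any descriptor has data to read. */
    if (poll(fds, 2, timeout_ms) <= 0)
        return -1;

    for (int i = 0; i < 2; i++)
        if (fds[i].revents & POLLIN)
            return i;
    return -1;
}
```

The descriptor set is handed to the kernel on every call, which is the per-call cost kqueue was designed to avoid, but with only a handful of descriptors that should hardly matter.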

Cheers,

Marko


> Regards.
> -Chema
>
> Marko Zec wrote:
> > On Monday 28 March 2005 22:03, Nicholas Weaver wrote:
> > > The high latency is definitely occurring within pcap:
> > >
> > > Printing the time just after pcap_dispatch calls
> > > FromDevice_get_packet shows that this is the source of the
> > > injected latency.
> > >
> > > I'm going to try compiling and linking to a different version of
> > > pcap and see if it makes a difference.
> >
> > Hi Nick,
> >
> > following your first report on this issue a week ago, I did some
> > digging...  My initial finding was that the problem might actually
> > be with kqueue + bpf not playing nicely together in the FreeBSD
> > kernel, so this might have nothing to do with either Click or
> > libpcap.
> >
> > What I observed in the kernel was that on each received packet the
> > bpf code correctly tried to wake up the listening process using a
> > kqueue macro, but the kqueue code for an unknown reason coalesced
> > two such events before actually doing the wakeup call.  Having no
> > clue how the kqueue kernel infrastructure is supposed to work
> > internally I stopped at that point, and then went into a suspend
> > mode for a week after becoming a happy father last weekend...  Will
> > look more into this one of these days.
> >
> > Cheers,
> >
> > Marko
> > _______________________________________________
> > click mailing list
> > click at amsterdam.lcs.mit.edu
> > https://amsterdam.lcs.mit.edu/mailman/listinfo/click


