[Click] User level Click Queues vs Kernel Queues
Eddie Kohler
kohler at cs.ucla.edu
Wed Jul 6 03:57:42 EDT 2005
Hi Michael,
On Jun 11, 2005, at 2:19 PM, Michael Sirivianos wrote:
> Hi,
>
> We are trying to set up a simple experiment with userlevel click to
> measure the performance degradation of a router performing some crypto
> computations.
>
> We would prefer to use it at kernel space, but 2.4 kernels do not really
> work on our PC's (unless we spend all our time trying to choose the
> correct compiling options) and due to gcc/glib incompatibilities we
> cannot even compile them.
I agree with Beyers that this might be worth your time; also, Click
now works on 2.6 kernels (mostly).
> our configuration is something like:
>
> fromDevice -> classifier -> Queue(200) -> Unqueue -> StripIpheader
>   -> ... OurElement -> ARPQuerier.
>
> However, we observe that no matter how expensive the computation in
> OurElement is and no matter how much the sender rate increases, the
> Queue never builds up over 1 packet. Instead we have packet losses
> whose location we have not been able to pinpoint. From the network
> monitoring tools we infer it's not in the Ethernet interface, and
> most likely not in the IP input queue at the kernel.
This does not surprise me. Userlevel Click is single-threaded. (So
is kernel Click without --enable-multithread.) What is happening is
probably something like this:
1. FromDevice reads a single packet, emits to Queue
2. Unqueue runs, pushes to OurElement
   OurElement runs for a really really long time
   In the meantime, the kernel-to-user (k->u) queue fills up and
   eventually overflows
3. Repeat 1-2 indefinitely.
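To make steps 1-3 concrete, here is a toy model in plain C++. It is not Click code: `simulate`, `SimResult`, the tick-based timing, and the buffer capacity are all made up for illustration. It only assumes what is described above: one thread that reads one packet, pushes it through the Queue, and then sits in OurElement while arrivals pile up in a bounded kernel-to-user buffer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Toy model, not Click code. All names and parameters are hypothetical.
struct SimResult {
    std::size_t max_click_queue;  // peak occupancy of the Click Queue
    std::size_t kernel_drops;     // packets lost at the kernel-to-user buffer
    std::size_t processed;        // packets that reached OurElement
};

SimResult simulate(std::size_t ticks, std::size_t arrivals_per_tick,
                   std::size_t work_ticks, std::size_t kernel_capacity) {
    assert(kernel_capacity > 0);
    SimResult r{0, 0, 0};
    std::size_t kernel_buf = 0;   // packets waiting in the kernel
    std::size_t click_queue = 0;  // packets waiting in the Click Queue
    std::size_t busy = 0;         // ticks left in OurElement's computation

    for (std::size_t t = 0; t < ticks; ++t) {
        // Packets keep arriving no matter what the single thread is doing.
        for (std::size_t i = 0; i < arrivals_per_tick; ++i) {
            if (kernel_buf < kernel_capacity) ++kernel_buf;
            else ++r.kernel_drops;
        }

        // The thread is still inside OurElement: nothing else can run.
        if (busy > 0) { --busy; continue; }
        if (kernel_buf == 0) continue;

        // Step 1: "FromDevice" reads a single packet and emits it to Queue.
        --kernel_buf;
        ++click_queue;
        r.max_click_queue = std::max(r.max_click_queue, click_queue);

        // Step 2: "Unqueue" runs at once and pushes it to OurElement.
        --click_queue;
        busy = work_ticks;
        ++r.processed;
    }
    return r;
}
```

Calling, say, `simulate(1000, 2, 10, 64)` in this model gives a Queue that never holds more than one packet while the kernel-side counter records drops, which is the symptom you describe.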
> We theorize that we may have drops at a kernel/userspace queue but we
> would like an opinion on that.
I buy it.
> Am I correct to assume that a frame is directly forwarded from the
> device to a kernel queue and then userlevel click which is a single
> thread reads the frame from this in-kernel queue?
Yes.
> Then click proceeds
> with processing the packets at the rest of the modules including our
> Queue element? Thus, the same thread puts the packet in the click queue
> and then processes it. This is the only explanation I could find for not
> being able to build up our Click queue over 1 packet.
Yes!
> Is it impossible to conduct this experiment at the userlevel?
> At the kernel level would the NIC be the producer of packets that
> directly places frames in our Click Queue and the Elements Unqueue->...
> would be the consumers?
Well, not quite. The NIC would produce packets, but a Click thread
would take them from the NIC and push them to the Queue.
If you really want effective parallelism, there are two ways to go.
1. SMP kernel + --enable-multithread + a thread schedule that forces
FromDevice and Unqueue to run on different CPUs.
2. (Even better:) Rewrite your long-running element so that it
doesn't block. For example, your element could contain a Task and
Unqueue functionality. When it had nothing to do, it would read a
packet from upstream (like Unqueue does) and start processing that
packet. But during a long-running computation, it would periodically
yield control, and resume where it left off when it was next
scheduled. (I.e. event-driven style!)
I would do 2. It is probably a bit easier than it appears.
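A rough sketch of the shape of option 2, again in plain C++ rather than against the Click headers. `Packet`, `YieldingWorker`, `budget`, and `peak_queue_length` are hypothetical names; a real element would pull from its input port and reschedule its Task instead of being called from a loop, so treat this as an outline of the control flow only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical stand-in for a packet that needs `work_units` of computation.
struct Packet { int id; std::size_t work_units; };

// Not Click code: combines "Unqueue functionality" with a resumable job.
class YieldingWorker {
public:
    YieldingWorker(std::deque<Packet>& upstream, std::size_t budget)
        : upstream_(upstream), budget_(budget) { assert(budget > 0); }

    // One scheduler slot. Does at most `budget_` units, then yields.
    // Returns false only when there was nothing to do.
    bool run_task() {
        if (!busy_) {
            if (upstream_.empty()) return false;
            current_ = upstream_.front();   // pull from upstream, like Unqueue
            upstream_.pop_front();
            done_ = 0;
            busy_ = true;
        }
        // Resume where we left off; a real element would do crypto here.
        done_ += std::min(budget_, current_.work_units - done_);
        if (done_ == current_.work_units) {
            finished_.push_back(current_.id);  // would push downstream
            busy_ = false;
        }
        return true;
    }

    const std::vector<int>& finished() const { return finished_; }

private:
    std::deque<Packet>& upstream_;
    std::size_t budget_;
    Packet current_{0, 0};
    std::size_t done_ = 0;
    bool busy_ = false;
    std::vector<int> finished_;
};

// Single-threaded round robin between a producer slot and the worker slot.
// Returns the peak length of the queue between them.
std::size_t peak_queue_length(std::size_t rounds, std::size_t work_units,
                              std::size_t budget) {
    std::deque<Packet> queue;
    YieldingWorker worker(queue, budget);
    std::size_t peak = 0;
    for (std::size_t i = 0; i < rounds; ++i) {
        queue.push_back(Packet{static_cast<int>(i), work_units});
        peak = std::max(peak, queue.size());
        worker.run_task();
    }
    return peak;
}
```

In this model, a small budget lets the producer slot run between slices of the computation, so the queue grows past one packet; with a budget as large as the whole job the worker runs to completion each time and the queue stays at one, as in the blocking case.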
Eddie
>
> Thanks,
> Michael