[Click] Problems with RatedUnqueue

Eddie Kohler kohler at cs.ucla.edu
Wed Mar 4 21:56:11 EST 2009


Hi Øivind,

Øivind Kure wrote:
> Eddie,
> The problem occurred when I reduced the rate from 150 packets per second downward. In my previous mail I reported only a brief test without any real workload, and there I had to go down to 1 packet per second to reproduce the problem. If it is of interest, I can repeat the test under the actual workload used the first time, but that may take some time. What puzzles me is that at high rates the CPU load is low, but once I reduce the rate the CPU load increases drastically and the machine becomes CPU bound. I would have expected a constant load. Do you have any hints as to why tasks behave like this?
> What I am looking for is a link emulator. Burster would work, but it adds substantial jitter. Thanks for the pointer; I will look into it.

The RatedUnqueue element works by rescheduling itself constantly until it is 
allowed to emit a packet.  This busy-waiting increases CPU utilization.

RatedUnqueue does not slow down the config when rates are higher, because 
RatedUnqueue is smart enough not to reschedule itself when it has no work to 
do (i.e., upstream Queues are empty).  This is called "notification" in Click.  
Thus, at high rates, RatedUnqueue generally sleeps for a while, then wakes 
up, reschedules itself briefly (for a time inversely proportional to the 
rate), then emits the packet.  At low rates, though, the queue is almost never 
empty, so RatedUnqueue busy-waits.
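
As a rough illustration of the behaviour described above, here is a toy 
Python model.  It is not Click code: the function name simulate, the tick 
granularity (30000 scheduler ticks per second), and the perfectly regular 
packet arrivals are all assumptions made for the sketch.  The task only runs 
while the upstream queue is non-empty (a stand-in for notification), and with 
use_timer=True it also sleeps until the rate allows the next packet, roughly 
in the spirit of a timer-based element.

```python
from collections import deque

def simulate(arrival_interval, emit_interval, total_ticks,
             use_timer=False, capacity=1000):
    """Toy model of a rate-limited unqueue task (hypothetical, not Click's code).

    arrival_interval: ticks between packet arrivals (offered load)
    emit_interval:    minimum ticks between emitted packets (shaper rate)
    use_timer:        if True, sleep until the next allowed emit time
                      instead of rescheduling on every tick
    Returns (task_runs, packets_emitted, queue_drops).
    """
    queue = deque()
    next_emit = 0  # earliest tick at which a packet may be emitted
    runs = emitted = drops = 0
    for now in range(total_ticks):
        # One packet arrives every arrival_interval ticks.
        if now % arrival_interval == 0:
            if len(queue) < capacity:
                queue.append(now)
            else:
                drops += 1
        if not queue:
            continue  # "notification": asleep while the queue is empty
        if use_timer and now < next_emit:
            continue  # timer variant: asleep until the rate allows a packet
        runs += 1  # the task was scheduled and ran on this tick
        if now >= next_emit:
            queue.popleft()
            emitted += 1
            next_emit = now + emit_interval
        # otherwise the run was wasted: the task just reschedules itself

    return runs, emitted, drops

TICKS = 30000  # assumed: one simulated second at 30000 ticks per second

# Offered load of 150 packets/s means one arrival every 200 ticks.
print(simulate(200, 120, TICKS))                   # rate 250/s -> (150, 150, 0)
print(simulate(200, 1000, TICKS))                  # rate 30/s  -> (29801, 30, 0)
print(simulate(200, 1000, TICKS, use_timer=True))  # rate 30/s  -> (30, 30, 0)
```

In this model the shaper at 250 packets/s runs once per packet, because the 
queue drains between arrivals.  At 30 packets/s the queue never drains, so 
the polling variant runs on nearly every tick while still emitting only 30 
packets, whereas the timer variant runs only when it can actually emit.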

> I have an unrelated additional problem for which I have not seen a published solution. I get the error Assertion failed....... CLEANUP_CONFIGURED. All elements compile without error. The element that fails only sets a few variables during initialization. All the elements I have written return 0 from the configure function, and they worked as intended under 1.5. Do you have a recommended strategy for finding the offending piece of code? My current strategy is to code a basic element and then add functionality stepwise until I find the offending lines of code. However, it is not a fast approach.

This sounds very much like memory corruption, most probably induced by your 
element.  That assertion should never fail.  Valgrind is useful for finding 
memory errors.

Eddie




> Thanks
> Øivind
> 
> -----Original Message-----
> From: Eddie Kohler [mailto:kohler at cs.ucla.edu] 
> Sent: 23 February 2009 17:48
> To: Øivind Kure
> Cc: Click
> Subject: Re: [Click] Problems with RatedUnqueue
> 
> Øivind,
> 
> Ah, I had not realized you were trying to use such low rates.  For very low 
> rates like this RatedUnqueue is not currently a good choice of element.  The 
> RatedUnqueue will, as you observe, spin endlessly while waiting to emit a 
> packet.  Burster is a much better choice for very low rates.  While we might 
> fix this problem eventually, we won't soon.  People interested in fixing it 
> should look at elements like LinkUnqueue that use combinations of Timers and 
> Tasks.
> 
> Eddie
> 
> 
> Øivind Kure wrote:
>> Eddie,
>> I only did a brief test this afternoon. I have not had time to reproduce the original test scenario. However, with the setup shaping just the broadcast packets coming in over the interface, I got the same results once I set the shaper to the extreme value of 1 packet per second. The machine became CPU bound. Increasing the shaping rate removed the problem again.
>>
>> Regards
>> Øivind
>>
>> -----Original Message-----
>> From: Eddie Kohler [mailto:kohler at cs.ucla.edu] 
>> Sent: 23 February 2009 08:33
>> To: Øivind Kure
>> Cc: click at pdos.csail.mit.edu; Bart Braem
>> Subject: Re: [Click] Problems with RatedUnqueue
>>
>> Øivind,
>>
>> Did you ever try the newer Git sources to verify whether this problem still 
>> occurs there?
>>
>> Eddie
>>
>>
>> Øivind Kure wrote:
>>> Hi,
>>> I run Click in userspace (on a SUSE 10.0 machine), essentially as a link emulator.
>>> The essence of the configuration is FromDevice -> Queue -> shaper -> ToDevice.
>>> In addition I use a standard setup with Classifier, IPClassifier, CheckIPHeader, ARPResponder and ARPQuerier, so the configuration acts as a forwarding element.
>>>
>>> For the shaper element I started out with RatedUnqueue.
>>> The configuration is used in a controlled environment where the load is 1.5 Mbit/s MPEG2 video, or roughly 150 packets a second. The configuration works with no problem. However, when I reduce the rate of the shaper element (for example, to 30), the configuration becomes CPU bound. Packets are dropped from the video, but the queuing element before the shaper reports 0 drops. The CPU load reported by top increases to 0% idle, with almost 100% going to Click. The machine remains CPU bound until the rate in the shaper element is increased to 250, well above the offered load.
>>> This problem has been observed for Click 1.5 and Click 1.6. I have also observed similar problems on other Linux versions.
>>>
>>> If I replace the shaper element with Burster (which is timer based), the problem disappears. When the rate in the shaper element is reduced, the queue starts dropping packets, as it should. The CPU load remains almost constant and low.
>>>
>>> It might be a design feature I have missed, a bug, or I might have misunderstood something basic. Any explanation of the observed behaviour will be appreciated.
>>> Regards
>>> Øivind Kure
>>>
>>> _______________________________________________
>>> click mailing list
>>> click at amsterdam.lcs.mit.edu
>>> https://amsterdam.lcs.mit.edu/mailman/listinfo/click


