[Click] Core performance checkins

Beyers Cronje bcronje at gmail.com
Wed Feb 9 08:22:04 EST 2011


Hi Eddie,

Some info that I'm sure will help debugging. I configured click
with  --enable-schedule-debugging=extra and also enabled NOTIFIERQUEUE_DEBUG

It seems Unqueue gets stuck in the pending list. See the output below:

read q.length
1000

read q.notifier_state
notifier on
task 0x19387b0 [uq :: Unqueue] scheduled

read uq.scheduled
true /* but pending */

read uq.notifier
empty.0/1:1*


Unqueue stays in this scheduled but pending state for an undetermined period
of time, up to minutes some times.

Any idea where I can start on fixing this bug?

Beyers


On Wed, Feb 9, 2011 at 2:52 AM, Beyers Cronje <bcronje at gmail.com> wrote:

> Update - Strange, after typing my previous email I checked again, and all
> of a sudden Unqueue was pulling packets again. Not sure if me breaking into
> Click with gdb kick-started, it again :) or if it's a intermittent issue.
>
>
> On Wed, Feb 9, 2011 at 2:41 AM, Beyers Cronje <bcronje at gmail.com> wrote:
>
>> Hi Eddie,
>>
>> Since running this merge I've been experiencing issues with usermode
>> multithreading. I'm using commit 9419098acbdc20837e37f3033c40661809431f8d
>> I do believe the issues are related to the changes of this merged, as I
>> was running the same config on pre-coreperformance merge code without any
>> issues.
>>
>> *Simplified Config used:*
>>
>> fd::FromDAG(/dev/dag0) -> cl1::Classifier(12/0800) -> MarkIPHeader(14) ->
>> ipc1::IPClassifier(udp port 1646);
>> sfp::SFP();
>> q::ThreadSafeQueue;
>> log::Logger(sfp, 1800);
>>
>> ipc1 -> RadAccounting -> q;
>> q -> uq::Unqueue -> aupcc::Counter -> sfp;
>>
>> StaticThreadSched(fd 0, uq 1, log 2);
>>
>> *Problem Description:*
>> The 3 threads in the config: FromDAG and Unqueue do what their names
>> indicate, and Logger has a timer that schedules its task once every 30min.
>> Everything runs fine initially and then somewhere along the line Unqueue
>> stops pulling packets from the queue which leads to a constant queue
>> overflow. When this happens I can see FromDAG is still working as q.drops
>> increases constantly, and Logger also fires away every 30min. Note push rate
>> from FromDAG is quite high, but what RadAccounting pushes into queue is very
>> low ~ 100pps, which means queue is most of the time empty and Unqueue is not
>> scheduled.
>>
>> Unfortunately I didn't configure debug scheduling, which would've helped.
>> I did notice Unqueue.scheduled always returned true, even though it never
>> actually ran. Not sure if the following will help, gdb shows the following:
>>
>> (gdb) info threads
>>   3 Thread 0x7fa422559700 (LWP 27205)  0x000000392cedb0b3 in poll () from
>> /lib64/libc.so.6
>>   2 Thread 0x7fa421d58700 (LWP 27206)  0x000000392cedcee3 in select ()
>> from /lib64/libc.so.6
>> * 1 Thread 0x7fa432cfb740 (LWP 27197)  FromDAG::process_packet
>> (this=0x1afc570, erf_record=0x7fa423258bf0, rlen=<value optimized out>) at
>> ../elements/local/fromdag.cc:193
>> (gdb) thread 2
>> [Switching to thread 2 (Thread 0x7fa421d58700 (LWP 27206))]#0
>>  0x000000392cedb0b3 in poll () from /lib64/libc.so.6
>> (gdb) bt
>> #0  0x000000392cedb0b3 in poll () from /lib64/libc.so.6
>> #1  0x000000000058bc36 in Master::run_selects_poll (this=0x1afacc0,
>> thread=0x1afb8b0, more_tasks=false) at ../lib/master.cc:862
>> #2  0x000000000058c4ec in Master::run_selects (this=0x1afacc0,
>> thread=0x1afb8b0) at ../lib/master.cc:1050
>> #3  0x000000000057ddf1 in run_os (this=0x1afb8b0) at
>> ../lib/routerthread.cc:447
>> #4  RouterThread::driver (this=0x1afb8b0) at ../lib/routerthread.cc:568
>> #5  0x0000000000556dc9 in thread_driver (user_data=<value optimized out>)
>> at click.cc:414
>> #6  0x000000392d206d5b in start_thread () from /lib64/libpthread.so.0
>> #7  0x000000392cee4aad in clone () from /lib64/libc.so.6
>> (gdb) thread 3
>> [Switching to thread 3 (Thread 0x7fa422559700 (LWP 27205))]#0
>>  0x000000392cedcee3 in select () from /lib64/libc.so.6
>> (gdb) bt
>> #0  0x000000392cedcee3 in select () from /lib64/libc.so.6
>> #1  0x000000000058c4bf in Master::run_selects (this=0x1afacc0,
>> thread=0x1afb7e0) at ../lib/master.cc:1015
>> #2  0x000000000057ddf1 in run_os (this=0x1afb7e0) at
>> ../lib/routerthread.cc:447
>> #3  RouterThread::driver (this=0x1afb7e0) at ../lib/routerthread.cc:568
>> #4  0x0000000000556dc9 in thread_driver (user_data=<value optimized out>)
>> at click.cc:414
>> #5  0x000000392d206d5b in start_thread () from /lib64/libpthread.so.0
>> #6  0x000000392cee4aad in clone () from /lib64/libc.so.6
>> (gdb) thread 1
>> [Switching to thread 1 (Thread 0x7fa432cfb740 (LWP 27197))]#0
>>  0x000000392d20ebfd in nanosleep () from /lib64/libpthread.so.0
>> (gdb) bt
>> #0  FromDAG::process_packet (this=0x1afc570, erf_record=0x7fa423258bf0,
>> rlen=<value optimized out>) at ../elements/local/fromdag.cc:193
>> #1  0x00000000004d6402 in FromDAG::run_task (this=0x1afc570) at
>> ../elements/local/fromdag.cc:150
>> #2  0x000000000057dbe6 in fire (this=0x1afb710) at
>> ../include/click/task.hh:612
>> #3  run_tasks (this=0x1afb710) at ../lib/routerthread.cc:410
>> #4  RouterThread::driver (this=0x1afb710) at ../lib/routerthread.cc:600
>> #5  0x0000000000558513 in main (argc=<value optimized out>, argv=<value
>> optimized out>) at click.cc:639
>>
>> One thing to note, for various reasons I'm doing something very
>> unclick-like with FromDAG where I allow it to block up to 10ms. For my
>> specific requirements this is not a problem, but just in case it might
>> affect the way the new task handling operates it's worth noting.
>>
>> Beyers
>>
>> On Sun, Feb 6, 2011 at 1:23 AM, Eddie Kohler <kohler at cs.ucla.edu> wrote:
>>
>>> Hi all,
>>>
>>> This is just a note to say that I've merged the "coreperformance" branch
>>> with master.  There are several changes that may speed up particularly
>>> simple configurations, and that (more importantly) may make it easier to
>>> experiment with different multithreading setups.  (For instance I
>>> believe switching a task from one thread to another is faster now.)  Let
>>> me know if you experience any problems
>>>
>>> Eddie
>>> _______________________________________________
>>> click mailing list
>>> click at amsterdam.lcs.mit.edu
>>> https://amsterdam.lcs.mit.edu/mailman/listinfo/click
>>>
>>
>>
>


More information about the click mailing list