[Click] Multithreading bug

Eddie Kohler kohler at cs.ucla.edu
Wed May 11 11:47:34 EDT 2011


Meaning that you've tried it and you no longer see a bug, so I should check it in?

E


On 5/11/11 6:30 AM, Beyers Cronje wrote:
> Perfect, that works for me.
>
> On Wed, May 11, 2011 at 3:25 PM, Eddie Kohler <kohler at cs.ucla.edu> wrote:
>
>     Good catch!  This is a bug in Socket.  The fast_reschedule() method should
>     only be called when an element is truly being run from the scheduler.  (I
>     guess the documentation is a bit ambiguous.)  It is safe to call
>     reschedule() from anywhere; I would suggest just changing
>     Socket::run_task() to call reschedule().
>
>     Eddie
>
>
>
>     On 5/11/11 6:09 AM, Beyers Cronje wrote:
>
>         Hi guys,
>
>         I seem to have come across a userlevel Click multithreading bug. In short,
>         when running multiple threads in userlevel Click with the Socket element,
>         it can happen that two separate threads end up in fast_reschedule() at the
>         same time, causing Click to crash at the following line (task.hh:558):
>
>             while (n != _thread && !PASS_GT(n->_pass, _pass))
>
>         One thread runs Socket::run_task() via standard task scheduling, while the
>         second thread calls Socket::run_task() via Socket::selected() scheduling.
>         This will obviously affect any other element that calls run_task() via
>         selected().
>
>         Eddie, Cliff, should this issue be addressed inside Socket? Or should we
>         look at how RouterThread handles locking with regard to task and select
>         scheduling?
>
>         Below are the stack traces: 7 threads, with threads 1 and 6 being the culprits.
>
>         (gdb) info threads
>            7 Thread 23122  0x000000392cedcee3 in select () from /lib64/libc.so.6
>            6 Thread 23121  0x000000000054ce1b in fast_reschedule (this=0x14571e0) at ../include/click/task.hh:558
>            5 Thread 23125  0x000000000058471b in RouterThread::driver (this=0x1445bb0) at ../lib/routerthread.cc:565
>            4 Thread 23120  FromDAG::run_task (this=0x1456990) at fromdag.cc:157
>            3 Thread 23112  0x0000000000560305 in Packet::~Packet (this=0x14b22f0, __in_chrg=<value optimized out>) at ../lib/packet.cc:181
>            2 Thread 23124  0x00000000005844b0 in RouterThread::driver (this=0x1445af0) at ../lib/routerthread.cc:594
>         * 1 Thread 23123  0x000000000054ce1b in fast_reschedule (this=0x14571e0) at ../include/click/task.hh:558
>
>         (gdb) thread 1
>         [Switching to thread 1 (Thread 23123)]#0  0x000000000054ce1b in fast_reschedule (this=0x14571e0) at ../include/click/task.hh:558
>         558             while (n != _thread && !PASS_GT(n->_pass, _pass))
>         (gdb) bt
>         #0  0x000000000054ce1b in fast_reschedule (this=0x14571e0) at ../include/click/task.hh:558
>         #1  Socket::run_task (this=0x14571e0) at ../elements/userlevel/socket.cc:524
>         #2  0x00000000005845d6 in fire (this=0x1445a30) at ../include/click/task.hh:612
>         #3  run_tasks (this=0x1445a30) at ../lib/routerthread.cc:405
>         #4  RouterThread::driver (this=0x1445a30) at ../lib/routerthread.cc:594
>         #5  0x0000000000556e39 in thread_driver (user_data=<value optimized out>) at click.cc:414
>         #6  0x000000392d206d5b in start_thread () from /lib64/libpthread.so.0
>         #7  0x000000392cee4aad in clone () from /lib64/libc.so.6
>
>         (gdb) thread 6
>         [Switching to thread 6 (Thread 23121)]#0  0x000000000054ce1b in fast_reschedule (this=0x14571e0) at ../include/click/task.hh:558
>         558             while (n != _thread && !PASS_GT(n->_pass, _pass))
>         (gdb) bt
>         #0  0x000000000054ce1b in fast_reschedule (this=0x14571e0) at ../include/click/task.hh:558
>         #1  Socket::run_task (this=0x14571e0) at ../elements/userlevel/socket.cc:524
>         #2  0x000000000054c2cf in Socket::selected (this=0x14571e0, fd=<value optimized out>) at ../elements/userlevel/socket.cc:417
>         #3  0x0000000000592856 in call_selected (this=0x1444c50, thread=<value optimized out>, more_tasks=<value optimized out>) at ../lib/master.cc:732
>         #4  Master::run_selects_poll (this=0x1444c50, thread=<value optimized out>, more_tasks=<value optimized out>) at ../lib/master.cc:889
>         #5  0x0000000000592f34 in Master::run_selects (this=0x1444c50, thread=0x14458b0) at ../lib/master.cc:1050
>         #6  0x00000000005847d7 in run_os (this=0x14458b0) at ../lib/routerthread.cc:442
>         #7  RouterThread::driver (this=0x14458b0) at ../lib/routerthread.cc:562
>         #8  0x0000000000556e39 in thread_driver (user_data=<value optimized out>) at click.cc:414
>         #9  0x000000392d206d5b in start_thread () from /lib64/libpthread.so.0
>         #10 0x000000392cee4aad in clone () from /lib64/libc.so.6
>
>
>         Beyers
>         _______________________________________________
>         click mailing list
>         click at amsterdam.lcs.mit.edu
>         https://amsterdam.lcs.mit.edu/mailman/listinfo/click
>
>
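
The distinction drawn in the thread above (fast_reschedule() only from the scheduler, reschedule() from anywhere) can be illustrated with a toy model. The sketch below is not Click's code: ToyTask, ToyRunQueue, consistent() and demo_concurrent_reschedule() are hypothetical names made up for illustration, and the use of a plain mutex in reschedule() is an assumption of the sketch, not a description of how Click makes reschedule() safe. It only shows why an unlocked sorted-list insert, like the loop at task.hh:558, must not run on two threads at once, and how routing every cross-thread caller through a locked entry point avoids that.

```cpp
#include <cstddef>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical toy types for illustration; not Click's Task or RouterThread.
struct ToyTask {
    unsigned pass = 0;          // scheduling priority, lower runs first
    ToyTask *prev = nullptr;
    ToyTask *next = nullptr;
    bool scheduled = false;
};

class ToyRunQueue {
  public:
    ToyRunQueue() { _head.prev = _head.next = &_head; }

    // Unlocked insert: walks the circular list and splices the task in.
    // Only safe when exactly one thread touches the queue at a time.
    void fast_reschedule(ToyTask *t) {
        if (t->scheduled)
            return;
        ToyTask *n = _head.next;
        while (n != &_head && n->pass <= t->pass)
            n = n->next;
        t->next = n;
        t->prev = n->prev;
        n->prev->next = t;
        n->prev = t;
        t->scheduled = true;
    }

    // Locked insert: in this sketch, safe to call from any thread.
    void reschedule(ToyTask *t) {
        std::lock_guard<std::mutex> guard(_lock);
        fast_reschedule(t);
    }

    // Checks link integrity, ordering, and entry count.
    // Call only after all writer threads have been joined.
    bool consistent(std::size_t expected) const {
        std::size_t count = 0;
        for (const ToyTask *n = _head.next; n != &_head; n = n->next) {
            if (n->next->prev != n)
                return false;
            if (n->next != &_head && n->next->pass < n->pass)
                return false;
            if (++count > expected)
                return false;
        }
        return count == expected;
    }

  private:
    ToyTask _head;              // sentinel node of the circular list
    std::mutex _lock;
};

// Two threads schedule their own tasks plus one shared task, always
// through the locked entry point.
bool demo_concurrent_reschedule() {
    const std::size_t per_thread = 1000;
    ToyRunQueue queue;
    ToyTask shared;
    std::vector<ToyTask> a(per_thread), b(per_thread);
    for (std::size_t i = 0; i < per_thread; ++i) {
        a[i].pass = static_cast<unsigned>(i);
        b[i].pass = static_cast<unsigned>(per_thread - i);
    }
    auto worker = [&queue, &shared](std::vector<ToyTask> &tasks) {
        for (ToyTask &t : tasks) {
            queue.reschedule(&t);
            queue.reschedule(&shared);  // no-op once already scheduled
        }
    };
    std::thread t1(worker, std::ref(a));
    std::thread t2(worker, std::ref(b));
    t1.join();
    t2.join();
    return queue.consistent(2 * per_thread + 1);
}
```

Replacing the reschedule() calls in the worker with fast_reschedule() would recreate the situation in the backtraces: two threads walking and splicing the same list with no synchronization, which is a data race and can corrupt the links.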

