[Click] NotifierQueue on multi-threaded Click problem.

Eddie Kohler kohler at cs.ucla.edu
Fri Sep 14 01:23:25 EDT 2007


Click is lucky to have you!

There were a couple other cases of the same bug that your fix didn't catch; I 
checked in a broader fix.  Does it work?

E


Joonwoo Park wrote:
> Well..
> It's hard to agree with slowing down code.
> BTW Do I need something like this?
> Recently move_thread can make another thread's sleeping.
> 
> diff --git a/lib/task.cc b/lib/task.cc
> index 197859c..a087b3b 100644
> --- a/lib/task.cc
> +++ b/lib/task.cc
> @@ -383,8 +383,8 @@ Task::move_thread(int thread_id)
>         if (old_thread->thread_id() != _home_thread_id
>             && old_thread->thread_id() != RouterThread::THREAD_STRONG_UNSCHEDULE
> ) {
>             if (scheduled()) {
> -               _should_be_scheduled = true;
>                 fast_unschedule();
> +               _should_be_scheduled = true;
>             }
>             _thread = _router->master()->thread(_home_thread_id);
>             old_thread->unlock_tasks();
> 
> Joonwoo Park
> 
> 2007/9/14, Eddie Kohler <kohler at cs.ucla.edu>:
>> AARGH!
>>
>> Well, thanks as usual for the bug report.
>>
>> I think though that this is one of those cases where your lock does not
>> actually fix the problem, it just makes the race condition much less
>> likely by slowing down the code.
>>
>> I have attempted to fix the problem in a different way.  The commit
>> message has more.  Let me know how it does....
>> Eddie
>>
>>
>>
>>
>> Joonwoo Park wrote:
>>> Unfortunately, A few hours ago I found that my test machine has gone
>>> into hibernation.
>>> It had worked for about 3 ~ 4 days maybe.
>>> And I found that I can reproduce the problem in short time with
>>> SLEEPINESS_TRIGGER=1.
>>> IMHO, I think process_pending() still has problem.
>>> When another thread's Task::add_pending see 'if (!_pending_nextptr)'
>>> as FALSE, at the same time if we set it to 0, thread->add_pending()
>>> cannot be called again (only if _pending_reschedule is FALSE).
>>> As a result, It's seems that a routerthread go sleep forever.
>>> I tried patch and It works fine although SLEEPINESS_TRIGGER=1. (I have
>>> never seen before working SLEEPINESS_TRIGGER=1)
>>> How about this?
>>>
>>> -
>>> Signed-off-by: Joonwoo Park <joonwpark81 at gmail.com>
>>>
>>>  lib/master.cc |    8 ++++++++
>>>  1 files changed, 8 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/lib/master.cc b/lib/master.cc
>>> index 5a56371..cef62be 100644
>>> --- a/lib/master.cc
>>> +++ b/lib/master.cc
>>> @@ -370,7 +370,15 @@ Master::process_pending(RouterThread *thread)
>>>      // process the list
>>>      while (Task *t = Task::pending_to_task(my_pending)) {
>>>       my_pending = t->_pending_nextptr;
>>> +     // when another thread's Task::add_pending see 'if (!_pending_nextptr)'
>>> +     // as FALSE, at the same time if we set it to 0, thread->add_pending()
>>> +     // cannot be called again (only if _pending_reschedule is 0).
>>> +     bool lock = !t->_pending_reschedule;
>>> +     if (lock)
>>> +             flags = _task_lock.acquire();
>>>       t->_pending_nextptr = 0;
>>> +     if (lock)
>>> +             _task_lock.release(flags);
>>>       t->process_pending(thread);
>>>      }
>>>  }
>>> -
>>>
>>> Joonwoo Park


More information about the click mailing list