[Click] NotifierQueue on multi-threaded Click problem.
Eddie Kohler
kohler at cs.ucla.edu
Fri Sep 14 01:23:25 EDT 2007
Click is lucky to have you!
There were a couple other cases of the same bug that your fix didn't catch; I
checked in a broader fix. Does it work?
E
Joonwoo Park wrote:
> Well..
> It's hard to agree with slowing down code.
> BTW Do I need something like this?
> Recently move_thread can make another thread's sleeping.
>
> diff --git a/lib/task.cc b/lib/task.cc
> index 197859c..a087b3b 100644
> --- a/lib/task.cc
> +++ b/lib/task.cc
> @@ -383,8 +383,8 @@ Task::move_thread(int thread_id)
> if (old_thread->thread_id() != _home_thread_id
> && old_thread->thread_id() != RouterThread::THREAD_STRONG_UNSCHEDULE
> ) {
> if (scheduled()) {
> - _should_be_scheduled = true;
> fast_unschedule();
> + _should_be_scheduled = true;
> }
> _thread = _router->master()->thread(_home_thread_id);
> old_thread->unlock_tasks();
>
> Joonwoo Park
>
> 2007/9/14, Eddie Kohler <kohler at cs.ucla.edu>:
>> AARGH!
>>
>> Well, thanks as usual for the bug report.
>>
>> I think though that this is one of those cases where your lock does not
>> actually fix the problem, it just makes the race condition much less
>> likely by slowing down the code.
>>
>> I have attempted to fix the problem in a different way. The commit
>> message has more. Let me know how it does....
>> Eddie
>>
>>
>>
>>
>> Joonwoo Park wrote:
>>> Unfortunately, A few hours ago I found that my test machine has gone
>>> into hibernation.
>>> It had worked for about 3 ~ 4 days maybe.
>>> And I found that I can reproduce the problem in short time with
>>> SLEEPINESS_TRIGGER=1.
>>> IMHO, I think process_pending() still has problem.
>>> When another thread's Task::add_pending see 'if (!_pending_nextptr)'
>>> as FALSE, at the same time if we set it to 0, thread->add_pending()
>>> cannot be called again (only if _pending_reschedule is FALSE).
>>> As a result, It's seems that a routerthread go sleep forever.
>>> I tried patch and It works fine although SLEEPINESS_TRIGGER=1. (I have
>>> never seen before working SLEEPINESS_TRIGGER=1)
>>> How about this?
>>>
>>> -
>>> Signed-off-by: Joonwoo Park <joonwpark81 at gmail.com>
>>>
>>> lib/master.cc | 8 ++++++++
>>> 1 files changed, 8 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/lib/master.cc b/lib/master.cc
>>> index 5a56371..cef62be 100644
>>> --- a/lib/master.cc
>>> +++ b/lib/master.cc
>>> @@ -370,7 +370,15 @@ Master::process_pending(RouterThread *thread)
>>> // process the list
>>> while (Task *t = Task::pending_to_task(my_pending)) {
>>> my_pending = t->_pending_nextptr;
>>> + // when another thread's Task::add_pending see 'if (!_pending_nextptr)'
>>> + // as FALSE, at the same time if we set it to 0, thread->add_pending()
>>> + // cannot be called again (only if _pending_reschedule is 0).
>>> + bool lock = !t->_pending_reschedule;
>>> + if (lock)
>>> + flags = _task_lock.acquire();
>>> t->_pending_nextptr = 0;
>>> + if (lock)
>>> + _task_lock.release(flags);
>>> t->process_pending(thread);
>>> }
>>> }
>>> -
>>>
>>> Joonwoo Park
More information about the click
mailing list