[Click] NotifierQueue on multi-threaded Click problem.

Joonwoo Park joonwpark81 at gmail.com
Wed Sep 5 23:09:49 EDT 2007


Hi Eddie,

2007/9/5, Eddie Kohler <kohler at cs.ucla.edu>:
> Hi Joonwoo,
>
> Joonwoo Park wrote:
> > There are some problems.
> > The Task::process_pending() calls add_pending() which is reference master's
> > member variable internally.
> > So to call Task::process_pending(), _task_lock is need to acquired or
> > synchronized by another method.
>
> There was a problem here, but not the problem you describe.  If two or more
> threads called Master::process_pending() at the same time, the whole locking
> structure would break down.  (Duh.)
>
> >> This change is checked in.  Does it work for you?
> >
> > I think it has problems too.
> > Memorizing my test of the last month, It might _empty_note is need to
> > be synchronized.
> > IMO the puller and pusher is watching notifier race-conditionally.
>
> It seems that you are basing this on reading code rather than tests.  Can you
> confirm that there is a problem in tests?

There is a close connection between the queue and pending.
So I didn't test it. (And had no time...)
But, at this time I can confirm it has problems. (based on test, plz see below)

>
> > I tried a patch and It's doing lock-less way.
> > It passes pointers of master's member to process_pending(add_pending)
> > to prevent pollution of another thread.
> > How do you think?
>
> See above; I think this patch doesn't fix the true problem.  (The failed
> assertion was the true problem.)
>
> I have checked in a different patch, though, that is at least getting closer.
>  Comment discussion and diff here:
>
> http://www.read.cs.ucla.edu/gitweb?p=click;a=commit;h=ba637c51282065cdae499b9f5cb5320950c11271
>
> Does this work better for you?  Do you understand what it is doing?

Obviously It works better than before!!!
I prepared test bed again that was blew up last month and tested it again.
Therefore It has worked over 12hours.

But, to working, spinlock is needed to the queue. (between pull and push).
Without the mods, I got failure. The queue stops notification in some seconds.

In addition to that, I modified SLEEPINESS_TRIGGER of the queue to 2
from 9 to make more race-conditional situation.
Tantalizingly, It failed in half an hour.
As test method, is this OK? How do you think?

>
> Eddie
>

Jason Park (Joonwoo Park)


More information about the click mailing list