[Click] NotifierQueue on multi-threaded Click problem.

Eddie Kohler kohler at cs.ucla.edu
Fri Sep 7 16:44:14 EDT 2007


Great news!  OK, the version with my patch is checked in.  Until you let me 
know further, I will assume that the issues with notification in SMP CLick 
configurations are fixed. :)

Thank you very, very, very much for your testing help and engagement.  I 
really appreciate it.

Eddie


Joonwoo Park wrote:
> Yep.
> 
> 2007/9/8, Eddie Kohler <kohler at cs.ucla.edu>:
>> OK, too bad.
>>
>> I still doubt that replacing _head and _tail with atomic_uint32_t would fix
>> the underlying problem.  The atomic_uint32_t, in kernel, is read with
>> atomic_read() and written with atomic_set().  Here's how they are implemented
>> on x86 (LINUX/include/asm-i386/atomic.h):
>>
>> typedef struct { volatile int counter; } atomic_t;
>> #define atomic_read(v)          ((v)->counter)
>> #define atomic_set(v,i)         (((v)->counter) = (i))
>>
>> I.e., just with reads and assignments!  Once we declare the _head and _tail
>> variables as volatile, it's the same as atomic_uint32_t for reads and
>> assignments.  (Atomic_uint32_t DOES make a difference for operations like +=,
>> -=, &=, dec_and_test; but _head and _tail are never touched that way.)
>>
>> I am happy to be proved wrong, as usual; can you explain in more detail why
>> atomic operations would help?  I am looking for a sample list of interleaved
>> instructions that would cause an error.
> 
> Um. please understand the reason atomic_t worked to me as alternative
> to volatile.
> Being stupid, I didn't think that _tail and _head just do load & store
> that can be run atomically.
> BTW I'm running i386 so these load & store operations always run
> atomically as like you said, but for other arch I think that atomic
> type value would be needed.
> 
>> -*-
>>
>> But: There are some remaining places where I can see potential race conditions
>> involving notification.  For example:
>>
>> Initial state:   _head = 0, _tail = 2, _capacity = 1000
>> Thread 1 push(): int h = _head, t = _tail, nt = next_i(t);
>>                  => int h = 0, t = 2, nt = 3;
>>                  => Thread 1 sleeps
>> Thread 2 pull(): (executes once)
>>                  => _head = 1, _tail = 2
>> Thread 2 pull(): (executes once)
>>                  => _head = 2, _tail = 2
>> Thread 2 pull(): (executes, returns null, _sleepiness++)
>> Thread 2 pull(): => _empty_note.sleep()
>> Thread 1 push(): => wakes up
>>                  => executes "if (s == 1) _empty_note.wake();"
>>                  => But s == 3!
>>                  => So notifier is not woken, the bug happens.
>>
>> I have attached a diff to address these problems.  Does this do anything?
>>
>> Eddie
>>
>>
>>
> 
> I can say that your patch has been working 10 hours!!, even though you
> posted it just 10 mins before.
> Because I have been testing patch similar to yours.
> 
> I have tried with
> if (s == 1 || s == 2 || s ==3 )
> _empty_note.wake();
> 
> And clear git src + your patch is working nicely too!!
> 
> Joonwoo Park


More information about the click mailing list