[Click] NotifierQueue on multi-threaded Click problem.
Eddie Kohler
kohler at cs.ucla.edu
Fri Sep 7 16:44:14 EDT 2007
Great news! OK, the version with my patch is checked in. Until you let me
know further, I will assume that the issues with notification in SMP CLick
configurations are fixed. :)
Thank you very, very, very much for your testing help and engagement. I
really appreciate it.
Eddie
Joonwoo Park wrote:
> Yep.
>
> 2007/9/8, Eddie Kohler <kohler at cs.ucla.edu>:
>> OK, too bad.
>>
>> I still doubt that replacing _head and _tail with atomic_uint32_t would fix
>> the underlying problem. The atomic_uint32_t, in kernel, is read with
>> atomic_read() and written with atomic_set(). Here's how they are implemented
>> on x86 (LINUX/include/asm-i386/atomic.h):
>>
>> typedef struct { volatile int counter; } atomic_t;
>> #define atomic_read(v) ((v)->counter)
>> #define atomic_set(v,i) (((v)->counter) = (i))
>>
>> I.e., just with reads and assignments! Once we declare the _head and _tail
>> variables as volatile, it's the same as atomic_uint32_t for reads and
>> assignments. (Atomic_uint32_t DOES make a difference for operations like +=,
>> -=, &=, dec_and_test; but _head and _tail are never touched that way.)
>>
>> I am happy to be proved wrong, as usual; can you explain in more detail why
>> atomic operations would help? I am looking for a sample list of interleaved
>> instructions that would cause an error.
>
> Um. please understand the reason atomic_t worked to me as alternative
> to volatile.
> Being stupid, I didn't think that _tail and _head just do load & store
> that can be run atomically.
> BTW I'm running i386 so these load & store operations always run
> atomically as like you said, but for other arch I think that atomic
> type value would be needed.
>
>> -*-
>>
>> But: There are some remaining places where I can see potential race conditions
>> involving notification. For example:
>>
>> Initial state: _head = 0, _tail = 2, _capacity = 1000
>> Thread 1 push(): int h = _head, t = _tail, nt = next_i(t);
>> => int h = 0, t = 2, nt = 3;
>> => Thread 1 sleeps
>> Thread 2 pull(): (executes once)
>> => _head = 1, _tail = 2
>> Thread 2 pull(): (executes once)
>> => _head = 2, _tail = 2
>> Thread 2 pull(): (executes, returns null, _sleepiness++)
>> Thread 2 pull(): => _empty_note.sleep()
>> Thread 1 push(): => wakes up
>> => executes "if (s == 1) _empty_note.wake();"
>> => But s == 3!
>> => So notifier is not woken, the bug happens.
>>
>> I have attached a diff to address these problems. Does this do anything?
>>
>> Eddie
>>
>>
>>
>
> I can say that your patch has been working 10 hours!!, even though you
> posted it just 10 mins before.
> Because I have been testing patch similar to yours.
>
> I have tried with
> if (s == 1 || s == 2 || s ==3 )
> _empty_note.wake();
>
> And clear git src + your patch is working nicely too!!
>
> Joonwoo Park
More information about the click
mailing list