[Click] Spinlock Problems
Eddie Kohler
kohler at cs.ucla.edu
Sun Aug 24 12:02:13 EDT 2008
I think that there, strictly speaking, "foo" WOULD need to be declared
volatile. But I think you are right anyway. swap() and compare_and_swap()
gain "memory" constraints.
Eddie
Cliff Frey wrote:
> I really actually think that the "memory" is necessary... or some other
> related thing... otherwise code like
>
> int foo;
> void myfunc() {
> lock()
> foo++;
> unlock()
> lock()
> foo++;
> unlock()
> }
>
> ends up being totally broken... and I believe that there is nothing
> wrong with that code... or do you think that foo needs to be declared
> volatile? But that has its own downsides...
>
>
> here is my real example code... no.. there is no good reason for the
> switch to c++
>
> static inline int the_func(int x, int* foo) {
> asm volatile ("xchgl %0,%1"
> : "=r" (x), "=m" (foo)
> : "0" (x), "m" (foo));
> return x;
> }
>
> class Aoeu {
> int foo;
> //volatile int
> bar;
> Aoeu();
> };
>
> Aoeu::Aoeu() {
> the_func(2, &lock);
> foo++;
> the_func(2, &lock);
>
> the_func(2, &lock);
> foo++;
> the_func(2, &lock);
> }
>
> On Sat, Aug 23, 2008 at 7:50 AM, Eddie Kohler <kohler at cs.ucla.edu
> <mailto:kohler at cs.ucla.edu>> wrote:
>
> Yeah! I have two contrasting thoughts; 1) swap() should do exactly
> what it says and nothing more, one builds robustness from correctly
> and simply specified parts, and 2) you're right, "memory" is
> strictly better. Grr. I'll keep this in my inbox.
>
> Eddie
>
>
> Cliff Frey wrote:
>
> I wonder if the extra "memory" constraint gives you a free
> memory-barrier in some ways? As in, it forces gcc not to
> reorder anything... but I guess maybe the volatile should do
> that for you... but in general I guess the memory might protect
> you from bad code.
>
> example:
>
> static inline int the_func(int x, int* foo) {
> asm volatile ("xchgl %0,%1"
> : "=r" (x), "=m" (foo)
> : "0" (x), "m" (foo));
> return x;
> }
>
> int foo = 1;
> int lock = 1;
> volatile int bar = 0;
> void main() {
> if (foo > 100)
> bar++;
> the_func(2, &lock);
> foo++;
> }
>
>
> if you compile with gcc -O3, the value of foo is cached (and if
> the_func was a lock acquire, it would not have protected you).
> If you add the "memory" constraint, then it does re-read foo as
> part of the increment.
>
> In any case, I can still see making the call either way... but
> the "memory" one does seem slightly safer to me.
>
> Cliff
>
> On Fri, Aug 22, 2008 at 11:29 PM, Eddie Kohler
> <kohler at cs.ucla.edu <mailto:kohler at cs.ucla.edu>
> <mailto:kohler at cs.ucla.edu <mailto:kohler at cs.ucla.edu>>> wrote:
>
> Kevin,
>
> Thanks so much for this bug report!! Apologies, this must have
> sucked to find.
>
>
> - Re: swap() is not atomic: It looks like the inline assembly was
> bad. Mea
> culpa. I had:
> asm volatile ("xchgl %0,%1"
> : "=r" (x), "=m" (CLICK_ATOMIC_VAL));
> But this doesn't tell the compiler to put "x" in the relevant
> register BEFORE
> the instruction. Here's what I have now:
> asm volatile ("xchgl %0,%1"
> : "=r" (x), "=m" (CLICK_ATOMIC_VAL)
> : "0" (x), "m" (CLICK_ATOMIC_VAL));
> Linux goes even further; it says:
> __asm__ __volatile__("xchgl %0,%1"
> :"=r" (x)
> :"m" (*__xg(ptr)), "0" (x)
> :"memory");
> but I don't think the extra "memory" constraint is necessary, as
> long as we
> say that the memory has been modified. Boost appears to agree.
>
> - Re: click_current_processor(). Yes, this is a problem. I
> believe
> that the
> Linux style fix would be to use macros get_cpu() and put_cpu(),
> which turn off
> preemption in between. That way a CPU number remains constant
> while, for
> example, a Spinlock is held. As a bonus Click won't get
> preempted
> while it
> holds a Click Spinlock. My guess is that this is what we
> want; but,
> on the
> other hand, it may be that Click Spinlocks CAN'T
>
>
> I've checked in patches for both these fixes, but these fixes ARE
> NOT TESTED,
> as I don't have immediate access to an SMP box. Any feedback
> would
> be greatly
> appreciated.
>
> Eddie
>
>
> springbo at cs.wisc.edu <mailto:springbo at cs.wisc.edu>
> <mailto:springbo at cs.wisc.edu <mailto:springbo at cs.wisc.edu>> wrote:
> > Hi,
> >
> > I think there is a concurrency issue with Spinlocks in
> linuxmodule
> > multi-threaded click (running a 2.6.19.2 <http://2.6.19.2>
> <http://2.6.19.2>
>
> patched kernel, the e1000-NAPI
> > driver and today's trunk). I've put together a temporary
> patch,
> but there
> > might be further issues. Sorry, I don't have the time to
> investigate more.
> >
> >
> > Problems:
> > The series of events are:
> > -Thread A is running on CPU 0 and thread B is running on
> CPU 1.
> > -A acquires the Spinlock and executes code in the critical
> section.
> > -Linux schedules Thread A to run on CPU 1 and thread B to
> run on
> CPU 0.
> > -B acquires the Spinlock (works because CPU 0 is the owner
> of the
> lock,
> > not the thread)
> > -A and B are both operating in the critical section
> > -(if you're lucky) A releases the Spinlock and generates a
> "releasing
> > someone else's lock" message
> > -(if you're unlucky) Oops
> >
> > Attached is a test element and a configuration to elicit the
> behavior. I
> > could only replicate the problem when using a polling input
> source under
> > heavy load.
> >
> > After fixing the above problem I was still seeing two threads
> inside the
> > CS. For some reason the atomic_uint32_t::swap was not working
> atomically
> > as expected. Changing this to atomic_uint32::compare_and_swap
> solved the
> > problem. We're running x86_64 quad core Xeon CPUs.
> >
> >
> > Solution:
> > In the patch attached I added a new function
> click_current_thread() which
> > returns a thread_info pointer. Inside of Spinlock I replace
> > click_current_processor() uses with
> click_current_thread(). This
> way even
> > if the thread is assigned to another core it will still
> have the same
> > thread_info pointer.
> >
> > The problem with this solution is that it does not work as an
> index into
> > an array nicely and cannot simply replace all uses of
> > click_current_processor(). There may be other places in
> the code
> which
> > make the same incorrect use of click_current_processor(),
> but a
> better
> > approach will be needed if the value is used to index into an
> array (such
> > as in ReadWriteLock).
> >
> >
> > Thanks!
> > Kevin Springborn
> >
> >
> >
>
> ------------------------------------------------------------------------
> >
> > _______________________________________________
> > click mailing list
> > click at amsterdam.lcs.mit.edu
> <mailto:click at amsterdam.lcs.mit.edu>
> <mailto:click at amsterdam.lcs.mit.edu
> <mailto:click at amsterdam.lcs.mit.edu>>
>
> > https://amsterdam.lcs.mit.edu/mailman/listinfo/click
>
> _______________________________________________
> click mailing list
> click at amsterdam.lcs.mit.edu
> <mailto:click at amsterdam.lcs.mit.edu>
> <mailto:click at amsterdam.lcs.mit.edu
> <mailto:click at amsterdam.lcs.mit.edu>>
>
> https://amsterdam.lcs.mit.edu/mailman/listinfo/click
>
>
>
More information about the click
mailing list