[Click] Spinlock Problems

Eddie Kohler kohler at cs.ucla.edu
Wed Aug 27 14:21:38 EDT 2008


Kevin,

Is there any chance you'd be able to try the newer Spinlock code any time soon?

Eddie


springbo at cs.wisc.edu wrote:
> Hi,
> 
> I think there is a concurrency issue with Spinlocks in linuxmodule
> multi-threaded click (running a 2.6.19.2 patched kernel, the e1000-NAPI
> driver and today's trunk). I've put together a temporary patch, but there
> might be further issues. Sorry, I don't have the time to investigate more.
> 
> 
> Problems:
> The series of events are:
> -Thread A is running on CPU 0 and thread B is running on CPU 1.
> -A acquires the Spinlock and executes code in the critical section.
> -Linux schedules Thread A to run on CPU 1 and thread B to run on CPU 0.
> -B acquires the Spinlock (works because CPU 0 is the owner of the lock,
> not the thread)
> -A and B are both operating in the critical section
> -(if you're lucky) A releases the Spinlock and generates a “releasing
> someone else's lock” message
> -(if you're unlucky) Oops
> 
> Attached is a test element and a configuration to elicit the behavior. I
> could only replicate the problem when using a polling input source under
> heavy load.
> 
> After fixing the above problem I was still seeing two threads inside the
> CS. For some reason the atomic_uint32_t::swap was not working atomically
> as expected. Changing this to atomic_uint32::compare_and_swap solved the
> problem. We're running x86_64 quad core Xeon CPUs.
> 
> 
> Solution:
> In the patch attached I added a new function click_current_thread() which
> returns a thread_info pointer. Inside of Spinlock I replace
> click_current_processor() uses with click_current_thread(). This way even
> if the thread is assigned to another core it will still have the same
> thread_info pointer.
> 
> The problem with this solution is that it does not work as an index into
> an array nicely and cannot simply replace all uses of
> click_current_processor(). There may be other places in the code which
> make the same incorrect use of click_current_processor(), but a better
> approach will be needed if the value is used to index into an array (such
> as in ReadWriteLock).
> 
> 
> Thanks!
> Kevin Springborn
> 
> 
> ------------------------------------------------------------------------
> 
> _______________________________________________
> click mailing list
> click at amsterdam.lcs.mit.edu
> https://amsterdam.lcs.mit.edu/mailman/listinfo/click



More information about the click mailing list