[Click] Conversion of threads

Bobby Longpocket bobbylongpocket at yahoo.com
Thu Feb 24 20:03:08 EST 2011


It may not be *the* bug, but if you hit it, it will produce *the* problem. :)

I think the fix will work.
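
For anyone reading this in the archive: here is a minimal, self-contained
model of the race I describe in the quoted message below.  It is my own
simplification, not Click's actual ThreadSafeQueue source; RingModel,
next_i, and SIZE are illustrative names, and write/publish ordering for
the stored element is deliberately ignored so the index race stands out.

    #include <atomic>
    #include <cstdint>

    // Simplified lock-free ring-buffer push.  _head is where the next
    // pull reads; _tail is where the next push writes; the ring holds
    // at most SIZE-1 elements.  SIZE must be a power of two.
    template <typename T, uint32_t SIZE>
    struct RingModel {
        T _buf[SIZE];
        std::atomic<uint32_t> _head;
        std::atomic<uint32_t> _tail;
        RingModel() : _head(0), _tail(0) {}

        static uint32_t next_i(uint32_t i) { return (i + 1) & (SIZE - 1); }

        // BUGGY, per the analysis below: head and tail are read
        // independently, so both may be stale by the time the CAS runs.
        bool push(const T &x) {
            uint32_t h = _head.load();        // (1) read head
            uint32_t t = _tail.load();        // (2) read tail
            uint32_t nt = next_i(t);
            if (nt == h)                      // (3) full-check uses stale h
                return false;
            // (4) If the queue churns all the way around so that _tail
            // wraps back to t (e.g. h==3,t==1 ... later h==2,t==1), this
            // CAS still succeeds, tail becomes equal to the current
            // head, and a full queue suddenly reads as empty.
            if (!_tail.compare_exchange_strong(t, nt))
                return false;                 // lost the race; real code retries
            _buf[t] = x;                      // may overwrite an unconsumed slot
            return true;
        }
    };

The window is exactly an ABA problem on the tail index: the CAS only
proves that _tail still equals the value we read earlier, not that the
head we checked against is still current.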



--- On Thu, 2/24/11, Eddie Kohler <kohler at cs.ucla.edu> wrote:

> From: Eddie Kohler <kohler at cs.ucla.edu>
> Subject: Re: [Click] Conversion of threads
> To: "Bobby Longpocket" <bobbylongpocket at yahoo.com>
> Cc: "Cliff Frey" <cliff at meraki.com>, click at pdos.csail.mit.edu
> Date: Thursday, February 24, 2011, 4:23 PM
> Thank you very, very much for this very clear explanation.
> 
> That sure is *a* bug (not sure it is *the*).  What do you think of
> the following fix?
> 
> http://www.read.cs.ucla.edu/gitweb?p=click;a=commitdiff;h=0cbe4e8bdfbee6efe127cdd714c145cd80e89af9;hp=968964c3e344c2898681e9644d51b20669fef6cd
> 
> Eddie
> 
> 
> On 02/24/2011 02:48 PM, Bobby Longpocket wrote:
> > The bug is in ThreadSafeQueue.
> >
> > You can't treat head and tail independently in the push/pull
> > operations. The result of the current code is that you can fill the
> > queue beyond capacity (turning it into an 'empty' queue as a result).
> >
> > This happens because a thread can be using a stale head and tail, and
> > this can go unnoticed if the current value of tail (in push) or head
> > (in pull) has wrapped around to the original value.
> >
> > For instance, suppose a thread gets into push with _head==3 and
> > _tail==1 (one slot left), then gets delayed for a while...
> >
> > ...meanwhile, other pushers and pullers work on the queue...
> > head=3;tail=2...head=50;tail=49...head=2;tail=1...
> >
> > ...now our thread gets back to work.  _tail/_xtail is 1 again, so we
> > pass the compare-and-swap and we'll do our push with next-tail==2
> > even though h==2.
> >
> > --- On Thu, 2/24/11, Eddie Kohler <kohler at cs.ucla.edu> wrote:
> >
> >> From: Eddie Kohler <kohler at cs.ucla.edu>
> >> Subject: Re: [Click] Conversion of threads
> >> To: "Cliff Frey" <cliff at meraki.com>
> >> Cc: click at pdos.csail.mit.edu
> >> Date: Thursday, February 24, 2011, 10:30 AM
> >> Hey
> >>
> >> I thought I found a bug in ThreadSafeQueue, but convinced myself it
> >> is OK.  There could be problems in the threading core (which has
> >> been changed for core performance work).  However, I do not observe
> >> Click getting stuck.  Rather, I observe it performing radically
> >> slowly.  The attached config prints counter information, etc. every
> >> second.  The counters keep going up.  Although the InfiniteSources
> >> often look unscheduled, this is because they *are* unscheduled much
> >> of the time; the fast_reschedule() happens at the end of
> >> InfiniteSource::run_task().  I don't know what's going on.
> >> Commenting out all notification does not help.
> >>
> >> Eddie
> >>
> >> On 2/24/11 8:12 AM, Cliff Frey wrote:
> >>> I wanted to benchmark the difference between ThreadSafeQueue and
> >>> one-queue-per-thread + RoundRobinScheduler + Unqueue +
> >>> another-Queue.  However, I ended up finding a bug.
> >>>
> >>> This config:
> >>>
> >>> elementclass Foo {
> >>>     is :: InfiniteSource -> [0] output;
> >>> };
> >>> f0::Foo, f1::Foo, f2::Foo, f3::Foo, f4::Foo, f5::Foo
> >>> -> master_q :: ThreadSafeQueue
> >>> -> uq2 :: Unqueue
> >>> -> c :: Counter(COUNT_CALL 4000000 stop)
> >>> -> Discard;
> >>> StaticThreadSched(
> >>>     f0/is 0,
> >>>     f1/is 1,
> >>>     f2/is 2,
> >>>     f3/is 3,
> >>>     f4/is 4,
> >>>     f5/is 5,
> >>>     uq2 7);
> >>>
> >>> When run with 'click --threads=8 foo.click' it ends up getting
> >>> stuck, with an empty master_q and none of the InfiniteSource
> >>> elements scheduled.  I'm running on an i7 CPU with 4 cores +
> >>> hyperthreading, giving 8 CPUs from Linux's perspective.
> >>>
> >>> Also interesting is how many drops are seen at the queue in this
> >>> case (not too surprising, really):
> >>>
> >>> read f0/is.scheduled
> >>> false
> >>> read master_q.length
> >>> 0
> >>> read master_q.drops
> >>> 163540
> >>> read c.count
> >>> 231797
> >>>
> >>> Cliff
> >>>
> >>> On Thu, Feb 24, 2011 at 7:27 AM, Eddie Kohler <kohler at cs.ucla.edu> wrote:
> >>>
> >>>     Try ThreadSafeQueue at the converge point, which as Bobby
> >>>     points out should be correct even when multithreaded, but
> >>>     doesn't require locks.  However, it will still be slowish, as
> >>>     would be anything that tried to enforce strict order, since
> >>>     the queue pointers will bounce among cores.
> >>>
> >>>     Eddie
> >>>
> >>>     On 2/23/11 9:36 PM, Philip Prindeville wrote:
> >>>     > I was wondering... if I'm running multiple duplicate paths
> >>>     > each on its own core and I eventually need to collect them
> >>>     > up so I can resequence packets and pass them up into
> >>>     > user-space... how feasible is it to go from threaded and
> >>>     > lockless to converging into a single locked queue (fan-in)?
> >>>     >
> >>>     > I figure it would be the best of both worlds because most
> >>>     > of the operations could happen threaded and without
> >>>     > locking, and the locking only needs to be enforced in the
> >>>     > final stage...
> >>>     >
> >>>     > Thanks.
> >>>     >
> >>>     > -Philip
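
P.S. For Philip's original fan-in question above: the locking can be
avoided entirely if each upstream thread gets its own queue and a single
consumer drains them in turn, which is the one-queue-per-thread +
RoundRobinScheduler arrangement Cliff mentions.  A rough sketch of that
shape outside Click, with illustrative names (SpscRing, drain) and a
uint64_t payload standing in for packets:

    #include <atomic>
    #include <cstdint>
    #include <vector>

    // One single-producer/single-consumer ring per upstream thread.
    // No locks or CAS: each index is written by exactly one side.
    struct SpscRing {
        static const uint32_t SIZE = 1024;       // power of two
        uint64_t slot[SIZE];
        std::atomic<uint32_t> head{0};           // advanced by the consumer
        std::atomic<uint32_t> tail{0};           // advanced by the producer

        bool push(uint64_t v) {                  // one producer thread only
            uint32_t t = tail.load(std::memory_order_relaxed);
            uint32_t n = (t + 1) & (SIZE - 1);
            if (n == head.load(std::memory_order_acquire))
                return false;                    // full: drop or retry upstream
            slot[t] = v;
            tail.store(n, std::memory_order_release);
            return true;
        }
        bool pull(uint64_t &v) {                 // the one consumer thread only
            uint32_t h = head.load(std::memory_order_relaxed);
            if (h == tail.load(std::memory_order_acquire))
                return false;                    // empty
            v = slot[h];
            head.store((h + 1) & (SIZE - 1), std::memory_order_release);
            return true;
        }
    };

    // Single consumer round-robins over the per-thread rings; this is
    // where resequencing / hand-off to user-space would happen.
    void drain(std::vector<SpscRing> &rings) {
        for (;;)
            for (SpscRing &r : rings) {
                uint64_t v;
                while (r.pull(v)) {
                    (void) v;                    // resequence and pass up here
                }
            }
    }

Size the vector once up front (e.g. std::vector<SpscRing> rings(6), one
per source thread); the rings can't be moved while threads hold pointers
into them.  Eddie's cost argument still applies, though: head and tail
bounce between the producer's and consumer's caches either way.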


      


