[Click] Conversion of threads
Eddie Kohler
kohler at cs.ucla.edu
Thu Feb 24 19:23:30 EST 2011
Thank you very, very much for this very clear explanation.
That sure is *a* bug (though I'm not sure it is *the* bug). What do you think
of the following fix?
http://www.read.cs.ucla.edu/gitweb?p=click;a=commitdiff;h=0cbe4e8bdfbee6efe127cdd714c145cd80e89af9;hp=968964c3e344c2898681e9644d51b20669fef6cd
Eddie
On 02/24/2011 02:48 PM, Bobby Longpocket wrote:
> The bug is in ThreadSafeQueue.
>
> You can't treat head and tail independently in the push/pull operations. The result of the current code is that you can fill the queue beyond capacity (turning it into an 'empty' queue as a result).
>
> This happens because a thread can be using a stale head and tail, and this can go unnoticed if the current value of tail (in push) or head (in pull) has wrapped around to the original value.
>
> For instance, suppose a thread gets into push with _head==3 and _tail==1 (one slot left), then gets delayed for a while...
>
> ...meanwhile, other pushers and pullers work on the queue... head=3;tail=2...head=50;tail=49...head=2;tail=1...
>
> ...now our thread gets back to work. _tail/_xtail is 1 again, so we pass the compare-and-swap and we'll do our push with next-tail==2 even though h==2.
>
>
> --- On Thu, 2/24/11, Eddie Kohler<kohler at cs.ucla.edu> wrote:
>
>> From: Eddie Kohler<kohler at cs.ucla.edu>
>> Subject: Re: [Click] Conversion of threads
>> To: "Cliff Frey"<cliff at meraki.com>
>> Cc: click at pdos.csail.mit.edu
>> Date: Thursday, February 24, 2011, 10:30 AM
>> Hey
>>
>> I thought I found a bug in ThreadSafeQueue, but convinced myself it is OK.
>> There could be problems in the threading core (which has been changed for
>> core performance work). However, I do not observe Click getting stuck.
>> Rather, I observe it performing radically slowly. The attached config prints
>> counter information, etc. every second. The counters keep going up. Although
>> the InfiniteSources often look unscheduled, this is because they *are*
>> unscheduled much of the time; the fast_reschedule() happens at the end of
>> InfiniteSource::run_task(). I don't know what's going on. Commenting out all
>> notification does not help.
>>
>> Eddie
>>
>>
>> On 2/24/11 8:12 AM, Cliff Frey wrote:
>>> I wanted to benchmark the difference between ThreadSafeQueue and
>>> one-queue-per-thread + RoundRobinScheduler + Unqueue + another-Queue.
>>> However, I ended up finding a bug.
>>>
>>> This config:
>>>
>>> elementclass Foo {
>>> is :: InfiniteSource -> [0] output;
>>> };
>>> f0::Foo,f1::Foo,f2::Foo,f3::Foo,f4::Foo,f5::Foo
>>> -> master_q :: ThreadSafeQueue
>>> -> uq2 :: Unqueue
>>> -> c :: Counter(COUNT_CALL 4000000 stop)
>>> -> Discard;
>>> StaticThreadSched(
>>> f0/is 0,
>>> f1/is 1,
>>> f2/is 2,
>>> f3/is 3,
>>> f4/is 4,
>>> f5/is 5,
>>> uq2 7);
>>>
>>> When run with 'click --threads=8 foo.click', it ends up getting stuck,
>>> with an empty master_q and none of the InfiniteSource elements scheduled.
>>> I'm running on an i7 CPU with 4 cores + hyperthreading, giving 8 CPUs
>>> from Linux's perspective.
>>>
>>> Also interesting is how many drops are seen at the queue in this case
>>> (not too surprising, really):
>>>
>>> read f0/is.scheduled
>>> false
>>> read master_q.length
>>> 0
>>> read master_q.drops
>>> 163540
>>> read c.count
>>> 231797
>>>
>>> Cliff
>>>
>>> On Thu, Feb 24, 2011 at 7:27 AM, Eddie Kohler <kohler at cs.ucla.edu
>>> <mailto:kohler at cs.ucla.edu>> wrote:
>>>
>>> Try ThreadSafeQueue at the converge point, which as Bobby points out
>>> should be correct even when multithreaded, but doesn't require locks.
>>> However, it will still be slowish, as would be anything that tried to
>>> enforce strict order, since the queue pointers will bounce among cores.
>>>
>>> Eddie
>>>
>>>
>>> On 2/23/11 9:36 PM, Philip Prindeville wrote:
>>> > I was wondering... if I'm running multiple duplicate paths each on its
>>> > own core and I eventually need to collect them up so I can resequence
>>> > packets and pass them up into user-space... how feasible is it to go
>>> > from threaded and lockless to converging into a single locked queue
>>> > (fan-in)?
>>> >
>>> > I figure it would be the best of both worlds because most of the
>>> > operations could happen threaded and without locking, and the locking
>>> > only needs to be enforced in the final stage...
>>> >
>>> > Thanks.
>>> >
>>> > -Philip
>>> >
>>> > _______________________________________________
>>> > click mailing list
>>> > click at amsterdam.lcs.mit.edu <mailto:click at amsterdam.lcs.mit.edu>
>>> > https://amsterdam.lcs.mit.edu/mailman/listinfo/click