[Click] Userlevel performance issues

Robert Ross rross at dsci.com
Thu Feb 7 11:24:21 EST 2008


This seems to be the opposite of our observations, though.  In effect,
our FromDump appeared to dominate processing at the cost of ToDevice
performance.  The FromDump element always introduced traffic at the
desired rate and timing; ToDevice, however, dropped to a very low pull
rate, so the queues immediately filled up and dropped packets.
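
For reference, a minimal sketch of the kind of user-level pipeline we are
describing (the trace name, device, queue size, and LinkUnqueue arguments
are placeholders, not our exact configuration):

    // Replay a trace with its recorded timing, rate-limit it, send it out eth0.
    FromDump("trace.pcap", TIMING true)
      -> q :: Queue(1000)
      -> LinkUnqueue(1ms, 8Mbps)
      -> ToDevice(eth0);

    // Reading q.drops while this runs shows the queue overflowing once
    // ToDevice's pull rate falls below the trace's packet rate.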

On Thu, 2008-02-07 at 10:35, Eddie Kohler wrote:

> It would have affected kernel mode as well.  But the problem was not strictly
> a PERFORMANCE problem: Click still performed well for them, it just took 100%
> CPU.  Your FromDump -> ... might have been affected because FromDump can use
> timers in some configs, and thus would take lower priority than busy waiting.
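> 
> For example (a sketch, with a placeholder trace name):
> 
>     // Paced by the trace's timestamps -- timer-driven, so it can lose out
>     // to busy-waiting tasks.
>     FromDump("trace.pcap", TIMING true) -> ...
> 
>     // As fast as possible -- scheduled as an ordinary task, no timers.
>     FromDump("trace.pcap", TIMING false) -> ...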
> 
> Eddie
> 
> 
> Robert Ross wrote:
> > Do you know if this problem would affect kernel-mode performance as 
> > well, or was this isolated to userlevel only?
> > 
> > On Tue, 2008-02-05 at 17:41, Eddie Kohler wrote:
> >> Hi Robert,
> >>
> >> I wonder if your observed weirdness with LinkUnqueue was due to the 
> >> 100%-CPU-on-DelayUnqueue problem recently reported.  Maybe if you tried the 
> >> configuration now?
> >>
> >> Eddie
> >>
> >>
> >> Robert Ross wrote:
> >> > I'm not sure what this means, but we have been able to completely avoid
> >> > this problem by using kernel-level Click with the experimental
> >> > FromUserDevice, and a user-level Click reading FromDump and pushing
> >> > packets out on a custom ToRawFile element.
> >> > 
> >> > I will gladly put together and test a simple configuration.  It would be
> >> > identical to the configuration I had attached except for switching the
> >> > Socket() to a FromDump().  I will run some more tests and send you the
> >> > monitor.csv output from our script elements.  
> >> > 
> >> > BTW, we used the monitor.csv output file in tandem with the Java-based
> >> > LiveGraph to see real-time statistics on Click performance.  You can
> >> > also use LiveGraph after the fact to open and view our monitor.csv
> >> > file on your end once I send you the output.  It has been a very nice
> >> > marriage of capabilities for real-time analysis with minimal coding.
> >> > We've done something similar in kernel-level Click, but had to write a
> >> > custom Java application to output the monitor.csv, since kernel
> >> > configurations cannot write directly to files.
> >> > 
> >> > 
> >> > Robert Ross
> >> > DSCI Inc.
> >> > Office: 732.542.3113 x173
> >> > Home: 609.702.8114
> >> > Cell: 609.509.5139
> >> > Fax: 253.550.6198
> >> > 
> >> > -----Original Message-----
> >> > From: Eddie Kohler [mailto:kohler at cs.ucla.edu] 
> >> > Sent: Tuesday, January 29, 2008 2:39 PM
> >> > To: Robert Ross
> >> > Cc: Beyers Cronje; click at amsterdam.lcs.mit.edu
> >> > Subject: Re: [Click] Userlevel performance issues
> >> > 
> >> > Hi Robert,
> >> > 
> >> > The *job* of LinkUnqueue is specifically to throttle performance.  It is
> >> > designed to output packets at the bandwidth specified.  This will cause
> >> > a lower rate, pinned to that bandwidth!
> >> > 
> >> > The numbers you report are kind of reasonable.  Click parses bandwidths
> >> > as powers of 10, which is the networking standard as far as I can tell.
> >> > So 512Kbps = 512000bps = 64000Bps; 190p/s at this rate implies 336B
> >> > packets.  So 1360p/s, for your highest bandwidth LinkUnqueue, assuming
> >> > the same packet length, is roughly half what it "should" be.  That's not
> >> > great, but it's not terrible.
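> >> > 
> >> > Spelled out with those same numbers:
> >> > 
> >> >     512 Kbps = 512,000 bps = 64,000 Bps;  64,000 / 190 p/s  ~= 336 B/packet
> >> >       8 Mbps = 8,000,000 bps = 1,000,000 Bps;  1,000,000 / 336 B  ~= 2976 p/s
> >> >     observed 1360 p/s / expected ~2976 p/s  ~= 0.46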
> >> > 
> >> > I have not run your configuration with Sockets, but I have with
> >> > InfiniteSources, and so forth, and have observed LinkUnqueue outputting
> >> > packets at the correct rate.  In fact I checked in an update to Counter,
> >> > to give it bit_rate and byte_rate handlers, making this easier to see.
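> >> > 
> >> > Something along these lines, where the rate and packet length are just
> >> > examples:
> >> > 
> >> >     InfiniteSource(LENGTH 336)
> >> >       -> Queue(1000)
> >> >       -> LinkUnqueue(1ms, 512Kbps)
> >> >       -> c :: Counter
> >> >       -> Discard;
> >> > 
> >> >     // "read c.bit_rate" should hover around the configured 512Kbps.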
> >> > 
> >> > LinkUnqueue should affect the upstream Socket elements only indirectly. 
> >> > LinkUnqueue stops pulling from its input when the emulated link is full.
> >> > This will cause an upstream Queue to fill up.  Some elements might
> >> > notice that Queue's full state and stop producing packets (since those
> >> > packets will only be dropped).  The InfiniteSource and user-level
> >> > FromHost elements have this behavior.  However, your use of
> >> > NotifierQueue (instead of Queue) would neutralize this effect, since
> >> > NotifierQueue doesn't provide full notification.
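> >> > 
> >> > In sketch form (the source and parameters are placeholders):
> >> > 
> >> >     // NotifierQueue: no full notification, so the source keeps producing
> >> >     // even while LinkUnqueue has stopped pulling.
> >> >     src -> NotifierQueue(1000) -> LinkUnqueue(1ms, 512Kbps) -> ...
> >> > 
> >> >     // Queue: provides full notification, so a notification-aware source
> >> >     // such as InfiniteSource pauses instead of feeding a full queue.
> >> >     src -> Queue(1000) -> LinkUnqueue(1ms, 512Kbps) -> ...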
> >> > 
> >> > I am unsure in the end whether you are observing a bug or correct
> >> > behavior. 
> >> > Here are a couple of questions to help us figure it out.
> >> > 
> >> > - Re: FromDump and ToDevice.  Can you reduce the configuration as much
> >> > as possible, and tell us what rates ToDevice achieves without FromDump,
> >> > and what it achieves with FromDump?  Your mail isn't specific about the
> >> > configuration or the performance numbers.
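> >> > 
> >> > For example, two stripped-down configs along these lines (device and
> >> > trace names are placeholders):
> >> > 
> >> >     // Without FromDump (baseline):
> >> >     InfiniteSource -> Queue(1000) -> ToDevice(eth0);
> >> > 
> >> >     // With FromDump as the source:
> >> >     FromDump("trace.pcap", TIMING true) -> Queue(1000) -> ToDevice(eth0);
> >> > 
> >> >     // and compare how quickly ToDevice drains the Queue in each case.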
> >> > 
> >> > - Re: LinkUnqueue.  Can you send the output of your configuration (cool
> >> > use of define and Script btw), as well as the configuration?  Again,
> >> > with InfiniteSource I see expected behavior, and I would not expect
> >> > LinkUnqueue to throttle Socket.
> >> > 
> >> > It may be that you are finding an unfortunate interaction between
> >> > Click's task handlers and its file descriptor handlers -- something we
> >> > could potentially fix.  But without specific numbers it's hard to tell.
> >> > 
> >> > Eddie
> >> > 
> >> > 
> >> > Robert Ross wrote:
> >> >> The only clear item that seems to have a marked difference is the 
> >> >> LinkUnqueue element.  The fact that our ToDevice and FromDevice/Socket
> >> >> performance appears to be related somehow to the configuration of a
> >> >> LinkUnqueue element sitting in the middle of our configuration is too 
> >> >> obvious to ignore.  Does LinkUnqueue perform some kind of 
> >> >> upstream/downstream notification to these elements, causing them to 
> >> >> throttle their behavior based on LinkUnqueue?
> >> >>  
> >> >> In our tests, with all other elements remaining the same, here is what
> >> >> we found from two independent read handler counts:
> >> >>  
> >> >> LinkUnqueue("512Kbps") = Maximum ~190 packets/second pushed from the 
> >> >> Socket element and pulled by the ToDevice element
> >> >> LinkUnqueue("1Mbps") = Maxmum ~290 packets/second pushed from the 
> >> >> Socket element and pulled by the ToDevice element
> >> >> LinkUnqueue("2Mbps") = Maximum ~490 packets/second pushed from the 
> >> >> Socket element and pulled by the ToDevice element
> >> >> LinkUnqueue("4Mbps") = Maximum ~780 packets/second pushed from the 
> >> >> Socket element and pulled by the ToDevice element
> >> >> LinkUnqueue("6Mbps") = Maximum ~980 packets/second pushed from the 
> >> >> Socket element and pulled by the ToDevice element
> >> >> LinkUnqueue("8Mbps") = Maximum ~1360 packets/second pushed from the 
> >> >> Socket element and pulled by the ToDevice element
> >> >>  
> >> >> It is also telling that independent handler counters corroborate
> >> >> exactly the same maximum packets per second in two very different 
> >> >> places in the configuration.  Clearly you can see that the limitation 
> >> >> on processing is completely artificial and not an actual performance 
> >> >> problem, since increasing the LinkUnqueue bandwidth increases throughput in a
> >> >> very controlled and obvious manner.
> >> >>  
> >> >> I have attached a simple configuration that examines specific handlers
> >> >> and outputs values each second to a CSV file for analysis.  The
> >> >> configuration is scaled back to complete simplicity, yet shows the same
> >> >> performance as our actual, much more complicated configuration.
> >> >> Nevertheless, the performance is identical and seems to point squarely
> >> >> at LinkUnqueue.
> >> >>  
> >> >> What is LinkUnqueue doing that could be causing this type of effect on
> >> >> FromHost, Socket and ToDevice?
> >> >>
> >> >>
> >> >> ________________________________
> >> >>
> >> >> From: Robert Ross
> >> >> Sent: Friday, January 25, 2008 7:40 PM
> >> >> To: 'Beyers Cronje'
> >> >> Cc: click at pdos.csail.mit.edu
> >> >> Subject: RE: [Click] Userlevel performance issues
> >> >>
> >> >>
> >> >> Sorry, I wasn't clear that the queues are necessary for our 
> >> >> configuration.  The configuration is somewhat complex.  I was only 
> >> >> attempting to highlight the important parts.
> >> >>  
> >> >>  
> >> >>
> >> >>
> >> >> ________________________________
> >> >>
> >> >> From: Beyers Cronje [mailto:bcronje at gmail.com]
> >> >> Sent: Friday, January 25, 2008 7:31 PM
> >> >> To: Robert Ross
> >> >> Cc: click at pdos.csail.mit.edu
> >> >> Subject: Re: [Click] Userlevel performance issues
> >> >>
> >> >>
> >> >> Hi Robert,
> >> >>
> >> >>
> >> >>  
> >> >>
> >> >> 	*       We first found that when UserLevel Click started pulling from a
> >> >> 	PCAP file, the performance of the ToDevice() appeared to drop sharply.
> >> >> 	What I mean by this is that the ToDevice() pull handler reported values
> >> >> 	in the range of 200 packets/second once the PCAP file started reading.
> >> >> 	This resulted in the outbound queue just prior to the ToDevice() filling
> >> >> 	up and eventually overflowing because the packet rate in the PCAP file
> >> >> 	is far more than 200 packets/second.
> >> >>
> >> >>
> >> >> You don't have to use a queue between FromDump and ToDevice, as FromDump
> >> >> is an agnostic element.  In other words you can connect ToDevice
> >> >> directly to FromDump, which should ensure that at least no packets are
> >> >> dropped, and you should see the best ToDevice performance.
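> >> >>
> >> >> Something like this (trace and device names are placeholders):
> >> >>
> >> >>     // No intermediate queue: ToDevice pulls straight from FromDump,
> >> >>     // so nothing can overflow and be dropped inside Click.
> >> >>     FromDump("trace.pcap") -> ToDevice(eth0);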
> >> >>
> >> >> Also, there are a few tuning parameters.  Try tuning your NIC TX ring
> >> >> size.  On the e1000 driver the default TX ring size is 256; experiment
> >> >> with different values to see if it makes a difference.  ToDevice uses a
> >> >> packet socket for transmit, so it might also be worth experimenting with
> >> >> /proc/sys/net/core/wmem_default and /proc/sys/net/core/wmem_max.
> >> >>
> >> >>
> >> >> Beyers
> >> >>
> >> >>
> >> >>
> >> >> ------------------------------------------------------------------------
> >> >>
> >> >> _______________________________________________
> >> >> click mailing list
> >> >> click at amsterdam.lcs.mit.edu
> >> >> https://amsterdam.lcs.mit.edu/mailman/listinfo/click

