[Click] Poor forwarding performance with click in a paravirtualized Xen environment

Luigi Rizzo rizzo at iet.unipi.it
Tue Mar 19 14:29:00 EDT 2013


On Tue, Mar 19, 2013 at 05:49:56PM +0100, Richard Neumann wrote:
> Hello folks,
> 
> I have encountered the problem that click performs quite badly inside
> a paravirtualized Xen dom-U environment.
> 
> The setup is like:
> 
> [Sender] --10G-Link--> [Router] --10G-Link--> [Receiver]
> 
> The router is configured as a virtual machine as follows:

    (summary: only about a hundred packets per second when [Router] is
    a userspace click configuration, and about 530 Kpps when [Router]
    is a linux bridge. Details at the end.)

At first sight this looks like a case of receive livelock (the CPU
spends all its time servicing receive interrupts, so almost nothing is
left to actually push packets through the userspace process); it does
not hit the linux bridge as badly because the bridge is (probably)
helped by NAPI's interrupt mitigation.

It is not clear whether your userspace click is also using netmap
(which may have its own bugs), nor whether the two interfaces eth1 and
eth2 are attached via pci-passthrough or are emulated/paravirtualized.
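
For example, something like the following run inside the dom-U should
tell you which path the two NICs take (a real driver name such as
ixgbe suggests pci-passthrough, xen-netfront means the paravirtualized
path), assuming ethtool and lspci are available in the guest:

    ethtool -i eth1            # look at the "driver:" line
    ethtool -i eth2
    lspci | grep -i ethernet   # passed-through PCI functions show up here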

In any case the 530 Kpps you get with the linux bridge is probably
close to the peak performance you can expect, unless (a) Xen gives you
access to the real hardware (via virtual functions and/or
pci-passthrough), and (b) you are using netmap or some other very fast
OS stack bypass to talk to the network interfaces within the virtual
machine.
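
A quick way to check point (b) independently of click is to run
pkt-gen inside the dom-U itself, directly on the two interfaces
(untested in your setup; this needs netmap built in the guest, and the
flags are simply copied from your own sender/receiver runs). If the
raw netmap rate inside the VM is already low, click is not the
bottleneck:

    # inside the VM: receive with netmap on the interface facing the sender
    pkt-gen -i eth2
    # in a separate test, transmit on the interface facing the receiver
    pkt-gen -i eth1 -t 500111222 -l 64 -w 5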

I managed to go quite fast within qemu+kvm, see below

    http://info.iet.unipi.it/~luigi/netmap/talk-google-2013.html
    http://info.iet.unipi.it/~luigi/papers/20130206-qemu.pdf

but that needed a little bit of tweaking here and there
(not that i doubt that the Xen folks are smart enough to do
something similar).

    cheers
    luigi

> ### FILE /etc/xen/vm/opensuse12-1 ###
> name="opensuse12-1"
> description="None"
> uuid="6e2e1ffe-2faf-f3f4-bfc2-d12a76c99093"
> memory=2048
> maxmem=2048
> vcpus=1
> on_poweroff="destroy"
> on_reboot="restart"
> on_crash="destroy"
> localtime=0
> keymap="de"
> builder="linux"
> bootloader="/usr/bin/pygrub"
> bootargs=""
> extra="xencons=tty "
> disk=[ 'file:/var/lib/xen/images/opensuse12-1/xvda,xvda,w', ]
> vif=[ 'mac=00:16:3e:4f:b4:7c,bridge=br0', ]
> pci=['02:00.0', '02:00.1']
> nographic=1
> ### EOF ###
> 
> Where the two PCI devices are the 10G network cards connected to the
> sender and to the receiver, respectively.
> 
> Inside the virtual machine, I run click (v. 2.0.1) in user mode with a
> trivial forwarding configuration:
> 
> FromDevice(eth2) -> c1::AverageCounter() -> Queue(1000000) ->
> ToDevice(eth1);
> 
> where eth2 is the interface connected to the sender and eth1 is the
> interface connected to the receiver.
> 
> On the receiver I then run pkt-gen from netmap
> (http://info.iet.unipi.it/~luigi/netmap/) in receiver mode and on the
> sender I run pkt-gen in sender mode with the following results:
> 
> Sender:
> 
> /usr/src/netmap-linux/net/netmap/pkt-gen -i eth7 -t 500111222 -l 64 -w 5
> main [874] map size is 203064 Kb
> main [906] mmapping 203064 Kbytes
> Sending on eth7: 6 queues, 1 threads, 1 cpus and affinity -1.
> 10.0.0.1 -> 10.1.0.1 (00:1b:21:d5:6f:ec -> ff:ff:ff:ff:ff:ff)
> main [953] Wait 5 secs for phy reset
> main [955] Ready...
> main [1074] 14141927 pps
> [...]
> main [1074] 11500703 pps
> main [1074] 3953080 pps
> Sent 500111222 packets, 64 bytes each, in 41.34 seconds.
> Speed: 12.10Mpps. Bandwidth: 6.19Gbps.
> 
> 
> Receiver:
> 
> pkt-gen -i eth7
> main [877] map size is 203064 Kb
> main [909] mmapping 203064 Kbytes
> Receiving from eth7: 6 queues, 1 threads, 1 cpus and affinity -1.
> main [956] Wait 2 secs for phy reset
> main [958] Ready...
> main [1077] 0 pps
> receiver_body [624] waiting for initial packets, poll returns 0 0
> main [1077] 0 pps
> receiver_body [624] waiting for initial packets, poll returns 0 0
> main [1077] 0 pps
> receiver_body [624] waiting for initial packets, poll returns 0 0
> receiver_body [624] waiting for initial packets, poll returns 0 0
> main [1077] 0 pps
> main [1077] 112 pps
> [...]
> main [1077] 248 pps
> main [1077] 0 pps
> Received 4780 packets, in 41.35 seconds.
> Speed: 115.61pps.
> 
> If I use a simple linux bridge (set up with brctl) instead of click,
> I get a much higher throughput on the receiver:
> 
> pkt-gen -i eth7
> main [877] map size is 203064 Kb
> main [909] mmapping 203064 Kbytes
> Receiving from eth7: 6 queues, 1 threads, 1 cpus and affinity -1.
> main [956] Wait 2 secs for phy reset
> main [958] Ready...
> main [1077] 540069 pps
> [...]
> main [1077] 526933 pps
> main [1077] 0 pps
> Received 5318241 packets, in 10.00 seconds.
> Speed: 531.60Kpps.
> 
> 
> Does anyone know why click in my case performs so poorly compared to
> a linux bridge?
> 
> Best regards,
> 
> Richard
> 