tent='cgit v1.2.3-54-g00ecf'/>
summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2017-02-07virtio_net: factor out xdp handler for readabilityJohn Fastabend1-51/+35
At this point the do_xdp_prog is mostly if/else branches handling the different modes of virtio_net. So remove it and handle running the program in the per mode handlers. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-07virtio_net: wrap rtnl_lock in test for calling with lock already heldJohn Fastabend1-10/+21
For XDP use case and to allow ethtool reset tests it is useful to be able to use reset paths from contexts where rtnl lock is already held. This requries updating virtnet_set_queues and free_receive_bufs the two places where rtnl_lock is taken in virtio_net. To do this we use the following pattern, _foo(...) { do stuff } foo(...) { rtnl_lock(); _foo(...); rtnl_unlock()}; this allows us to use freeze()/restore() flow from both contexts. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-06Merge branch 'bridge-improve-cache-utilization'David S. Miller11-52/+59
Nikolay Aleksandrov says: ==================== bridge: improve cache utilization This is the first set which begins to deal with the bad bridge cache access patterns. The first patch rearranges the bridge and port structs a little so the frequently (and closely) accessed members are in the same cache line. The second patch then moves the garbage collection to a workqueue trying to improve system responsiveness under load (many fdbs) and more importantly removes the need to check if the matched entry is expired in __br_fdb_get which was a major source of false-sharing. The third patch is a preparation for the final one which If properly configured, i.e. ports bound to CPUs (thus updating "updated" locally) then the bridge's HitM goes from 100% to 0%, but even without binding we get a win because previously every lookup that iterated over the hash chain caused false-sharing due to the first cache line being used for both mac/vid and used/updated fields. Some results from tests I've run: (note that these were run in good conditions for the baseline, everything ran on a single NUMA node and there were only 3 fdbs) 1. baseline 100% Load HitM on the fdbs (between everyone who has done lookups and hit one of the 3 hash chains of the communicating src/dst fdbs) Overall 5.06% Load HitM for the bridge, first place in the list 2. patched & ports bound to CPUs 0% Local load HitM, bridge is not even in the c2c report list Also there's 3% consistent improvement in netperf tests. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-06bridge: fdb: write to used and updated at most once per jiffyNikolay Aleksandrov2-2/+4
Writing once per jiffy is enough to limit the bridge's false sharing. After this change the bridge doesn't show up in the local load HitM stats. Suggested-by: David S. Miller <davem@davemloft.net> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-06bridge: move write-heavy fdb members in their own cache lineNikolay Aleksandrov1-4/+6
Fdb's used and updated fields are written to on every packet forward and packet receive respectively. Thus if we are receiving packets from a particular fdb, they'll cause false-sharing with everyone who has looked it up (even if it didn't match, since mac/vid share cache line!). The "used" field is even worse since it is updated on every packet forward to that fdb, thus the standard config where X ports use a single gateway results in 100% fdb false-sharing. Note that this patch does not prevent the last scenario, but it makes it better for other bridge participants which are not using that fdb (and are only doing lookups over it). The point is with this move we make sure that only communicating parties get the false-sharing, in a later patch we'll show how to avoid that too. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-06bridge: move to workqueue gcNikolay Aleksandrov10-23/+29
Move the fdb garbage collector to a workqueue which fires at least 10 milliseconds apart and cleans chain by chain allowing for other tasks to run in the meantime. When having thousands of fdbs the system is much more responsive. Most importantly remove the need to check if the matched entry has expired in __br_fdb_get that causes false-sharing and is completely unnecessary if we cleanup entries, at worst we'll get 10ms of traffic for that entry before it gets deleted. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-06bridge: modify bridge and port to have often accessed fields in one cache lineNikolay Aleksandrov1-23/+20
Move around net_bridge so the vlan fields are in the beginning since they're checked on every packet even if vlan filtering is disabled. For the port move flags & vlan group to the beginning, so they're in the same cache line with the port's state (both flags and state are checked on each packet). Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-06bpf: enable verifier to add 0 to packet ptrWilliam Tu2-1/+24
The patch fixes the case when adding a zero value to the packet pointer. The zero value could come from src_reg equals type BPF_K or CONST_IMM. The patch fixes both, otherwise the verifer reports the following error: [...] R0=imm0,min_value=0,max_value=0 R1=pkt(id=0,off=0,r=4) R2=pkt_end R3=fp-12 R4=imm4,min_value=4,max_value=4 R5=pkt(id=0,off=4,r=4) 269: (bf) r2 = r0 // r2 becomes imm0 270: (77) r2 >>= 3 271: (bf) r4 = r1 // r4 becomes pkt ptr 272: (0f) r4 += r2 // r4 += 0 addition of negative constant to packet pointer is not allowed Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: Mihai Budiu <mbudiu@vmware.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-06bpf: test for AND edge casesJosef Bacik1-0/+55
These two tests are based on the work done for f23cc643f9ba. The first test is just a basic one to make sure we don't allow AND'ing negative values, even if it would result in a valid index for the array. The second is a cleaned up version of the original testcase provided by Jann Horn that resulted in the commit. Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-06Merge branch 'dsa-add-fabric-notifier'David S. Miller7-49/+205
Vivien Didelot says: ==================== net: dsa: add fabric notifier When a switch fabric is composed of multiple switch chips, these chips must be programmed accordingly when an event occurred on one of them. Examples of such event include hardware bridging: when a Linux bridge spans interconnected chips, they must be programmed to allow external ports to ingress frames on their internal ports. Another example is cross-chip hardware VLANs. Switch chips in-between interconnected bridge ports must also configure a given VLAN to allow packets to pass through them. In order to support that, this patchset introduces a non-intrusive notifier mechanism. It adds a notifier head in every DSA switch tree (the said fabric), and a notifier block in every DSA switch chip. When an even occurs, it is chained to all notifiers of the fabric. Switch chips can react accordingly if they are cross-chip capable. On a dynamic debug enabled system, bridging a port in a multi-chip fabric will print something like this (ZII Rev B board): # brctl addif br0 lan3 mv88e6085 0.1:00: crosschip DSA port 1.0 bridged to br0 mv88e6085 0.4:00: crosschip DSA port 1.0 bridged to br0 # brctl delif br0 lan3 mv88e6085 0.1:00: crosschip DSA port 1.0 unbridged from br0 mv88e6085 0.4:00: crosschip DSA port 1.0 unbridged from br0 Currently only bridging events are added. A patchset introducing support for cross-chip hardware bridging configuration in mv88e6xxx will follow right after. Then events for switchdev operations are next on the line. We should note that non-switchdev events do not support rolling-back switch-wide operations. We'll have to work on closer integration with switchdev for that, like introducing new attributes or objects, to benefit from the prepare and commit phases. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-06net: dsa: introduce bridge notifierVivien Didelot3-11/+71
A slave device will now notify the switch fabric once its port is bridged or unbridged, instead of calling directly its switch operations. This code allows propagating cross-chip bridging events in the fabric. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-06net: dsa: add switch notifierVivien Didelot6-0/+77
Add a notifier block per DSA switch, registered against a notifier head in the switch fabric they belong to. This infrastructure will allow to propagate fabric-wide events such as port bridging, VLAN configuration, etc. If a DSA switch driver cares about cross-chip configuration, such events can be caught. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-06net: dsa: change state setter scopeVivien Didelot1-6/+9
The scope of the functions inside net/dsa/slave.c must be the slave net_device pointer. Change to state setter helper accordingly to simplify callers. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-06net: dsa: rollback bridging on errorVivien Didelot1-1/+13
When an error is returned during the bridging of a port in a NETDEV_CHANGEUPPER event, net/core/dev.c rolls back the operation. Be consistent and unassign dp->bridge_dev when this happens. In the meantime, add comments to document this behavior. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-06net: dsa: simplify netdevice events handlingVivien Didelot1-28/+16
Simplify the code handling the slave netdevice notifier call by providing a dsa_slave_changeupper helper for NETDEV_CHANGEUPPER, and so on (only this event is supported at the moment.) Return NOTIFY_DONE when we did not care about an event, and NOTIFY_OK when we were concerned but no error occurred, as the API suggests. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-06net: dsa: move netdevice notifier registrationVivien Didelot3-10/+26
Move the netdevice notifier block register code in slave.c and provide helpers for dsa.c to register and unregister it. At the same time, check for errors since (un)register_netdevice_notifier may fail. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-06net/mlx5e: fix another maybe-uninitialized false-positiveArnd Bergmann