summaryrefslogtreecommitdiff
path: root/proto_ipv6_routing.c
diff options
context:
space:
mode:
Diffstat (limited to 'proto_ipv6_routing.c')
0 files changed, 0 insertions, 0 deletions
pref medium 2001:db8:2::/120 dev eth2 proto kernel metric 256 pref medium 2001:db8:200::/120 metric 1024 nexthop via 2001:db8:1::2 dev eth1 weight 1 nexthop via 2001:db8:2::2 dev eth2 weight 1 ... 2. Route Add - one notification with RTA_MULTIPATH attribute $ ip -6 ro add vrf red 2001:db8:200::/120 nexthop via 2001:db8:1::2 nexthop via 2001:db8:2::2 $ ip mon route 2001:db8:200::/120 table red metric 1024 nexthop via 2001:db8:1::2 dev eth1 weight 1 nexthop via 2001:db8:2::2 dev eth2 weight 1 2. Route Replace - one notification with RTA_MULTIPATH attribute $ ip -6 ro replace vrf red 2001:db8:200::/120 nexthop via 2001:db8:1::16 nexthop via 2001:db8:2::16 $ ip mon route Replaced 2001:db8:200::/120 table red metric 1024 nexthop via 2001:db8:1::16 dev eth1 weight 1 nexthop via 2001:db8:2::16 dev eth2 weight 1 - on a failure after the insertion of the first nexthop (which means the original route has been replaced in the FIB), a notification is sent with the successful nexthops and then the nexthops are deleted with one notification per hop. This is consistent with how it works today except the successful additions are coalesced into 1 notification. 3. Route Delete - delete of entire multipath route using prefix/length only 1 notification is generated: $ ip -6 ro del vrf red 2001:db8:200::/120 $ ip mon route Deleted 2001:db8:200::/120 table red metric 1024 nexthop via 2001:db8:1::16 dev eth1 weight 1 nexthop via 2001:db8:2::16 dev eth2 weight 1 - if a delete request contains nexthops one notification is generated per nexthop deleted. This is unavoidable since IPv6 alllows a single nexthop to be deleted within a multipath route 4. Route Appends - IPv6 allows nexthops to be appended to an existing route. In this case one notification is sent for the new route with the append flag set. $ ip -6 ro append vrf red 2001:db8:200::/120 nexthop via 2001:db8:2::20 nexthop via 2001:db8:1::20 $ ip mon route Append 2001:db8:200::/120 table red metric 1024 nexthop via 2001:db8:1::2 dev eth1 weight 1 nexthop via 2001:db8:2::2 dev eth2 weight 1 nexthop via 2001:db8:2::20 dev eth2 weight 1 nexthop via 2001:db8:1::20 dev eth1 weight 1 - on failure of an append, a notification is sent with the route containing all of the nexthops successfully added, and it is followed by delete notifications as the hops are removed returning the route to its prior state. This is consistent with how it works today except the successful additions are coalesced into 1 notification. Addresses some of the inconsistencies also noted by Roopa at netdev0.1: https://www.netdev01.org/docs/prabhu-linux_ipv4_ipv6_inconsistencies_talk_slides.pdf v4 - changed series to do encoding in 1 patch and updating notificatons in separate patches to make it easier to review and understand - 1 notification for delete when using prefix/length; 1 notification for append - handle delete of a single nexthop without RTA_MULTIPATH in delete request - upated commit messages and cover letter v3 - removed the need for a user API to opt-in to change. Requiring an API just shifts the difference from same API with different behavior to different API to achieve equivalent behavior - route notifications changed to use RTA_MULTIPATH for add and replace - upated commit messages and cover letter v2 - fixed locking in patch 1 as noted by DaveM - changed user API for patch 2 to require an rtmsg with RTM_F_ALL_NEXTHOPS set in rtm_flags - revamped explanation of patch 2 and cover letter ==================== Signed-off-by: David S. Miller <davem@davemloft.net> 2017-02-04net: ipv6: Use compressed IPv6 addresses showing route replace errorDavid Ahern1-1/+1 ip6_print_replace_route_err logs an error if a route replace fails with IPv6 addresses in the full format. e.g,: IPv6: IPV6: multipath route replace failed (check consistency of installed routes): 2001:0db8:0200:0000:0000:0000:0000:0000 nexthop 2001:0db8:0001:0000:0000:0000:0000:0016 ifi 0 Change the message to dump the addresses in the compressed format. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> 2017-02-04net: ipv6: Change notifications for multipath delete to RTA_MULTIPATHDavid Ahern2-1/+28 If an entire multipath route is deleted using prefix and len (without any nexthops), send a single RTM_DELROUTE notification with the full route using RTA_MULTIPATH. This is done by generating the skb before the route delete when all of the sibling routes are still present but sending it after the route has been removed from the FIB. The skip_notify flag is used to tell the lower fib code not to send notifications for the individual nexthop routes. If a route is deleted using RTA_MULTIPATH for any nexthops or a single nexthop entry is deleted, then the nexthops are deleted one at a time with notifications sent as each hop is deleted. This is necessary given that IPv6 allows individual hops within a route to be deleted. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> 2017-02-04net: ipv6: Change notifications for multipath add to RTA_MULTIPATHDavid Ahern3-3/+54 Change ip6_route_multipath_add to send one notifciation with the full route encoded with RTA_MULTIPATH instead of a series of individual routes. This is done by adding a skip_notify flag to the nl_info struct. The flag is used to skip sending of the notification in the fib code that actually inserts the route. Once the full route has been added, a notification is generated with all nexthops. ip6_route_multipath_add handles 3 use cases: new routes, route replace, and route append. The multipath notification generated needs to be consistent with the order of the nexthops and it should be consistent with the order in a FIB dump which means the route with the first nexthop needs to be used as the route reference. For the first 2 cases (new and replace), a reference to the route used to send the notification is obtained by saving the first route added. For the append case, the last route added is used to loop back to its first sibling route which is the first nexthop in the multipath route. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> 2017-02-04net: ipv6: Add support to dump multipath routes via RTA_MULTIPATH attributeDavid Ahern2-17/+105 IPv6 returns multipath routes as a series of individual routes making their display and handling by userspace different and more complicated than IPv4, putting the burden on the user to see that a route is part of a multipath route and internally creating a multipath route if desired (e.g., libnl does this as of commit 29b71371e764). This patch addresses this difference, allowing multipath routes to be returned using the RTA_MULTIPATH attribute. The end result is that IPv6 multipath routes can be treated and displayed in a format similar to IPv4: $ ip -6 ro ls vrf red 2001:db8:1::/120 dev eth1 proto kernel metric 256 pref medium 2001:db8:2::/120 dev eth2 proto kernel metric 256 pref medium 2001:db8:200::/120 metric 1024 nexthop via 2001:db8:1::2 dev eth1 weight 1 nexthop via 2001:db8:2::2 dev eth2 weight 1 Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> 2017-02-04net: ipv6: Allow shorthand delete of all nexthops in multipath routeDavid Ahern2-3/+39 IPv4 allows multipath routes to be deleted using just the prefix and length. For example: $ ip ro ls vrf red unreachable default metric 8192 1.1.1.0/24 nexthop via 10.100.1.254 dev eth1 weight 1 nexthop via 10.11.200.2 dev eth11.200 weight 1 10.11.200.0/24 dev eth11.200 proto kernel scope link src 10.11.200.3 10.100.1.0/24 dev eth1 proto kernel scope link src 10.100.1.3 $ ip ro del 1.1.1.0/24 vrf red $ ip ro ls vrf red unreachable default metric 8192 10.11.200.0/24 dev eth11.200 proto kernel scope link src 10.11.200.3 10.100.1.0/24 dev eth1 proto kernel scope link src 10.100.1.3 The same notation does not work with IPv6 because of how multipath routes are implemented for IPv6. For IPv6 only the first nexthop of a multipath route is deleted if the request contains only a prefix and length. This leads to unnecessary complexity in userspace dealing with IPv6 multipath routes. This patch allows all nexthops to be deleted without specifying each one in the delete request. Internally, this is done by walking the sibling list of the route matching the specifications given (prefix, length, metric, protocol, etc). $ ip -6 ro ls vrf red 2001:db8:1::/120 dev eth1 proto kernel metric 256 pref medium 2001:db8:2::/120 dev eth2 proto kernel metric 256 pref medium 2001:db8:200::/120 via 2001:db8:1::2 dev eth1 metric 1024 pref medium 2001:db8:200::/120 via 2001:db8:2::2 dev eth2 metric 1024 pref medium ... $ ip -6 ro del vrf red 2001:db8:200::/120 $ ip -6 ro ls vrf red 2001:db8:1::/120 dev eth1 proto kernel metric 256 pref medium 2001:db8:2::/120 dev eth2 proto kernel metric 256 pref medium ... Because IPv6 allows individual nexthops to be deleted without deleting the entire route, the ip6_route_multipath_del and non-multipath code path (ip6_route_del) have to be discriminated so that all nexthops are only deleted for the latter case. This is done by making the existing fc_type in fib6_config a u16 and then adding a new u16 field with fc_delete_all_nh as the first bit. Suggested-by: Dinesh Dutt <ddutt@cumulusnetworks.com> Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> 2017-02-04virtio_net: exploit napi_complete_done() return valueEric Dumazet1-5/+6 Since commit 364b6055738b ("net: busy-poll: return busypolling status to drivers"), napi_complete_done() returns a boolean that can be used by drivers to conditionally rearm interrupts. This patch changes virtio_net to use this boolean to avoid a bit of overhead for busy-poll users. Jason reports about 1.1% improvement for 1 byte TCP_RR (burst 100). Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Jason Wang <jasowang@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> 2017-02-04Merge branch '40GbE' of ↵David S. Miller