author: David S. Miller <davem@davemloft.net> 2017-02-03 15:21:23 -0500
committer: David S. Miller <davem@davemloft.net> 2017-02-03 15:21:23 -0500
commit: 3b19860c7f3d0c184385055cebded19bc537ff4a
tree: 48288a17219c037942267db3c6c469cf47bf7ecb /net/ipv6
parent: 5a0fd98b7b5be8773c53c40c47451ec6cd11d1ff
parent: 11538d039ac6efcf4f1a6c536e1b87cd3668a9fd

Merge branch 'bridge-per-vlan-dst_metadata-support'

Roopa Prabhu says:

====================
bridge: per vlan dst_metadata support

High level summary:
lwt and dst_metadata have enabled vxlan l3 deployments to use a single vxlan netdev for multiple vnis, eliminating the scalability problem with using a single vxlan netdev per vni. This series tries to do the same for vxlan netdevs in pure l2 bridged networks. Use-case/deployment and details are below.

Deployment scenario details:
As we know, VXLAN is used to build layer 2 virtual networks across the underlay layer 3 infrastructure. A VXLAN tunnel endpoint (VTEP) originates and terminates VXLAN tunnels, and a VTEP can be a TOR switch or a vswitch in the hypervisor. This patch series mainly focuses on the TOR switch configured as a VTEP. The vxlan segment ID (vni) along with the vlan id is used to identify layer 2 segments in a vxlan overlay network. Vxlan bridging is the function provided by VTEPs to terminate vxlan tunnels and map the vxlan vni to a traditional end host vlan. This is covered in the "VXLAN Deployment Scenarios" in sections 6 and 6.1 of RFC 7348. To provide the vxlan bridging function, a VTEP has to map a vlan to a vni. The RFC says that the ingress VTEP device shall remove the IEEE 802.1Q VLAN tag in the original Layer 2 packet, if there is one, before encapsulating the packet into the VXLAN format to transmit it through the underlay network. The remote VTEP devices have information about the VLAN in which the packet will be placed based on their own VLAN-to-VXLAN VNI mapping configurations.

Existing solution:
Without this patch series one can deploy such a VTEP configuration by adding the local ports and vxlan netdevs into a vlan filtering bridge. The local ports are configured as trunk ports carrying all vlans. A vxlan netdev per vni is added to the bridge. Vlan mapping to vni is achieved by configuring the vlan as pvid on the corresponding vxlan netdev. The vxlan netdev only receives traffic corresponding to the vlan it is mapped to.
This configuration maps traffic belonging to a vlan to the corresponding vxlan segment.

      -----------------------------------
      |              bridge             |
      |                                 |
      -----------------------------------
        |100,200       |100 (pvid)    |200 (pvid)
        |              |              |
       swp1          vxlan1000      vxlan2000

This provides the required vxlan bridging function but poses a scalability problem with using a separate vxlan netdev for each vni.

Solution in this patch series:
The goal is to use a single vxlan device to carry all vnis, similar to the vxlan collect metadata mode, but additionally allowing the bridge and vxlan driver to carry all the forwarding information and also learn. This implementation uses the existing dst_metadata infrastructure to map vlan to a tunnel id.

- vxlan driver changes:
    - enable collect metadata mode to be used with learning, replication and fdb
    - A single fdb table hashed by (mac, vni)
    - rx path already has the vni
    - tx path expects a vni in the packet with dst_metadata and relies on learnt or static forwarding information table to forward the packet

- Bridge driver changes: per vlan dst_metadata support:
    - Our use case is vxlan and 1-1 mapping between vlan and vni, but I have kept the api generic for any tunnel info
    - Uapi to configure/unconfigure/dump per vlan tunnel data
    - new bridge port flag to turn this feature on/off. off by default
    - ingress hook:
        - if port is a tunnel port, use tunnel info in attached dst_metadata to map it to a local vlan
    - egress hook:
        - if port is a tunnel port, use tunnel info attached to vlan to set dst_metadata on the skb

Other approaches tried and vetoed:
- tc vlan push/pop and tunnel metadata dst:
    - though tc can be used to do part of this, these patches address a deployment case where bridge driver vlan filtering and forwarding information database along with vxlan driver forwarding information table and learning are required.
- making vxlan driver understand vlan-vni mapping:
    - I had a series almost ready with this one but soon realized it duplicated a lot of vlan handling code in the vxlan driver
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

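The ingress and egress hooks described in the cover letter amount to a two-way lookup between a local vlan and a tunnel id. The following is a minimal userspace sketch of that idea only, assuming a strict 1-1 vlan-to-vni mapping. It is not the kernel implementation: the struct and function names (vlan_tunnel_map, vlan_tunnel_add, vlan_tunnel_ingress, vlan_tunnel_egress) and the fixed table size are hypothetical and chosen purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TUNNEL_VLANS 8	/* hypothetical table size for this sketch */

/* One vlan <-> tunnel id (vni) binding. Illustrative, not a kernel type. */
struct vlan_tunnel_entry {
	uint16_t vid;
	uint32_t tunnel_id;
};

struct vlan_tunnel_map {
	struct vlan_tunnel_entry entries[MAX_TUNNEL_VLANS];
	size_t count;
};

/*
 * Add a 1-1 binding. Returns 0 on success, -1 if the table is full or
 * if either the vlan or the tunnel id is already bound.
 */
static int vlan_tunnel_add(struct vlan_tunnel_map *map, uint16_t vid,
			   uint32_t tunnel_id)
{
	size_t i;

	assert(map != NULL);
	if (map->count >= MAX_TUNNEL_VLANS)
		return -1;
	for (i = 0; i < map->count; i++) {
		if (map->entries[i].vid == vid ||
		    map->entries[i].tunnel_id == tunnel_id)
			return -1;
	}
	map->entries[map->count].vid = vid;
	map->entries[map->count].tunnel_id = tunnel_id;
	map->count++;
	return 0;
}

/*
 * Ingress on a tunnel port: map the tunnel id carried in the packet
 * metadata to a local vlan. Returns 0 and fills *vid when a binding exists.
 */
static int vlan_tunnel_ingress(const struct vlan_tunnel_map *map,
			       uint32_t tunnel_id, uint16_t *vid)
{
	size_t i;

	for (i = 0; i < map->count; i++) {
		if (map->entries[i].tunnel_id == tunnel_id) {
			*vid = map->entries[i].vid;
			return 0;
		}
	}
	return -1;
}

/*
 * Egress on a tunnel port: map the packet's vlan to the tunnel id that
 * would be set as metadata. Returns 0 and fills *tunnel_id when bound.
 */
static int vlan_tunnel_egress(const struct vlan_tunnel_map *map,
			      uint16_t vid, uint32_t *tunnel_id)
{
	size_t i;

	for (i = 0; i < map->count; i++) {
		if (map->entries[i].vid == vid) {
			*tunnel_id = map->entries[i].tunnel_id;
			return 0;
		}
	}
	return -1;
}
```

With bindings 100 -> 1000 and 200 -> 2000 (assuming the vxlan1000 and vxlan2000 netdevs in the diagram carry vnis 1000 and 2000), a single tunnel port can serve both vlans: the ingress lookup of tunnel id 2000 yields vlan 200, and the egress lookup of vlan 100 yields tunnel id 1000. A linear scan keeps the sketch short; a real table would more likely use a hash keyed on each side.
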
Diffstat (limited to 'net/ipv6')
0 files changed, 0 insertions, 0 deletions

2017-02-03  tcp: add tcp_mss_clamp() helper  Eric Dumazet  3 files  -19/+7

    Small cleanup factorizing code doing the TCP_MAXSEG clamping.

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

2017-02-02  net: add LINUX_MIB_PFMEMALLOCDROP counter  Eric Dumazet  1 file  -0/+1

    Debugging issues caused by pfmemalloc is often tedious. Add a new SNMP counter to more easily diagnose these problems.

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Josef Bacik <jbacik@fb.com>
    Acked-by: Josef Bacik <jbacik@fb.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

2017-02-02  net: ipv4: remove fib_lookup.h from devinet.c include list  David Ahern  1 file  -2/+0

    nothing in devinet.c relies on fib_lookup.h; remove it from the includes

    Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

2017-02-02  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net  David S. Miller  1 file  -2/+4

    All merge conflicts were simple overlapping changes.

    Signed-off-by: David S. Miller <davem@davemloft.net>

2017-02-02  netfilter: allow logging from non-init namespaces  Michal Kubeček  2 files  -2/+2

    Commit 69b34fb996b2 ("netfilter: xt_LOG: add net namespace support for xt_LOG") disabled logging packets using the LOG target from non-init namespaces. The motivation was to prevent containers from flooding kernel log of the host. The plan was to keep it that way until syslog namespace implementation allows containers to log in a safe way.

    However, the work on syslog namespace seems to have hit a dead end somewhere in 2013 and there are users who want to use xt_LOG in all network namespaces. This patch allows to do so by setting /proc/sys/net/netfilter/nf_log_all_netns to a nonzero value. This sysctl is only accessible from init_net so that one cannot switch the behaviour from inside a container.

    Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

2017-02-02  netfilter: add and use nf_ct_set helper  Florian Westphal  3 files  -6/+3

    Add a helper to assign a nf_conn entry and the ctinfo bits to an sk_buff. This avoids changing code in followup patch that merges skb->nfct and skb->nfctinfo into skb->_nfct.

    Signed-off-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

2017-02-02  skbuff: add and use skb_nfct helper  Florian Westphal  4 files  -8/+8

    Followup patch renames skb->nfct and changes its type so add a helper to avoid intrusive rename change later.

    Signed-off-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

2017-02-02  netfilter: reset netfilter state when duplicating packet  Florian Westphal  1 file  -1/+1

    We should also toss nf_bridge_info, if any -- packet is leaving via ip_local_out, also, this skb isn't bridged -- it is a locally generated copy. Also this avoids the need to touch this later when skb->nfct is replaced with 'unsigned long _nfct' in followup patch.

    Signed-off-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

2017-02-02  netfilter: conntrack: no need to pass ctinfo to error handler  Florian Westphal  1 file  -6/+6

    It is never accessed for reading and the only places that write to it are the icmp(6) handlers, which also set skb->nfct (and skb->nfctinfo). The conntrack core specifically checks for attached skb->nfct after ->error() invocation and returns early in this case.

    Signed-off-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>