- MTU
- Unicast Routing Protocol
- IP addressing
- Multicast for BUM traffic replication
VxLAN adds 50 bytes of overhead to the original Ethernet frame, which needs to be catered for to avoid fragmentation. The simplest way of doing this is to enable jumbo frames in the IP network where VxLAN will run. As most servers utilise a jumbo frame size of 9000 bytes, it is recommended that the switches be configured with a jumbo frame size of 9192 or 9216, depending on what the hardware model supports. This caters for the servers' 9000 bytes plus the VxLAN overhead.
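As a minimal sketch of what this looks like on NX-OS (the interface name is illustrative, and some Nexus models set Layer 2 jumbo frames via a network-qos policy instead), the MTU is raised globally and/or per interface:

system jumbomtu 9216

interface Ethernet1/43
mtu 9216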
The next consideration is which IGP (unicast routing protocol) to utilise, however as mentioned this post will focus on OSPF.
IP addressing for the underlay needs to cater for the P2P links between the spine and leaf switches, the loopback interfaces on each spine and leaf switch and the multicast Rendezvous-Point (RP) address.
Whilst discussed in more detail later in this post, it should be noted that the mode of multicast utilised will likely depend on the hardware model in use. For example, on the Cisco Nexus range, unfortunately, not all models support the same multicast mode. Below is a list of what is supported on each Nexus model:
- Nexus 1000v – IGMP v2/v3
- Nexus 3000 – PIM ASM
- Nexus 5600 – PIM BiDir
- Nexus 7000/F3 – PIM ASM / PIM BiDir
- Nexus 9000 – PIM ASM
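Whichever mode is chosen shows up in the RP configuration. As a rough sketch only (the RP address and group range here are illustrative and not part of this build), an ASM RP versus a BiDir RP on NX-OS would look something like:

! ASM
ip pim rp-address 192.168.100.1 group-list 239.0.0.0/8
! BiDir
ip pim rp-address 192.168.100.1 group-list 239.0.0.0/8 bidir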
In this example we will leverage the loopback addresses for our multicast RP address. However, as an illustration, a medium-sized spine and leaf deployment utilising 4 spine switches and 20 leaf switches would need to consider the following IP address usage:
- 4 Spine x 20 leaf = 80 P2P Links
- 80 links, with an IP address at each end = 160 P2P IP addresses
- 24 devices in total = 24 Loopback IP addresses
- Total = 160 P2P IP + 24 Loopback IP = 184 IP Addresses
Also note that, to conserve IP addresses, 'ip unnumbered loopback0' may be used on the P2P interfaces, which means 1 IP address per device (a sketch follows below). This should be seriously considered for large deployments. However, for simplicity, in this example I am going to utilise 2 spine switches and 3 leaf switches with a unique IP address everywhere, meaning I need to cater for:
2 spine x 3 leaf = 6 spine-leaf pairs; with 2 x 10G links per pair (as used later in this post) that is 12 P2P links = 24 P2P IP addresses, plus 6 loopback IP addresses (one per device plus the shared Anycast-RP address).
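For reference, a minimal sketch of the IP unnumbered option mentioned above (the interface is illustrative; on NX-OS the Ethernet interface generally needs to be set to point-to-point medium before it can borrow the loopback address):

interface Ethernet1/1
medium p2p
ip unnumbered loopback0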
Also, I am going to assume in this example that the servers are utilising the 10/8 IP address range, so I have opted to use the 192.168/16 range for the loopback interfaces (which also serve as the router IDs) and the 172.16/12 range for the physical Layer 3 P2P interfaces.
Also for reference, whilst most of the theory is independent of the vendor and hardware, in this example I am using Cisco Nexus 9000 switches to implement this network technology. As with all Nexus switches, the features first need to be enabled, thus I have enabled the following:
Spine-1#show run | incl feature
feature nxapi
feature ospf
feature bgp
feature pim
feature interface-vlan
feature vn-segment-vlan-based
feature lacp
feature lldp
feature nv overlay
ip pim rp-address 192.168.1.0
ip pim anycast-rp 192.168.1.0 192.168.1.1
ip pim anycast-rp 192.168.1.0 192.168.1.2
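The three commands above define 192.168.1.0 as the shared RP address and list both spine loopbacks (192.168.1.1 and 192.168.1.2) as members of the Anycast-RP set. For this to work, the shared RP address typically also needs to exist on each spine, usually on an extra loopback advertised into the underlay IGP; a minimal sketch, assuming loopback1 is unused (the OSPF and PIM settings mirror those applied to loopback0 further below):

interface loopback1
description Anycast-RP
ip address 192.168.1.0/32
ip router ospf UNDERLAY area 0.0.0.0
ip pim sparse-mode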
interface loopback0
description Router-ID – Spine1
ip address 192.168.1.1/32

router ospf UNDERLAY
router-id 192.168.1.1
log-adjacency-changes
maximum-paths 12
auto-cost reference-bandwidth 100000 Mbps
passive-interface default
The router ID is the same IP address I will assign to the loopback0 interface, and the same value will be used for all router IDs defined on this switch.
The OSPF configuration is standard and should be familiar to anyone who has configured OSPF before; however, the command 'maximum-paths' may not be. This is enabled to provide Equal Cost Multi-Pathing (ECMP) between my leaf and spine switches. I have chosen 12 simply to have a large number that I will likely never need to revisit, but as long as the value is equal to, or greater than, the number of physical links it will be fine. It is also good practice to define the reference bandwidth, and in this example I have configured 100000 Mbps, which is 100 Gbps and should cater for the largest link this environment will have. Finally, I prefer to manually nominate any interfaces I wish to participate in OSPF, so I have configured interfaces to be passive by default.
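As a quick worked example of what that reference bandwidth means for link costs (OSPF cost is the reference bandwidth divided by the interface bandwidth, truncated to an integer with a minimum of 1):

10G link:  100000 / 10000  = 10
40G link:  100000 / 40000  = 2 (2.5 rounded down)
100G link: 100000 / 100000 = 1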
Once this is done I can go back into the loopback interface and assign the OSPF and multicast parameters so that the loopback interface participates in these protocols, with the following configuration:
interface loopback0
description Router-ID – Spine1
ip address 192.168.1.1/32
ip ospf network point-to-point
ip router ospf UNDERLAY area 0.0.0.0
ip pim sparse-mode
Next, the physical point-to-point interfaces towards the leaf switches are configured in a similar fashion:

interface Ethernet1/43
description – DC01-LSL06-03 [Eth1/47]
mtu 9216
ip address 172.16.1.1/30
ip ospf network point-to-point
no ip ospf passive-interface
ip router ospf UNDERLAY area 0.0.0.0
ip pim sparse-mode
no shutdown
It's important to configure OSPF as point-to-point here to ensure there is no DR/BDR (and thus no election), as well as keeping the LSA database more optimised and avoiding a full SPF calculation for a link failure. Also, as we have nominated passive-interface default in OSPF, we need to enable this interface to participate in OSPF with the command 'no ip ospf passive-interface'. I have also used a /30 for the point-to-point link, which is not ideal for preserving IP address space and may cause scale issues in a very large deployment, but for simplicity of configuration and troubleshooting I've decided the trade-off here is fine.
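If address space were a concern, a /31 per link (RFC 3021) would halve the point-to-point usage; a minimal sketch of what that would look like on the same interface (illustrative only, not used in this build):

interface Ethernet1/43
ip address 172.16.1.0/31
ip ospf network point-to-point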
All the interconnects between the leaf and spine switches are via 2 x 10G interfaces, so I need to replicate the above configuration on an additional interface, as per the following:
interface Ethernet1/44
description – DC01-LSL06-03 [Eth1/48]
mtu 9216
ip address 172.16.1.5/30
ip ospf network point-to-point
no ip ospf passive-interface
ip router ospf UNDERLAY area 0.0.0.0
ip pim sparse-mode
no shutdown
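The leaf end of each link mirrors this configuration. As a sketch of the matching interface on leaf DC01-LSL06-03 (the interface and address are inferred from the neighbour output below, so treat this as illustrative rather than a capture from the device):

interface Ethernet1/47
description – Spine-1 [Eth1/43]
mtu 9216
ip address 172.16.1.2/30
ip ospf network point-to-point
no ip ospf passive-interface
ip router ospf UNDERLAY area 0.0.0.0
ip pim sparse-mode
no shutdown

With both ends of each link configured, the OSPF and PIM adjacencies can be verified from the spine: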
Spine-1# show ip ospf neighbors
 OSPF Process ID UNDERLAY VRF default
 Total number of neighbors: 6
 Neighbor ID     Pri State            Up Time  Address         Interface
 192.168.1.13      1 FULL/ -          1w5d     172.16.1.2      Eth1/43
 192.168.1.13      1 FULL/ -          1w5d     172.16.1.6      Eth1/44
 192.168.1.12      1 FULL/ -          1w5d     172.16.1.10     Eth1/45
 192.168.1.12      1 FULL/ -          1w5d     172.16.1.14     Eth1/46
 192.168.1.11      1 FULL/ -          1w5d     172.16.1.18     Eth1/47
 192.168.1.11      1 FULL/ -          1w5d     172.16.1.22     Eth1/48
Spine-1# show ip pim neighbor
PIM Neighbor Status for VRF "default"
Neighbor        Interface       Uptime   Expires   DR        Bidir-    BFD
                                                   Priority  Capable   State
172.16.1.2      Ethernet1/43    1w5d     00:01:42  1         yes       n/a
172.16.1.6      Ethernet1/44    1w5d     00:01:35  1         yes       n/a
172.16.1.10     Ethernet1/45    1w5d     00:01:26  1         yes       n/a
172.16.1.14     Ethernet1/46    1w5d     00:01:23  1         yes       n/a
172.16.1.18     Ethernet1/47    1w5d     00:01:34  1         yes       n/a
172.16.1.22     Ethernet1/48    1w5d     00:01:44  1         yes       n/a
Spine-1# show ip pim interface brief
PIM Interface Status for VRF "default"
Interface            IP Address      PIM DR Address  Neighbor  Border
                                                     Count     Interface
Ethernet1/43         172.16.1.1      172.16.1.2      1         no
Ethernet1/44         172.16.1.5      172.16.1.6      1         no
Ethernet1/45         172.16.1.9      172.16.1.10     1         no
Ethernet1/46         172.16.1.13     172.16.1.14     1         no
Ethernet1/47         172.16.1.17     172.16.1.18     1         no
Ethernet1/48         172.16.1.21     172.16.1.22     1         no
loopback0            192.168.1.1     192.168.1.1     0         no
Note: As this output is from a spine switch, and each spine has 2 x 10G links to each of the 3 leaf switches, there are 6 entries above, plus the loopback depending on which command is used.
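Before moving on, a simple check that underlay reachability is consistent is to ping each leaf loopback sourced from the spine loopback (the addresses are taken from the output above), repeating the test while individual member links are shut down:

ping 192.168.1.11 source 192.168.1.1
ping 192.168.1.12 source 192.168.1.1
ping 192.168.1.13 source 192.168.1.1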
This completes the underlay network with OSPF and multicast, and we can now build the overlay and control plane on top of it. It is critical that reachability across the underlay is consistent throughout the fabric, so this is a good point to test failure scenarios such as the link failures mentioned above. It is also a good point to finish this blog, with the next post providing the overlay and control plane configuration details.
