
EVPN/VXLAN Network Health Verification: A Comprehensive Guide

Written by Max Urmanov | Mar 3, 2025 7:57:04 PM

In today's data center environments, EVPN/VXLAN has become the de facto standard for building scalable and flexible network fabrics. As these deployments grow in complexity, having a structured approach to verify fabric health becomes crucial. This guide walks through a systematic verification process to ensure your EVPN/VXLAN fabric is operating optimally.

Understanding EVPN/VXLAN Fabric Architecture

EVPN (Ethernet VPN) combined with VXLAN (Virtual Extensible LAN) provides a robust overlay technology that addresses many traditional data center networking challenges:

  • Layer 2 extension across Layer 3 boundaries
  • Efficient handling of BUM (Broadcast, Unknown Unicast, and Multicast) traffic
  • MAC and IP mobility support
  • Multi-tenancy capabilities
  • Control plane-based learning for scalability

A typical EVPN/VXLAN fabric consists of spine and leaf switches in a Clos topology. The underlying network (underlay) provides IP connectivity between the switches, while the overlay network handles the end-to-end tenant communication.

 

 

Network Fabric Verification: A Layered Approach

Effective troubleshooting of EVPN/VXLAN networks requires a bottom-up methodology. We'll work through each layer systematically, starting with the underlay and progressing to the overlay components.

1. Underlay Network Verification

A. Check IGP Peering State

The foundation of any EVPN/VXLAN fabric is the underlay network. In our example, we're using OSPF, but the underlay could equally be built on IS-IS, BGP, or even static routing; in that case, substitute the equivalent verification commands.

Start by verifying OSPF neighbor relationships between spine and leaf switches:

dc01-spine01# show ip ospf neighbors 
 OSPF Process ID UNDERLAY VRF default
 Total number of neighbors: 6
 Neighbor ID     Pri State            Up Time  Address         Interface
 dc01-r01-leaf01   1 FULL/ -          4w5d     172.16.1.2      Eth1/1 
 dc01-r02-leaf01   1 FULL/ -          4w5d     172.16.1.10     Eth1/2 
 dc01-r03-leaf01   1 FULL/ -          4w5d     172.16.1.18     Eth1/3 
 dc01-r01-leaf02   1 FULL/ -          4w5d     172.16.1.6      Eth1/4 
 dc01-r02-leaf02   1 FULL/ -          4w5d     172.16.1.14     Eth1/5 
 dc01-r03-leaf02   1 FULL/ -          4w5d     172.16.1.22     Eth1/6
 

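What to look for: All neighbors should be in the FULL state, which indicates a fully established adjacency, and the total should match the number of leaf switches attached to the spine (6 here). The "-" after "FULL/" is expected on point-to-point links, where no DR or BDR is elected.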
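As a minimal sketch, the FULL-state check can be scripted against captured CLI text. The column layout is taken from the sample output above; the function names are illustrative, and how the text is collected from the switch (SSH, NX-API, and so on) is left out.

```python
import re

# Two rows copied from the `show ip ospf neighbors` output above.
SAMPLE = """\
 Neighbor ID     Pri State            Up Time  Address         Interface
 dc01-r01-leaf01   1 FULL/ -          4w5d     172.16.1.2      Eth1/1
 dc01-r02-leaf01   1 FULL/ -          4w5d     172.16.1.10     Eth1/2
"""

# One neighbor row: ID, priority, state/role, uptime, address, interface.
ROW = re.compile(
    r"^\s*(?P<neighbor>\S+)\s+(?P<pri>\d+)\s+(?P<state>[A-Z0-9-]+)/\s*(?P<role>\S+)"
    r"\s+(?P<uptime>\S+)\s+(?P<address>\d+\.\d+\.\d+\.\d+)\s+(?P<interface>\S+)\s*$"
)


def parse_ospf_neighbors(text):
    """Return one dict per neighbor row; header and banner lines are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            rows.append(match.groupdict())
    return rows


def non_full_neighbors(text):
    """Return the rows whose adjacency state is anything other than FULL."""
    return [row for row in parse_ospf_neighbors(text) if row["state"] != "FULL"]


if __name__ == "__main__":
    for row in non_full_neighbors(SAMPLE):
        print(f"{row['neighbor']} on {row['interface']} is {row['state']}")
```

An empty result means every parsed adjacency is FULL; comparing the number of parsed rows against the expected leaf count catches neighbors that are missing altogether.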

B. Confirm Loopback Reachability

Next, check if all loopback interfaces are reachable across the fabric:

dc01-spine01# show ip route ospf | grep -A 2 /32
10.255.255.2/32, ubest/mbest: 6/0
    *via 172.16.1.2, Eth1/1, [110/81], 4w5d, ospf-UNDERLAY, intra
    *via 172.16.1.6, Eth1/4, [110/81], 4w5d, ospf-UNDERLAY, intra
--
10.255.255.3/32, ubest/mbest: 1/0
    *via 172.16.1.2, Eth1/1, [110/41], 4w5d, ospf-UNDERLAY, intra
 

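What to look for: Every loopback address in the fabric should appear as a /32 route with at least one valid next hop. The ubest value gives the number of best paths: above, 10.255.255.2 has six equal-cost paths (grep -A 2 shows only the first two), while 10.255.255.3 has one. For leaf switches in vPC pairs, the shared secondary loopback address is advertised by both members, so expect a path through each of them (for redundancy).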
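The reachability check can be sketched the same way: collect the /32 prefixes from the route output and compare them with the loopbacks you expect. The list of expected addresses is an assumption that would come from your own addressing plan, and the function names are illustrative.

```python
import re

# Text copied from the `show ip route ospf` output above.
SAMPLE = """\
10.255.255.2/32, ubest/mbest: 6/0
    *via 172.16.1.2, Eth1/1, [110/81], 4w5d, ospf-UNDERLAY, intra
    *via 172.16.1.6, Eth1/4, [110/81], 4w5d, ospf-UNDERLAY, intra
--
10.255.255.3/32, ubest/mbest: 1/0
    *via 172.16.1.2, Eth1/1, [110/41], 4w5d, ospf-UNDERLAY, intra
"""

PREFIX = re.compile(r"^(\d+\.\d+\.\d+\.\d+)/32, ubest/mbest: (\d+)/(\d+)")


def host_routes(text):
    """Map each /32 address to its unicast best-path count (ubest)."""
    routes = {}
    for line in text.splitlines():
        match = PREFIX.match(line.strip())
        if match:
            routes[match.group(1)] = int(match.group(2))
    return routes


def missing_loopbacks(text, expected):
    """Return the expected loopback addresses that have no /32 route."""
    present = host_routes(text)
    return sorted(set(expected) - set(present))


if __name__ == "__main__":
    # Hypothetical expectation list for illustration only.
    print(missing_loopbacks(SAMPLE, ["10.255.255.2", "10.255.255.3"]))
```

Reading the path count from ubest rather than counting "via" lines avoids being misled by output that grep has truncated.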

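Pro Tip: Enable name lookup for your IGP to simplify troubleshooting. With it enabled, OSPF show commands list neighbors by hostname instead of router ID, which is why the neighbor table above shows names such as dc01-r01-leaf01: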

dc01-spine01# show ip ospf | i Name
 Name Lookup is enabled
 

2. Overlay Network Verification

A. Check BGP EVPN Peering State

Once you've verified the underlay, move to the overlay by checking BGP EVPN peerings:

dc01-spine01# show bgp l2vpn evpn summary 
BGP summary information for VRF default, address family L2VPN EVPN
BGP router identifier 10.255.255.1, local AS number 65000
BGP table version is 16255, L2VPN EVPN config peers 6, capable peers 6
24 network entries and 24 paths using 5760 bytes of memory
BGP attribute entries [12/2016], BGP AS path entries [0/0]
BGP community entries [0/0], BGP clusterlist entries [0/0]
Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.255.255.3    4 65000   51063   54741    16255    0    0     4w5d 4         
10.255.255.4    4 65000   51062   54743    16255    0    0     4w5d 4         
10.255.255.5    4 65000   53546   53458    16255    0    0     4w5d 8         
10.255.255.6    4 65000   53545   53468    16255    0    0     4w5d 8         
10.255.255.7    4 65000   48285   55938    16255    0    0     4w5d 0         
10.255.255.8    4 65000   48273   55924    16255    0    0     4w5d 0
 

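What to look for: All neighbors should be established. An established session shows a number in the State/PfxRcd column (the count of prefixes received), whereas a session that is not established shows a state name such as Idle or Active. A count of 0 is not necessarily a fault: a leaf with no VNIs configured has nothing to advertise.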
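A rough sketch of that rule in code: treat a numeric State/PfxRcd value as established, anything else as down, and report zero-prefix peers separately so they can be reviewed rather than alarmed on. The row layout follows the sample output above; the function names are illustrative.

```python
# Header and two rows copied from the `show bgp l2vpn evpn summary` output above.
SAMPLE = """\
Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.255.255.3    4 65000   51063   54741    16255    0    0     4w5d 4
10.255.255.7    4 65000   48285   55938    16255    0    0     4w5d 0
"""


def parse_bgp_summary(text):
    """Map each neighbor address to the raw State/PfxRcd value."""
    peers = {}
    for line in text.splitlines():
        fields = line.split()
        # Neighbor rows start with an IPv4 address followed by the BGP version.
        if len(fields) >= 10 and fields[0].count(".") == 3 and fields[1].isdigit():
            # Join the tail so multi-word states are kept intact.
            peers[fields[0]] = " ".join(fields[9:])
    return peers


def classify(peers):
    """Split peers into (not established, established with zero prefixes)."""
    down = [peer for peer, state in peers.items() if not state.isdigit()]
    silent = [peer for peer, state in peers.items() if state == "0"]
    return down, silent


if __name__ == "__main__":
    down, silent = classify(parse_bgp_summary(SAMPLE))
    print("not established:", down)
    print("established, 0 prefixes:", silent)
```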

B. Verify NVE Interface Status

The NVE (Network Virtualization Edge) interface is the logical interface through which the switch acts as a VTEP (VXLAN Tunnel Endpoint), handling VXLAN encapsulation and decapsulation:

dc01-r01-leaf01# show interface nve 1
nve1 is up
admin state is up,  Hardware: NVE
  MTU 9216 bytes
  Encapsulation VXLAN
  Auto-mdix is turned off
  RX
    ucast: 0 pkts, 0 bytes - mcast: 0 pkts, 0 bytes
  TX
    ucast: 0 pkts, 0 bytes - mcast: 0 pkts, 0 bytes
 

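What to look for: Both the interface itself (nve1 is up) and its admin state should be up. This output does not list the source interface; the show nve interface command below does.

To confirm the source interface, and for vPC deployments to verify both switches in the pair: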

dc01-r01-leaf01# show nve interface nve 1
Interface: nve1, State: Up, encapsulation: VXLAN
 VPC Capability: VPC-VIP-Only [notified]
 Local Router MAC: 5003.0000.1b08
 Host Learning Mode: Control-Plane
 Source-Interface: loopback0 (primary: 10.255.255.3, secondary: 10.255.255.101)
 

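Key insight: For vPC pairs, note the secondary IP address (10.255.255.101 here). It serves as the common VTEP IP for both switches, so it must be identical on both members of the pair, while the primary address stays unique to each switch.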
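One way to check a vPC pair is to extract the Source-Interface line from each member's output and compare the addresses. This is a sketch only: the second switch's output line is hypothetical (the primary address 10.255.255.4 is assumed for illustration), and the function names are not from any library.

```python
import re

SRC = re.compile(
    r"Source-Interface:\s+(?P<interface>\S+)\s+"
    r"\(primary:\s+(?P<primary>[\d.]+),\s+secondary:\s+(?P<secondary>[\d.]+)\)"
)

# Copied from the dc01-r01-leaf01 output above.
LEAF01 = " Source-Interface: loopback0 (primary: 10.255.255.3, secondary: 10.255.255.101)"
# Hypothetical output from the vPC peer, for illustration only.
LEAF02 = " Source-Interface: loopback0 (primary: 10.255.255.4, secondary: 10.255.255.101)"


def vtep_addresses(text):
    """Return the source interface and its primary/secondary IPs, or None."""
    match = SRC.search(text)
    return match.groupdict() if match else None


def same_anycast_vtep(output_a, output_b):
    """True when both members share the secondary IP but have distinct primaries."""
    a, b = vtep_addresses(output_a), vtep_addresses(output_b)
    if not a or not b:
        return False
    return a["secondary"] == b["secondary"] and a["primary"] != b["primary"]


if __name__ == "__main__":
    print(vtep_addresses(LEAF01))
    print(same_anycast_vtep(LEAF01, LEAF02))
```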

C. Check VNI Mappings and Status

VNIs (VXLAN Network Identifiers) are mapped to VLANs (for L2 VNIs) or VRFs (for L3 VNIs):

dc01-r02-leaf01# show nve vni
Codes: CP - Control Plane        DP - Data Plane          
       UC - Unconfigured         SA - Suppress ARP        
       SU - Suppress Unknown Unicast 
       Xconn - Crossconnect      
       MS-IR - Multisite Ingress Replication
Interface VNI      Multicast-group   State Mode Type [BD/VRF]      Flags
--------- -------- ----------------- ----- ---- ------------------ -----
nve1      900100   UnicastBGP        Up    CP   L2 [100]                
nve1      900101   UnicastBGP        Up    CP   L2 [101]                
nve1      9003911  n/a               Up    CP   L3 [DB]
 

What to look for: All VNIs should show Up state with the correct mode (typically CP for Control Plane).
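The same table can be checked programmatically. The sketch below parses the rows of show nve vni as they appear above and returns any VNI that is not Up or not in the expected mode; the default of CP as the expected mode is an assumption that fits a BGP EVPN fabric, and the function names are illustrative.

```python
import re

# Table copied from the `show nve vni` output above.
SAMPLE = """\
Interface VNI      Multicast-group   State Mode Type [BD/VRF]      Flags
--------- -------- ----------------- ----- ---- ------------------ -----
nve1      900100   UnicastBGP        Up    CP   L2 [100]
nve1      900101   UnicastBGP        Up    CP   L2 [101]
nve1      9003911  n/a               Up    CP   L3 [DB]
"""

ROW = re.compile(
    r"^(?P<interface>nve\d+)\s+(?P<vni>\d+)\s+(?P<group>\S+)\s+(?P<state>\S+)"
    r"\s+(?P<mode>\S+)\s+(?P<type>L[23])\s+\[(?P<target>[^\]]+)\]"
)


def parse_nve_vni(text):
    """Return one dict per VNI row; legend and header lines do not match."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            rows.append(match.groupdict())
    return rows


def unhealthy_vnis(rows, expected_mode="CP"):
    """Return the VNIs that are not Up or not in the expected mode."""
    return [
        int(row["vni"])
        for row in rows
        if row["state"] != "Up" or row["mode"] != expected_mode
    ]


if __name__ == "__main__":
    print(unhealthy_vnis(parse_nve_vni(SAMPLE)))
```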

For detailed information about specific VNIs:

dc01-r02-leaf01# show nve vni 900100 detail 
VNI: 900100 
  NVE-Interface       : nve1
  Mcast-Addr          : UnicastBGP
  VNI State           : Up
  Mode                : control-plane
  VNI Type            : L2 [100]
  VNI Flags           : 
  Provision State     : vni-add-complete
  Vlan-BD             : 100
  SVI State           : n/a

Critical check: The Provision State should show vni-add-complete, indicating successful configuration.

3. Verify VTEP Peering

Examine NVE peering to ensure proper connectivity between VTEPs:

dc01-r02-leaf01# show nve peers 
Interface Peer-IP                                 State LearnType Uptime   Router-Mac       
--------- --------------------------------------  ----- --------- -------- -----------------
nve1      10.255.255.6                            Up    CP        3w6d     5006.0000.1b08   
nve1      10.255.255.101                          Up    CP        4w4d     0200.0aff.ff65
 

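Important note: An NVE peer appears, and shows as Up, only once EVPN routes have been learned from it. A missing peer is therefore not necessarily a problem: if the two VTEPs share no VNIs, no routes are imported and no peer entry is created. Note also that a vPC pair appears as a single peer under its shared secondary address (10.255.255.101 above).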
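Because a missing peer is only meaningful when you expect shared VNIs, a scripted check needs an explicit list of expected peers. The sketch below assumes such a list exists (it is hypothetical here) and reports peers that are missing or not Up; the function names are illustrative.

```python
# Table copied from the `show nve peers` output above.
SAMPLE = """\
Interface Peer-IP                                 State LearnType Uptime   Router-Mac
--------- --------------------------------------  ----- --------- -------- -----------------
nve1      10.255.255.6                            Up    CP        3w6d     5006.0000.1b08
nve1      10.255.255.101                          Up    CP        4w4d     0200.0aff.ff65
"""


def parse_nve_peers(text):
    """Map each peer IP to its state and learn type."""
    peers = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[0].startswith("nve"):
            peers[fields[1]] = {"state": fields[2], "learn_type": fields[3]}
    return peers


def peer_report(peers, expected):
    """Compare observed peers with the VTEPs expected to share VNIs."""
    missing = [ip for ip in expected if ip not in peers]
    down = [ip for ip in expected if ip in peers and peers[ip]["state"] != "Up"]
    return {"missing": missing, "down": down}


if __name__ == "__main__":
    # Hypothetical expectation list: VTEPs known to share at least one VNI.
    print(peer_report(parse_nve_peers(SAMPLE), ["10.255.255.6", "10.255.255.101"]))
```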

4. Examine BGP EVPN Routes

The final and most detailed verification is examining the BGP EVPN routes:

dc01-r01-leaf01# show bgp l2vpn evpn vni 9003911
BGP routing table information for VRF default, address family L2VPN EVPN
BGP table version is 19504, Local Router ID is 10.255.255.3
Status: s-suppressed, x-deleted, S-stale, d-dampened, h-history, *-valid, >-best
Path type: i-internal, e-external, c-confed, l-local, a-aggregate, r-redist, I-injected
Origin codes: i - IGP, e - EGP, ? - incomplete, | - multipath, & - backup, 2 - best2
 
   Network            Next Hop            Metric     LocPrf     Weight Path
Route Distinguisher: 10.255.255.3:3    (L3VNI 9003911)
*>l[2]:[0]:[0]:[48]:[5003.0000.1b08]:[0]:[0.0.0.0]/216
                      10.255.255.101                    100      32768 i
 
What to check:
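  • Valid, best routes (* marks a valid path, > marks the best path)
  • Proper RD (Route Distinguisher) and RT (Route Target) values
  • Expected route types for your use case:
    • Type 2 routes (MAC/IP Advertisement) for MAC and host IP learning
    • Type 3 routes (Inclusive Multicast Ethernet Tag) for BUM replication between VTEPs
    • Type 5 routes (IP Prefix) for IP prefix advertisement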
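To make the route types easier to spot in long tables, the bracketed route key can be decoded with a few lines of code. This is a sketch: the type names come from the EVPN standards (RFC 7432 and RFC 9136), the Type 2 field positions are read off the sample output in this guide, and other route types are returned with their fields undecoded.

```python
import re

ROUTE_TYPES = {
    1: "Ethernet Auto-Discovery",
    2: "MAC/IP Advertisement",
    3: "Inclusive Multicast Ethernet Tag",
    4: "Ethernet Segment",
    5: "IP Prefix",
}


def parse_route_key(line):
    """Decode a bracketed EVPN route key such as '[2]:[0]:[0]:[48]:[mac]:[0]:[ip]/216'."""
    fields = re.findall(r"\[([^\]]*)\]", line)
    if not fields or not fields[0].isdigit():
        raise ValueError(f"no EVPN route key found in: {line!r}")
    route_type = int(fields[0])
    route = {
        "type": route_type,
        "name": ROUTE_TYPES.get(route_type, "unknown"),
        "fields": fields[1:],
    }
    # Type 2 layout as seen in this guide: ESI, Ethernet tag, MAC length, MAC, IP length, IP.
    if route_type == 2 and len(fields) == 7:
        route["mac"] = fields[4]
        # An IP length of 0 means the route carries a MAC only.
        route["ip"] = fields[6] if fields[5] != "0" else None
    return route


if __name__ == "__main__":
    print(parse_route_key("*>l[2]:[0]:[0]:[48]:[5003.0000.1b08]:[0]:[0.0.0.0]/216"))
```

Leading status characters such as *>l are ignored because only the bracketed fields are extracted.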

For detailed inspection of specific routes:

dc01-r01-leaf01# show bgp l2vpn evpn 10.10.101.17 vrf DB
Route Distinguisher: 10.255.255.3:3    (L3VNI 9003911)
BGP routing table entry for [2]:[0]:[0]:[48]:[5001.0011.0000]:[32]:[10.10.101.17]/272, version 2495
Paths: (2 available, best #1)
Flags: (0x000202) (high32 00000000) on xmit-list, is not in l2rib/evpn, is not in HW
Multipath: iBGP
  Advertised path-id 1
  Path type: internal, path is valid, is best path, no labeled nexthop
             Imported from 10.255.255.5:32868:[2]:[0]:[0]:[48]:[5001.0011.0000]:[32]:[10.10.101.17]/272
 

Advanced verification: Check for proper import/export of routes, correct extended communities (RT values), and EVPN encapsulation types.

Common Issues and Troubleshooting Tips

 

  1. Underlay connectivity problems:
    • Check physical interfaces for errors
    • Verify MTU consistency across the fabric
    • Ensure IGP adjacencies are stable
  2. EVPN peering issues:
    • Verify BGP authentication if used
    • Check for route-map policies that might be filtering routes
    • Confirm ASNs are configured correctly
  3. VNI provisioning failures:
    • Look for configuration mismatches between switches
    • Verify VLAN to VNI mappings are consistent
    • Check for resource constraints (hardware limitations)
  4. Traffic forwarding problems:
    • Verify end-to-end MTU (jumbo frames typically required)
    • Check hardware programming of MAC/IP entries
    • Confirm symmetric routing for L3 traffic
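The MTU items above come down to simple arithmetic: VXLAN wraps the original Ethernet frame in VXLAN, UDP, and outer IP headers, so the underlay must carry 50 bytes more than the host IP MTU with an IPv4 underlay. The sketch below uses the standard header sizes and assumes no inner 802.1Q tag is carried; the function names are illustrative.

```python
# Standard header sizes in bytes.
INNER_ETHERNET = 14  # original frame's Ethernet header, carried inside the tunnel
VXLAN_HEADER = 8
UDP_HEADER = 8
OUTER_IPV4 = 20
OUTER_IPV6 = 40


def vxlan_overhead(ipv6_underlay=False):
    """Bytes added on top of the host's IP packet, measured at the underlay IP layer."""
    outer_ip = OUTER_IPV6 if ipv6_underlay else OUTER_IPV4
    return INNER_ETHERNET + VXLAN_HEADER + UDP_HEADER + outer_ip


def required_underlay_mtu(host_mtu, ipv6_underlay=False):
    """Smallest underlay IP MTU that carries host_mtu without fragmentation."""
    return host_mtu + vxlan_overhead(ipv6_underlay)


def max_host_mtu(underlay_mtu, ipv6_underlay=False):
    """Largest host IP MTU that fits inside the given underlay IP MTU."""
    return underlay_mtu - vxlan_overhead(ipv6_underlay)


if __name__ == "__main__":
    print(required_underlay_mtu(1500))  # 1550
    print(max_host_mtu(9216))           # 9166
```

With the 9216-byte MTU shown on nve1 earlier, hosts using 9000-byte jumbo frames fit comfortably, which is why jumbo frames on the underlay are the usual recommendation.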

Best Practices for EVPN/VXLAN Deployments

  1. Use consistent naming conventions across your fabric
  2. Implement proper IP addressing scheme with dedicated ranges for underlay and overlay
  3. Standardize VNI allocation (e.g., L2VNI = fixed base + VLAN ID, so VLAN 100 maps to VNI 900100 as in the examples above)
  4. Document RD/RT allocation strategy to avoid conflicts
  5. Configure BFD for faster failure detection where appropriate
  6. Monitor fabric for MTU issues as they can be difficult to troubleshoot
  7. Back up configurations regularly using automation tools
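A standardized VNI allocation is easy to enforce in code. The sketch below assumes a base of 900000, inferred from the VLAN 100 to VNI 900100 mapping in the sample output; your own scheme may differ, and the function names are illustrative.

```python
# Assumed base, inferred from VLAN 100 -> VNI 900100 in the sample output.
L2VNI_BASE = 900000
MAX_VNI = 2**24 - 1  # the VNI field is 24 bits wide


def l2vni_for_vlan(vlan_id, base=L2VNI_BASE):
    """Derive the L2 VNI for a VLAN under the 'base + VLAN ID' convention."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError(f"invalid VLAN ID: {vlan_id}")
    vni = base + vlan_id
    if vni > MAX_VNI:
        raise ValueError(f"VNI {vni} exceeds the 24-bit range")
    return vni


def nonconforming_mappings(mappings, base=L2VNI_BASE):
    """Return (vlan, vni) pairs that deviate from the convention."""
    return [
        (vlan, vni)
        for vlan, vni in sorted(mappings.items())
        if vni != l2vni_for_vlan(vlan, base)
    ]


if __name__ == "__main__":
    # VLAN 102 is a hypothetical misconfiguration added for illustration.
    print(nonconforming_mappings({100: 900100, 101: 900101, 102: 900201}))
```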

Conclusion

EVPN/VXLAN fabric verification requires a methodical approach, working from the underlay up through the overlay components. By following this structured checklist, you can ensure your fabric is healthy and operating as expected.

Remember that while individual commands are useful, the real power comes from correlating information across different layers of the fabric. Developing a good understanding of how EVPN routes translate to forwarding decisions will make you much more effective at troubleshooting complex issues.

For your specific environment, consider building automated verification scripts that can regularly check the health of your fabric and alert on any deviations from the expected state.
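As a starting point for such scripts, the skeleton below compares an expected state with an observed one and collects deviations per check. It is a sketch under stated assumptions: the state keys are hypothetical, and gathering the observed values from your switches is deliberately left out.

```python
def diff_state(expected, observed):
    """Describe every key whose observed value differs from the expected one."""
    problems = []
    for key, want in expected.items():
        got = observed.get(key, "<missing>")
        if got != want:
            problems.append(f"{key}: expected {want}, got {got}")
    return problems


def run_checks(checks):
    """Run each named check and keep only those that report problems.

    `checks` maps a name to a zero-argument callable returning a list of
    problem strings. A check that raises is reported instead of aborting
    the whole run.
    """
    report = {}
    for name, check in checks.items():
        try:
            problems = list(check())
        except Exception as exc:
            problems = [f"check failed to run: {exc}"]
        if problems:
            report[name] = problems
    return report


if __name__ == "__main__":
    # Hypothetical state keys and values, for illustration only.
    expected = {"ospf:dc01-r01-leaf01": "FULL", "vni:900100": "Up"}
    observed = {"ospf:dc01-r01-leaf01": "FULL", "vni:900100": "Down"}
    print(run_checks({"fabric-state": lambda: diff_state(expected, observed)}))
```

An empty report means no deviations were found; anything else can be passed to whatever alerting you already use.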