ISIS NSF

Introduction

ISIS is the preferred routing protocol for SD-Access (SDA). Roughly speaking, SDA is similar to a routed access design, and fabric edge nodes can be thought of as the access switches of a traditional flat network. Many companies buy multiple switches and deploy them in stacks using Cisco StackWise technology. This has the usual benefits of stacking, namely collapsing all of the switches in the stack into a single management and control plane. With SDA we might see something like the topology below.

[Figure: isis-nsf-topology]

Here a border node connects to an edge node over two routed links configured with the ISIS point-to-point network type. Much like MPLS L3VPNs, where the IGP provides reachability between the PE nodes' loopback addresses for BGP, SDA uses ISIS to provide reachability between the fabric nodes' loopbacks, which serve as the RLOC for LISP and the VTEP for VXLAN.

Focusing on the edge node, the high availability of the stack is implemented much like in chassis-based switches with dual RPs. One switch has the active role and acts as the running RP, while another switch is elected as standby for the active switch. The active switch holds both the management plane and the control plane. If it fails, a switchover (using SSO) happens: the standby switch becomes the active switch and takes over the roles of the failed device. This means that it must also take control of ISIS, populate the RIB, and install forwarding entries in the FIB. Non-Stop Forwarding (NSF) can be used to minimize the downtime otherwise caused by the RP switchover. Two solutions for NSF with ISIS exist. Both are described in the next sections.

Cisco Solution

This stateful solution works by synchronising the adjacency state and the LSDB to the standby switch. The neighbor is not involved in the process, so the outcome depends only on how quickly the switchover completes. Because the switchover is stateful, the switch can carry on towards the neighbor as if nothing had happened.

IETF Solution

The standards-based solution requires active participation from both the router with the failed RP (NSF-capable) and its neighbor (NSF-aware). A new TLV, the Restart TLV (type 211), was added to ISIS to support this solution.

A restarting router (one failing over from the active to the standby RP) sends an IIH with the RR (Restart Request) bit set in the Restart TLV. This makes the receiving neighbor keep the adjacency in the “up” state and send an IIH in response with the RA (Restart Acknowledgement) bit set to acknowledge the restart. Along with the RA flag, the neighbor tells the restarting router the remaining time, which is the time allowed for its recovery before the adjacency is torn down.
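To make the RR/RA exchange more concrete, here is a minimal Python sketch of how the Restart TLV could be built and parsed. It assumes the layout I understand RFC 5306 to define: one flags octet (RR = 0x01, RA = 0x02), followed by an optional 2-octet remaining time in seconds and an optional restarting neighbor system ID. The function names and the example system ID are made up for illustration, and this is not how any particular ISIS implementation is written.

```python
import struct

RESTART_TLV_TYPE = 211
FLAG_RR = 0x01  # Restart Request (assumed bit position, per RFC 5306)
FLAG_RA = 0x02  # Restart Acknowledgement (assumed bit position, per RFC 5306)


def encode_restart_tlv(rr=False, ra=False, remaining_time=None, neighbor_id=b""):
    """Build a Restart TLV as type, length, value bytes."""
    flags = (FLAG_RR if rr else 0) | (FLAG_RA if ra else 0)
    value = bytes([flags])
    if remaining_time is not None:
        # Remaining time is a 2-octet value in network byte order.
        value += struct.pack("!H", remaining_time) + neighbor_id
    return bytes([RESTART_TLV_TYPE, len(value)]) + value


def decode_restart_tlv(data):
    """Parse a Restart TLV into a dict; raises ValueError on bad input."""
    if len(data) < 3 or data[0] != RESTART_TLV_TYPE:
        raise ValueError("not a Restart TLV")
    length = data[1]
    value = data[2:2 + length]
    if length < 1 or len(value) != length:
        raise ValueError("truncated Restart TLV")
    tlv = {
        "rr": bool(value[0] & FLAG_RR),
        "ra": bool(value[0] & FLAG_RA),
        "remaining_time": None,
        "neighbor_id": b"",
    }
    if length >= 3:
        tlv["remaining_time"] = struct.unpack("!H", value[1:3])[0]
        tlv["neighbor_id"] = value[3:]
    return tlv


def acknowledge(request, holding_time_left, restarter_id):
    """Hypothetical NSF-aware neighbor: answer an RR with an RA."""
    if not decode_restart_tlv(request)["rr"]:
        return None  # Nothing to acknowledge.
    return encode_restart_tlv(
        ra=True, remaining_time=holding_time_left, neighbor_id=restarter_id
    )


# The restarting router sends RR; the neighbor replies with RA and the
# time left before it would tear down the adjacency.
request = encode_restart_tlv(rr=True)
reply = acknowledge(request, 27, bytes.fromhex("000000000001"))
```

The restarting router would read `remaining_time` from the reply to learn how long it has to finish resynchronizing before the neighbor gives up on the adjacency.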

After the LSDB has been resynchronized, SPF runs over the freshly learned information and the results are installed in the RIB and FIB. Any routes that are not refreshed are purged after a hold time to limit black holing and routing loops.
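The purge of non-refreshed routes can be thought of as a mark-and-sweep. The Python sketch below only illustrates that idea: the class and method names are hypothetical, the prefixes and next hops are invented, and real platforms handle this inside the RIB/FIB infrastructure rather than in code like this.

```python
class RestartingRib:
    """Toy route table that survives a switchover (illustrative only)."""

    def __init__(self, preserved_routes):
        # prefix -> next hop, as preserved across the switchover.
        self.routes = dict(preserved_routes)
        # Every preserved route is stale until SPF confirms it again.
        self.stale = set(self.routes)

    def refresh(self, prefix, next_hop):
        """Called for each route that SPF produces after resynchronization."""
        self.routes[prefix] = next_hop
        self.stale.discard(prefix)

    def purge_stale(self):
        """Called when the hold time expires; removes unrefreshed routes."""
        purged = sorted(self.stale)
        for prefix in purged:
            del self.routes[prefix]
        self.stale.clear()
        return purged


rib = RestartingRib({"10.0.0.1/32": "192.0.2.1", "10.0.0.2/32": "192.0.2.1"})
rib.refresh("10.0.0.1/32", "192.0.2.5")  # SPF still finds this loopback
purged = rib.purge_stale()               # hold time expired
```

Forwarding continues on the preserved entries in the meantime, which is why stale routes must eventually be removed: a route the network no longer offers would otherwise black hole traffic indefinitely.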

Configuration

Configuring NSF for ISIS is fairly simple:

router isis 1
 nsf ietf

If you prefer the Cisco solution, or your neighbors are not NSF-aware, replace the ‘ietf’ keyword with ‘cisco’.

Conclusion

NSF ensures forwarding of packets during an RP switchover while resuming control plane operation in the background. It is simple to configure and should be considered as part of your ISIS configuration if you have a device that has multiple RPs.

Jacob Zartmann
Passionate Network Engineer thriving for challenges and knowledge.