In this post we’ll cover a couple of topics from the inner workings of the link-state protocols that have always been ambiguous and full of details, and we’ll try to make them as crystal clear as we can.
Both link-state routing protocols take the MTU into account to prevent related problems, mainly the loss of routing information when large routing messages are dropped (consider an oversized OSPF LSU or IS-IS LSP). However, each protocol tests the MTU in a different way, as we’ll see in the upcoming section.
OSPF is usually said to require matching MTUs on both routers before they become adjacent (this is a little bit far from what really happens, as we’ll see later). The MTU size is exchanged in the Interface MTU field of the Database Description packets, so the adjacency can’t be established until the MTU check passes. If the check fails, the neighbors never go beyond the EXSTART state. What really happens is that when a router receives a DBD packet, it checks the Interface MTU field; if the value received from the neighbor is higher than the IP MTU configured on the inbound interface over which the adjacency should be established, the DBD packet is rejected (RFC 2328, section 10.6) and the adjacency never goes beyond the EXSTART state.
NOTE I quote from RFC 2328 section 10.6 “Receiving Database Description Packets”: “If the Interface MTU field in the Database Description packet indicates an IP datagram size that is larger than the router can accept on the receiving interface without fragmentation, the Database Description packet is rejected.”
NOTE To avoid any confusion, you’ll generally find DD, DBD or DDP used to denote Database Description packets.
Cisco has a hack for this operation, which is to simply ignore the MTU mismatch using the “ip ospf mtu-ignore” IOS interface command. This command is only required on the side with the lower IP MTU, since that is the side that rejects the DBD packets as illustrated earlier. Juniper, however, has no workaround for this; you’ll have to match the IP MTU on both sides.
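The check described above is one-sided, which is easy to see in a minimal Python sketch. The function name and the `mtu_ignore` parameter are made up for illustration (the parameter simply mimics the effect of “ip ospf mtu-ignore”); this is not how any vendor implements it.

```python
def accept_dbd(neighbor_mtu: int, local_ip_mtu: int, mtu_ignore: bool = False) -> bool:
    """Return True if a received Database Description packet passes the MTU check.

    neighbor_mtu -- value of the Interface MTU field in the received DBD packet
    local_ip_mtu -- IP MTU of the interface the packet arrived on
    mtu_ignore   -- hypothetical flag standing in for "ip ospf mtu-ignore"
    """
    if mtu_ignore:
        return True
    # RFC 2328 section 10.6: reject only when the neighbor advertises MORE
    # than we can accept without fragmentation.
    return neighbor_mtu <= local_ip_mtu


# Router A has an IP MTU of 1500, router B has an IP MTU of 1400.
print("A accepts B's DBD:", accept_dbd(neighbor_mtu=1400, local_ip_mtu=1500))
print("B accepts A's DBD:", accept_dbd(neighbor_mtu=1500, local_ip_mtu=1400))
print("B with mtu-ignore:", accept_dbd(1500, 1400, mtu_ignore=True))
```

Only router B (the lower MTU) rejects the packet, which is why the command is only needed on that side.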
On the other hand, IS-IS pads the Hello PDUs using Padding TLVs (type code 8). Remember that the TLV length field is 1 byte, so each TLV carries at most 255 bytes of value and multiple Padding TLVs might be used. According to the standard (ISO/IEC 10589), IS-IS routers must be capable of receiving PDUs of 1492 bytes in length (ReceiveLSPBufferSize). If the MTU is mismatched, the hello packets are dropped and thus the adjacency is never established. However, padding every hello PDU indefinitely is considered a waste of resources, so many implementations provide a hack to pad only the first few hello PDUs, or to pad only until the adjacency reaches a certain stage. The Cisco and Juniper implementations do this in different ways.
NOTE Only Hello PDUs are padded for MTU mismatch detection.
NOTE ISO/IEC 10589 calls the MTU the dataLinkBlocksize.
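To illustrate why several Padding TLVs are needed, here is a small Python sketch that fills a given number of bytes with type 8 TLVs. The function and constant names are my own, and real implementations also have to deal with details this sketch ignores (for example a single leftover byte, which cannot hold a 2-byte TLV header).

```python
from typing import List

PADDING_TLV_TYPE = 8   # Padding TLV type code
TLV_HEADER_LEN = 2     # 1 byte type + 1 byte length
MAX_TLV_VALUE_LEN = 255  # largest value a 1-byte length field can describe


def padding_tlvs(bytes_to_fill: int) -> List[bytes]:
    """Build Padding TLVs that together occupy (at most) bytes_to_fill bytes."""
    tlvs = []
    remaining = bytes_to_fill
    while remaining >= TLV_HEADER_LEN:
        value_len = min(MAX_TLV_VALUE_LEN, remaining - TLV_HEADER_LEN)
        # The value of a Padding TLV carries no meaning; zeros are used here.
        tlvs.append(bytes([PADDING_TLV_TYPE, value_len]) + bytes(value_len))
        remaining -= TLV_HEADER_LEN + value_len
    return tlvs


tlvs = padding_tlvs(1000)
print(len(tlvs), "TLVs,", sum(len(t) for t in tlvs), "bytes in total")
```

Filling 1000 bytes takes three full TLVs of 257 bytes each (2 header + 255 value) plus one shorter TLV for the remainder.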
Cisco provides the “no isis hello padding” IOS interface command or the “no hello padding” IOS IS-IS command to pad only the first 5 hello PDUs and stop padding afterwards. Juniper takes a different approach and supports three modes for hello padding: adaptive (pad until the neighbor is up), loose (pad until the neighbor is initializing) and strict (always pad hello PDUs). By default Juniper uses loose hello padding.
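The three Juniper modes can be read as a simple decision on the adjacency state. The Python sketch below is only my interpretation of the mode descriptions above; the state names ("down", "initializing", "up") and the function name are assumptions made for illustration, not JUNOS internals.

```python
def should_pad_hello(mode: str, adjacency_state: str) -> bool:
    """Decide whether the next hello PDU is padded.

    mode            -- "strict", "adaptive" or "loose"
    adjacency_state -- assumed simplified states: "down", "initializing", "up"
    """
    if mode == "strict":
        return True                       # always pad
    if mode == "adaptive":
        return adjacency_state != "up"    # pad until the neighbor is up
    if mode == "loose":
        return adjacency_state == "down"  # pad until the neighbor is initializing
    raise ValueError(f"unknown padding mode: {mode}")


for mode in ("strict", "adaptive", "loose"):
    row = {s: should_pad_hello(mode, s) for s in ("down", "initializing", "up")}
    print(mode, row)
```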
Cisco pads the Hello PDUs up to 1497 bytes since it uses 802.3 LLC Ethernet frames for IS-IS (called SAP in Cisco’s terminology, check the “show clns interface” command). Juniper also uses 802.3 LLC Ethernet frames for IS-IS but pads the Hello PDUs up to 1492 bytes (following the standard), assuming that SNAP Ethernet frames might be used (which has never happened); SNAP adds an additional 5 bytes of overhead (a 3-byte Organizationally Unique Identifier (OUI) followed by a 2-byte Protocol ID). However, JUNOS uses an ISO MTU of 1497, so you won’t face an interoperability issue.
NOTE The 1497 bytes come from subtracting the following fields from the 1518-byte maximum Ethernet frame size: 6 bytes destination MAC address, 6 bytes source MAC address, 2 bytes Length field, 3 bytes for the DSAP, SSAP and Control bytes, and 4 bytes FCS.
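The 1497 and 1492 byte figures are plain subtraction, which the following Python snippet spells out. The dictionary keys are just labels I chose for the frame fields.

```python
MAX_ETHERNET_FRAME = 1518  # bytes, including header and FCS

# Overhead of an 802.3 frame with an 802.2 LLC header
llc_overhead = {
    "destination_mac": 6,
    "source_mac": 6,
    "length": 2,
    "llc_dsap_ssap_control": 3,
    "fcs": 4,
}

# Extra overhead if a SNAP header follows the LLC header
snap_overhead = {
    "oui": 3,
    "protocol_id": 2,
}

llc_payload = MAX_ETHERNET_FRAME - sum(llc_overhead.values())
snap_payload = llc_payload - sum(snap_overhead.values())

print("802.3 LLC payload :", llc_payload)    # 1497
print("802.3 SNAP payload:", snap_payload)   # 1492
```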
NOTE The two 802.2 variants of Ethernet encapsulation (LLC and LLC/SNAP) are not widely used; Ethernet II is the widely used encapsulation (it is the one used for IP packets).
DR vs DIS:
Both OSPF and IS-IS use the concept of a designated router (designated intermediate system in the case of IS-IS) and a pseudonode on broadcast networks to reduce the link-state information flooding/synchronization from O(N^2) to O(N) and to optimize the SPF calculation, where N is the number of nodes. However, the two protocols behave completely differently.
NOTE OSPF elects DR on multiaccess networks (broadcast and NBMA), however IS-IS does not recognize NBMA network types, and thus only elects DIS on broadcast networks.
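To get a feeling for the O(N^2) versus O(N) difference, this short Python sketch counts the relationships that have to be represented on a LAN with and without a pseudonode. The function names are mine, and it is a counting exercise only, not a model of either protocol.

```python
def full_mesh_links(n: int) -> int:
    """Every router is directly paired with every other router: N(N-1)/2."""
    return n * (n - 1) // 2


def pseudonode_links(n: int) -> int:
    """Every router is paired only with the pseudonode: N."""
    return n


for n in (4, 10, 50):
    print(f"N={n:3d}  full mesh={full_mesh_links(n):5d}  pseudonode={pseudonode_links(n):3d}")
```

With 50 routers on the segment the full mesh needs 1225 pairings, while the pseudonode model needs 50.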
The term pseudonode comes from IS-IS; the same logic applies to OSPF, although RFC 2328 does not use that term. The pseudonode is a mechanism that keeps redundant link-state information from being flooded, which reduces flooding and consequently optimizes the SPF calculation. In IS-IS the broadcast network is represented as a pseudonode (a virtual node): each non-DIS router advertises an adjacency only to this pseudonode, while the DIS advertises a pseudonode LSP besides its ordinary one. The pseudonode LSP advertises neighborship to all the routers (including the DIS itself) with a metric of 0, since it is a virtual node. On the other hand, with OSPF the DR originates a network-LSA (Type 2 LSA, similar to the pseudonode LSP) on behalf of the network (pseudonode), and this LSA lists the set of routers (including the Designated Router itself) currently attached to the network.
NOTE In IS-IS a nonzero Circuit ID field in the LSP ID is what differentiates a pseudonode LSP from a non-pseudonode LSP; it is chosen by the DIS to be unique among any other LAN circuits for which it is also the DIS at this level. The LSP ID consists of the system ID (if hostname resolution is active it is displayed as the hostname rather than the SysID in the show commands), the circuit ID, and the LSP number fields. Non-pseudonode LSPs have 00 in the Circuit ID field (example: R5.00-00), while a pseudonode LSP has a nonzero Circuit ID (example: R5.02-00, which means that R5 is the DIS). The last part of the ID is the LSP number, which identifies fragments; if only one LSP packet is used it will always be 00.
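As a small illustration of the LSP ID layout, the Python sketch below splits an ID as it appears in show command output into its three parts and tells pseudonode LSPs apart. The class and function names are invented for this example, and it assumes the circuit ID and LSP number are printed as two hex digits each.

```python
from typing import NamedTuple


class LspId(NamedTuple):
    system: str      # system ID or resolved hostname
    circuit_id: int  # 0 for a router's own LSP, nonzero for a pseudonode LSP
    lsp_number: int  # fragment number

    @property
    def is_pseudonode(self) -> bool:
        return self.circuit_id != 0


def parse_lsp_id(text: str) -> LspId:
    """Parse strings such as 'R5.02-00' or '0000.0000.0005.00-01'."""
    head, lsp_number = text.rsplit("-", 1)   # split off the fragment number
    system, circuit_id = head.rsplit(".", 1)  # split off the circuit ID
    return LspId(system, int(circuit_id, 16), int(lsp_number, 16))


for lsp in ("R5.00-00", "R5.02-00", "0000.0000.0005.00-01"):
    parsed = parse_lsp_id(lsp)
    print(lsp, "->", parsed, "pseudonode" if parsed.is_pseudonode else "router LSP")
```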
Now let’s go into details regarding the designated router election and operation in both protocols.
OSPF: DR election is a non-deterministic, non-preemptive, sticky process (the first router to join the network will most probably be the DR, and a newly joining router can’t compete to be the DR until the current DR fails). Because the synchronization process is complex, a BDR is elected as well, and all routers on the LAN synchronize only with the DR and BDR.
NOTE OSPF uses a timer called the WaitTimer, which defines how long a router waits before electing the first designated router on the segment. According to RFC 2328 it is always set to the router dead interval (40 seconds by default), and it helps to guarantee that all operational routers have the opportunity to receive and send hello packets before the election occurs. A router is not allowed to elect a DR or a BDR until it transitions out of the Waiting state (in order to try to keep the first router joining the network from automatically becoming the DR).
The router with the highest priority wins. The priority value ranges from 0 to 255, with 0 meaning ineligible to enter the election. In case of a priority tie, the router with the highest RID wins the election. Once a DR is elected for the segment, the remaining routers then elect the BDR for redundancy, using the same election criteria. If the current DR fails, the BDR transitions to be the DR, and a new election is performed to determine the new BDR, and so on.
NOTE Cisco uses a DR priority of 1, while Juniper uses a DR priority of 128; the RFC doesn’t recommend a value.
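The sticky behavior is easier to follow in code. The Python sketch below is a deliberately simplified model of the rules described above (highest priority, then highest RID, priority 0 ineligible, no preemption, BDR promotion on DR failure); it is not the full election algorithm of RFC 2328 section 9.4, and the names `Router` and `elect` are my own.

```python
import ipaddress
from typing import List, NamedTuple, Optional, Tuple


class Router(NamedTuple):
    rid: str        # router ID in dotted-decimal form
    priority: int   # 0..255, 0 means ineligible


def _key(r: Router) -> Tuple[int, int]:
    # Highest priority wins, highest router ID breaks the tie.
    return (r.priority, int(ipaddress.IPv4Address(r.rid)))


def elect(routers: List[Router],
          current_dr: Optional[Router] = None,
          current_bdr: Optional[Router] = None) -> Tuple[Optional[Router], Optional[Router]]:
    """Return (DR, BDR) for the routers currently alive on the segment."""
    eligible = [r for r in routers if r.priority > 0]
    # Non-preemptive: an existing DR/BDR keeps its role while it is alive.
    dr = current_dr if current_dr in eligible else None
    bdr = current_bdr if current_bdr in eligible and current_bdr != dr else None
    if dr is None and bdr is not None:
        dr, bdr = bdr, None          # the BDR is promoted when the DR fails
    if dr is None:
        dr = max(eligible, key=_key, default=None)
    if bdr is None:
        bdr = max((r for r in eligible if r != dr), key=_key, default=None)
    return dr, bdr


a, b = Router("1.1.1.1", 1), Router("2.2.2.2", 1)
c, d = Router("3.3.3.3", 100), Router("4.4.4.4", 0)
dr, bdr = elect([a, b, c, d])
print("initial:", dr.rid, bdr.rid)

e = Router("5.5.5.5", 255)                      # better router joins later
dr, bdr = elect([a, b, c, d, e], dr, bdr)
print("after E joins:", dr.rid, bdr.rid)        # roles do not change

dr, bdr = elect([a, b, d, e], dr, bdr)          # the DR (C) fails
print("after DR fails:", dr.rid, bdr.rid)
```

Note that the router with priority 255 only gets a role once a position becomes vacant, and even then it becomes the BDR rather than the DR.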
OSPF DROthers form full adjacencies only with the DR and BDR, and accordingly they synchronize their LSDB only with the DR and BDR. The DR and BDR are also adjacent, and the BDR synchronizes with the DR just like the DROthers. Unlike IS-IS (as we’ll see later), OSPF uses two different multicast addresses for its packets (AllSPFRouters: 224.0.0.5 and AllDRouters: 224.0.0.6). The use of a BDR reduces the impact of a failed DR, since with OSPF the database synchronization is a complex multistate process.
IS-IS: DIS election is a deterministic, preemptive process (highest priority, then highest SNPA). There is no backup DIS, so a new DIS is simply elected when the current one goes down (synchronization is fast and simple, unlike OSPF). Whenever the DIS is preempted by a new DIS, the new DIS purges the pseudonode LSP generated by the old DIS and originates its own pseudonode LSP, and all other routers synchronize their LSDB with the new DIS.
NOTE The priority range is between 0 and 127; both Cisco and Juniper default to 64 according to ISO/IEC 10589.
If all interface priorities are the same, the router with the highest subnetwork point of attachment (SNPA) is selected. The SNPA is the MAC address on a LAN, and the local data link connection identifier (DLCI) on a Frame Relay network. If the SNPA is a DLCI and is the same at both sides of a link, the router with the higher system ID becomes the DIS.
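For comparison with OSPF, the DIS election on a LAN can be sketched in a few lines of Python, because it is recomputed from scratch every time and has no memory of the previous winner. The `Interface` type and helper names are assumptions for illustration; only the LAN case (SNPA = MAC address) is modeled.

```python
from typing import List, NamedTuple


class Interface(NamedTuple):
    name: str       # router name, for display only
    priority: int   # 0..127
    mac: str        # SNPA on a LAN, e.g. "00:1b:2c:3d:4e:5f"


def mac_to_int(mac: str) -> int:
    """Convert a MAC address written with ':', '-' or '.' separators to an integer."""
    return int(mac.replace(":", "").replace("-", "").replace(".", ""), 16)


def elect_dis(interfaces: List[Interface]) -> Interface:
    """Highest priority wins, highest MAC address breaks the tie."""
    return max(interfaces, key=lambda i: (i.priority, mac_to_int(i.mac)))


r1 = Interface("R1", 64, "00:00:00:00:00:01")
r2 = Interface("R2", 64, "00:00:00:00:00:02")
print("DIS:", elect_dis([r1, r2]).name)          # tie on priority, R2 has the higher MAC

r3 = Interface("R3", 100, "00:00:00:00:00:00")   # joins later with a higher priority
print("DIS:", elect_dis([r1, r2, r3]).name)      # R3 preempts R2
```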
The hello and hold-time timer values change for the elected DIS. Non-DIS routers use a hold-time of 30 seconds (Cisco) or 27 seconds (Juniper); if the router is elected to be the DIS, the hold-time is reduced to 10 seconds (Cisco) or 9 seconds (Juniper). The hello timer is still (hold-time / 3), which results in a Hello PDU roughly every 3 seconds (for both Cisco and Juniper). These quicker intervals allow the non-DIS routers to notice the loss of the DIS in a timely manner and elect a new DIS.
NOTE With IS-IS neighbors don’t agree on a common hold-time (and accordingly hello timer), each neighbor treats its peer according to the hold-time it advertises in its hello packet.
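The timer relationship above is simple arithmetic, shown here as a Python sketch. The table only restates the defaults quoted in this post; the function names are hypothetical, and the expiry check just illustrates that each neighbor is timed out using the hold-time it advertised itself.

```python
# Default hold-times in seconds, as quoted above
HOLD_TIMES = {
    "cisco":   {"non_dis": 30, "dis": 10},
    "juniper": {"non_dis": 27, "dis": 9},
}


def hello_interval(hold_time: float) -> float:
    """Hello timer derived from the hold-time (hold-time / 3)."""
    return hold_time / 3


def neighbor_expired(last_hello_at: float, advertised_hold_time: float, now: float) -> bool:
    """A neighbor expires when nothing was heard for the hold-time IT advertised."""
    return now - last_hello_at > advertised_hold_time


for vendor, roles in HOLD_TIMES.items():
    for role, hold in roles.items():
        print(f"{vendor:8s} {role:8s} hold={hold:2d}s hello={hello_interval(hold):.2f}s")

# 9.5 seconds of silence: a Juniper DIS (hold 9s) has expired,
# a Cisco non-DIS neighbor (hold 30s) has not.
print(neighbor_expired(0.0, 9, 9.5), neighbor_expired(0.0, 30, 9.5))
```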
Besides advertising the pseudonode LSP, the DIS is the only router eligible to send periodic CSNPs for database synchronization (and as an implicit ack), and it is the only router eligible to reply to PSNPs on the broadcast network. Remember that with IS-IS a single multicast MAC address is used for Level 1 and another one is used for Level 2 (CSNPs, PSNPs, and LSPs are all multicast, and all routers on the broadcast network receive them equally), unlike OSPF, which uses one multicast IP address for messages destined to all SPF routers and another for the DR/BDR.
Finally, in modern networks Ethernet interfaces are in practice used in a point-to-point fashion (Ethernet is dominating thanks to low prices and high speeds). Accordingly, it is recommended to configure OSPF and IS-IS over them as point-to-point links, since after all such a link is no longer a multiaccess network. This way the now unnecessary DR/DIS implications and complexity are eliminated, and things are kept as optimized and simple as possible.
I hope that I’ve been informative.