• MystikIncarnate@lemmy.ca
    2 months ago

    The benefits are pretty simple but have broader implications than what would be apparent on the surface.

    Let me lay down a little groundwork first. Traditionally, with routing protocols, you need to configure a /30 on the link between the connected devices' interfaces before routing will come up. That usually requires IPAM and a lot of record keeping to ensure nothing overlaps.
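    To make the bookkeeping concrete, here's a rough Python sketch of what an IPAM has to hand out for numbered links. The underlay block `10.255.0.0/24`, the device names, and the `allocate_p2p_links` helper are all made up for illustration; a real IPAM would persist this and handle frees and reuse.

```python
import ipaddress

# Hypothetical underlay block reserved for point-to-point links (an assumption).
UNDERLAY_BLOCK = ipaddress.ip_network("10.255.0.0/24")

def allocate_p2p_links(spines, leaves, block=UNDERLAY_BLOCK):
    """Assign one /30 to every spine-leaf link, in order, from the block."""
    subnets = block.subnets(new_prefix=30)
    plan = {}
    for leaf in leaves:
        for spine in spines:
            try:
                net = next(subnets)
            except StopIteration:
                raise ValueError("underlay block exhausted") from None
            spine_ip, leaf_ip = net.hosts()  # a /30 has exactly two usable hosts
            plan[(spine, leaf)] = (net, spine_ip, leaf_ip)
    return plan

plan = allocate_p2p_links(["spine1", "spine2"], ["leaf1", "leaf2", "leaf3"])
for (spine, leaf), (net, spine_ip, leaf_ip) in plan.items():
    print(f"{spine} <-> {leaf}: {net}  {spine_ip} / {leaf_ip}")
```

    Every one of those entries is something you have to look up and retype by hand when a leaf gets swapped.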

    So let’s take the example of a relatively simple spine and leaf topology. A leaf switch dies, or otherwise needs replacing. You set up the new leaf with a template, which contains pretty much all the routing commands you’ll need, and all of your overlay transport, VLAN definitions, and whatever. After that, you need to program the uplink interfaces to the spine(s) - hopefully at least two - in order to get it online.

    If you’re doing a replacement because a switch died, looking up the interface IP assignments for the leaf is going to take a lot of time, never mind programming the addresses and all the possible fat-finger typos that could happen, just to get the switch communicating in your underlay (and to your management systems).

    In small networks it's not a big deal; you’re dealing with maybe a dozen such devices at most. But in large-scale provider, datacenter, or hyperscale networks with literally hundreds of racks, each with a top-of-rack leaf switch, good luck.

    Enter IP unnumbered. Same situation, but now you can pre-prepare any standby switches with unique loopback IPs in the routing system, and mark those addresses in the IPAM as used by a standby device. A failure happens, you grab a standby switch and head to the rack. Next you yank all of the port connections out, plug them into the standby switch, and power it up ASAP. Without you touching the config at all, it brings up routing and comes online, and the NOC can simply apply the port config for that rack on that switch from their management console.
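    As a back-of-the-envelope comparison, here's a small Python sketch counting how many underlay addresses need tracking in each model. The numbers are hypothetical (4 spines, every leaf uplinked to every spine), and `underlay_records` is just an illustrative helper, not anything from a real tool.

```python
def underlay_records(spines, leaves, unnumbered):
    """Count the underlay addresses someone has to track in IPAM."""
    loopbacks = spines + leaves       # one loopback per device either way
    if unnumbered:
        return loopbacks              # links borrow the loopback address
    links = spines * leaves           # assumes every leaf uplinks to every spine
    return loopbacks + 2 * links      # two interface IPs per /30 link

for leaves in (12, 200):
    numbered = underlay_records(4, leaves, unnumbered=False)
    borrowed = underlay_records(4, leaves, unnumbered=True)
    print(f"{leaves} leaves: numbered={numbered}, unnumbered={borrowed}")
```

    With unnumbered links the count grows with the number of devices rather than the number of links, which is why a pre-staged spare only needs its loopback recorded.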

    This can easily cut repair time in half or better.

    Any switch can be moved anywhere in the environment and it will come online right away.

    • iknowitwheniseeit@lemmynsfw.com
      2 months ago

      So this isn’t about routing really, rather about optimizing standby routers for recovery.

      A few things make me nervous.

      First, the description of the work involved seems to imply that your setup really needs more automated tooling. Nontrivial, but you’ve already mentioned typos, and that this is for large operations.

      Second, using IPv4 for your management network is wasteful and needlessly complicated. Even if your customer traffic is all IPv4, there’s really no reason to use legacy protocols for internal routing.

      • MystikIncarnate@lemmy.ca
        2 months ago

        None of this is real, everything I said was hypothetical to demonstrate the point.

        I get what you’re trying to say, but what you’re saying is in favor of unnumbered-compatible routing protocols.

        I do not presently work in a provider or datacenter-scale environment, but in the few where I’ve been able to “peek behind the curtain”, so to speak, the issues I’m pointing at are very real.