Redundancy vs. Resiliency. These two terms are often confused, but they are not the same thing: you can’t have resiliency without redundancy, and both are critical to designing and deploying a highly available network solution. Why should you care? Failure to understand these two basic principles is the basis of every network outage and performance issue.
For the first entry, we’re going to talk about redundancy and resiliency in the physical design. I’m Jim Schrader, and those who know me have heard a lot about network architecture and design. Why? Because in every single instance of a network outage or performance issue, the fault lies in the design of redundancy and resiliency. So, let’s dig a little deeper into these two terms.
Redundancy is the deployment or provisioning of duplicate devices or systems in critical areas so that they can take over active operation if the primary device or system fails.
Resiliency is the ability to recover, converge or self-heal to restore normal operations after a disruptive event. In order for resiliency to exist, we must have redundant systems.
Redundancy & Resiliency in Real Life
Let’s use a simple example to better differentiate these:
Looking at the illustration above, we have redundant fiber optic connections between the two buildings, so we have a redundant solution. But is this a resilient solution? No. We actually have two single points of failure, either of which would render the redundant fiber optic cable useless:
- Demarcation points labeled “D”
- Underground pipe
We tend to call trenchers and backhoes “fiber optic cable finders,” as they seem to be extremely effective at finding buried cable. Demarcation points present the same problem: each building has a single entry point where the two redundant fiber optic cables come together. This allows a single event to take out both fiber optic cables.
Looking at the diagram above, we have met the requirements for redundancy with two fiber optic links. But is this a resilient solution? Yes, as we have addressed the two single points of failure previously noted.
By adding demarcation points, we now have physically separate paths into the building, meaning that an event at any one demarcation point will still leave an alternate path.
More importantly, we have routed each fiber optic cable via a physically separate conduit, ensuring that those cable finders can’t take us out by cutting a single underground pipe. Note that it’s critical to separate the underground pipes by as much distance as possible. Putting them right next to each other would obviously defeat the purpose.
This is a simple example, but it effectively demonstrates that simple redundancy via two devices, systems or fiber optic cables does not always result in resiliency.
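For those who like to see the idea in code, here is a minimal sketch of the same check in Python. It assumes each path between the buildings can be described as the set of physical components it depends on. The component names (fiber-1, demarc-A, conduit-1 and so on) and the function names are hypothetical labels made up for illustration, not taken from any real tool or from the diagrams. Any component that appears in every path is a single point of failure.

```python
def single_points_of_failure(paths):
    """Return the components that every path depends on.

    Losing any one of these components takes out all paths at once.
    """
    if not paths:
        return set()
    return set.intersection(*(set(p) for p in paths))


def is_redundant(paths):
    """Redundant: at least two paths are provisioned."""
    return len(paths) >= 2


def is_resilient(paths):
    """Resilient: redundant, and no single component is shared by all paths."""
    return is_redundant(paths) and not single_points_of_failure(paths)


# First design (hypothetical labels): two fibers that share one
# demarcation point per building and one underground pipe.
first_design = [
    {"fiber-1", "demarc-A", "demarc-B", "conduit-1"},
    {"fiber-2", "demarc-A", "demarc-B", "conduit-1"},
]

# Second design (hypothetical labels): each fiber has its own
# demarcation points and its own conduit.
second_design = [
    {"fiber-1", "demarc-A1", "demarc-B1", "conduit-1"},
    {"fiber-2", "demarc-A2", "demarc-B2", "conduit-2"},
]

for name, design in (("first", first_design), ("second", second_design)):
    print(name, "redundant:", is_redundant(design),
          "resilient:", is_resilient(design),
          "shared:", sorted(single_points_of_failure(design)))
```

Both designs pass the redundancy check, but only the second passes the resiliency check, because the first still shares its demarcation points and conduit. Note that this sketch only catches components shared by name. It knows nothing about physical distance, so two conduits laid right next to each other would still look separate to it.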