Options for High Availability in Network Switches

This article explores the options available for high availability in network switches at four levels: the device, the operating system software, Quality of Service, and configuration changes and notifications.

Gone are the days when we could just plug a cable into the RJ-45 jack of a network switch and forget everything else. Gone, too, are the days when network switches were used only for basic interconnectivity of PCs. Almost every application now runs on the IP backbone. So why not consider the options that are available for high availability of these network switches?

Network switches are expected to be very reliable. They form the most important component of a Local Area Network, yet their availability is often the least planned for. When Layer 3 switches are used at the edge, it is worth knowing the options that are available to keep them up and running as much as possible.

High Availability at the device level:

Ideally, all the components that are prone to failure in a network switch, such as fan trays, power supplies, control modules, interface cards and switch fabrics, should be redundant. They usually are redundant at the core level, and sometimes at the aggregation level as well. It also helps if they are field replaceable and hot-swappable, so that system administrators can replace them on the fly. Certain switches allow an external power supply to be configured for redundancy, so that it takes over if the main power supply fails.
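A rough calculation shows why a redundant component such as a second power supply is worth having. The sketch below assumes that failures are independent and uses a made-up availability figure of 99% for a single supply; neither assumption comes from a vendor datasheet, and the function names are purely illustrative.

```python
def parallel_availability(single: float, count: int) -> float:
    """Availability of `count` redundant units when any one unit is enough.

    Assumes failures are independent, which is an idealisation.
    """
    if not 0.0 <= single <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if count < 1:
        raise ValueError("need at least one unit")
    return 1.0 - (1.0 - single) ** count


def downtime_hours_per_year(availability: float) -> float:
    """Expected downtime in hours over a 365-day year."""
    return (1.0 - availability) * 365 * 24


if __name__ == "__main__":
    for units in (1, 2):
        a = parallel_availability(0.99, units)
        print(f"{units} unit(s): {a:.4%} available, "
              f"{downtime_hours_per_year(a):.2f} h expected downtime per year")
```

With these illustrative numbers, a single supply is expected to be down for about 87.6 hours a year, while a redundant pair brings that to under one hour. Real figures depend on how correlated the failures are, for example when both supplies share one power feed.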

Chassis-based switches allow for redundancy of processor engines, but this always increases the cost. That is acceptable at the core level; at the aggregation and edge levels, some switches support stacking instead. Stacking is the process by which two or more switches are connected together at the backplane level over a high-throughput link, so that they work like a single switch. With stacking, some vendors support redundancy of the route processing engine by making one of them primary and the other secondary. This is useful for failover when the route processing engine of the primary switch fails. The routing functions can also be handed over to a secondary switch when the primary one fails, if a redundancy protocol such as VRRP (Virtual Router Redundancy Protocol) is enabled. At the network level, path resiliency is provided by protocols like Rapid Spanning Tree Protocol (Layer 2) and OSPF (Layer 3).
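The primary/secondary handover can be pictured with a small model. The sketch below is a simplified illustration of the outcome of a VRRP-style election (highest priority wins, the higher IP address breaks a tie). It does not implement the protocol's advertisement packets or timers, and the class, function and switch names are hypothetical.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address
from typing import List, Optional


@dataclass
class Router:
    name: str              # hypothetical label for the example
    priority: int          # higher value is preferred
    address: IPv4Address   # interface address, used only to break ties
    alive: bool = True


def elect_master(routers: List[Router]) -> Optional[Router]:
    """Return the live router that should own the virtual gateway address."""
    candidates = [r for r in routers if r.alive]
    if not candidates:
        return None
    # Highest priority wins; the higher IP address breaks a tie.
    return max(candidates, key=lambda r: (r.priority, r.address))


if __name__ == "__main__":
    primary = Router("switch-a", 200, IPv4Address("10.0.0.2"))
    secondary = Router("switch-b", 100, IPv4Address("10.0.0.3"))
    group = [primary, secondary]
    print(elect_master(group).name)   # switch-a
    primary.alive = False             # simulate a failure of the primary
    print(elect_master(group).name)   # switch-b
```

Hosts keep pointing at the same virtual gateway address throughout; only the switch answering for it changes.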

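With stacking, link aggregation can also be enabled. It bundles two or more network connections between switches into one logical link, with an aggregate data transfer capacity of up to the total capacity of all the member links. If one member link fails, traffic continues over the remaining links.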
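Aggregated links typically spread traffic per flow rather than per packet, so a single conversation stays on one member link while many conversations together can fill the whole bundle. The sketch below illustrates that idea with a CRC32 hash over the addresses and ports. Real switches use their own, usually configurable, hardware hash, so the function name and the choice of hash here are assumptions made for illustration.

```python
import zlib


def pick_link(src_ip: str, dst_ip: str, src_port: int, dst_port: int,
              link_count: int) -> int:
    """Map a flow to one member link index in the range [0, link_count)."""
    if link_count < 1:
        raise ValueError("a link aggregation group needs at least one link")
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}".encode()
    # The same flow always hashes to the same link, which keeps its
    # packets in order; different flows spread across the members.
    return zlib.crc32(key) % link_count


if __name__ == "__main__":
    flows = [("10.0.0.5", "10.0.1.9", 40000 + i, 443) for i in range(6)]
    for flow in flows:
        print(flow, "-> link", pick_link(*flow, link_count=2))
    # After one of the two links fails, every flow maps to the survivor.
    print("after failure -> link", pick_link(*flows[0], link_count=1))
```

One consequence of this design is that a single flow cannot exceed the speed of one member link; the total capacity is reached only when there are enough distinct flows.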

High Availability at the Operating System software level:

The operating system of a network switch can be modular. This means the operating system is subdivided into modules according to their functionality, and each module is a separate entity. If there is a bug or failure in a particular module, the remaining modules are not affected, and the failed module can be restarted independently without bringing the whole switch down. Certain network switch operating systems also allow upgrades without any downtime.
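The restart behaviour of a modular operating system can be sketched with a small supervisor. This is a toy model under stated assumptions: real switch operating systems isolate modules as separate processes with protected memory, whereas here each module is just a Python object, and the Module and Supervisor names are hypothetical.

```python
from typing import Callable, Dict


class Module:
    """A hypothetical OS module with its own private state."""

    def __init__(self, name: str, work: Callable[[int], None]) -> None:
        self.name = name
        self.work = work
        self.ticks = 0   # private state, lost when this module restarts

    def step(self) -> None:
        self.ticks += 1
        self.work(self.ticks)


class Supervisor:
    """Runs every module and restarts only the ones that fail."""

    def __init__(self) -> None:
        self.factories: Dict[str, Callable[[], Module]] = {}
        self.modules: Dict[str, Module] = {}
        self.restarts: Dict[str, int] = {}

    def register(self, name: str, factory: Callable[[], Module]) -> None:
        self.factories[name] = factory
        self.modules[name] = factory()
        self.restarts[name] = 0

    def run_once(self) -> None:
        for name in list(self.modules):
            try:
                self.modules[name].step()
            except Exception:
                # Restart the failed module alone; the others keep running
                # with their state intact.
                self.modules[name] = self.factories[name]()
                self.restarts[name] += 1
```

In use, a module that raises an error is replaced by a fresh instance on the spot, while the counters of the other modules keep advancing undisturbed.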

One of the reasons why a Layer 3 network is deployed at the access/edge level is to have a single control plane from the access layer to the core layer. Such an architecture eliminates the need for Spanning Tree Protocol, and there are only Layer 3 protocols to administer and troubleshoot.

Quality of Service in Network Switches (QoS):

One of the advantages of enabling Quality of Service mechanisms throughout the switching infrastructure is having a common set of queuing, traffic management and congestion control algorithms. It helps if the switches and network OS are from a single vendor, but with open standards this is also possible with network switches from multiple vendors. Applying QoS mechanisms not only enables latency-sensitive applications like voice and video, but also helps ensure that congestion at a single port or a single switch does not degrade the performance of a significant portion of the network.
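As an illustration of queuing, the sketch below models strict-priority scheduling, one of several disciplines a switch port may use. It is a minimal example under assumptions: the class name, the three traffic classes and the packet labels are made up, and real switches implement this in hardware, often combined with weighted schedulers so that low-priority traffic is not starved.

```python
from collections import deque
from typing import Deque, List, Optional


class StrictPriorityScheduler:
    """One FIFO queue per class; class 0 is always served first."""

    def __init__(self, levels: int) -> None:
        if levels < 1:
            raise ValueError("need at least one priority level")
        self.queues: List[Deque[str]] = [deque() for _ in range(levels)]

    def enqueue(self, priority: int, packet: str) -> None:
        self.queues[priority].append(packet)

    def dequeue(self) -> Optional[str]:
        # Scan from the highest priority down and send the first packet found.
        for queue in self.queues:
            if queue:
                return queue.popleft()
        return None   # nothing waiting on this port


if __name__ == "__main__":
    port = StrictPriorityScheduler(levels=3)   # 0 = voice, 1 = video, 2 = bulk
    port.enqueue(2, "bulk-1")
    port.enqueue(0, "voice-1")
    port.enqueue(2, "bulk-2")
    port.enqueue(1, "video-1")
    while (packet := port.dequeue()) is not None:
        print(packet)   # voice-1, video-1, bulk-1, bulk-2
```

The voice packet leaves first even though it arrived after a bulk packet, which is what keeps latency low for that class when the port is congested.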

Configuration Changes and Notifications:

When a configuration change has to be made, some network operating systems support trying out the changes in a virtual environment, for example to check the syntax, without affecting the live configuration. Some even offer an optional feature that restores the previous stable configuration if the changes cause errors.
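The two ideas in this paragraph, checking a change before it goes live and falling back to the last stable configuration, can be sketched as follows. This is a generic model and not the command set of any particular network OS; the ConfigStore class and the validate and healthy callbacks are hypothetical names introduced for the example.

```python
from typing import Callable, Dict

Config = Dict[str, str]


class ConfigStore:
    """Holds the live configuration and applies changes cautiously."""

    def __init__(self, running: Config) -> None:
        self.running: Config = dict(running)

    def commit(self, changes: Config,
               validate: Callable[[Config], bool],
               healthy: Callable[[Config], bool]) -> bool:
        # Build a candidate copy, so the live configuration is untouched
        # while the change is being checked.
        candidate = {**self.running, **changes}
        if not validate(candidate):
            return False
        previous = self.running
        self.running = candidate
        # If the switch misbehaves after the change, restore the last
        # stable configuration automatically.
        if not healthy(self.running):
            self.running = previous
            return False
        return True
```

A caller would pass a syntax check as validate and a post-change probe, such as a reachability test, as healthy; a False return in either case leaves the previous configuration in force.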

Most network switch operating systems generate alerts or automatically cut off certain systems if parameters such as temperature or CPU usage go beyond threshold levels. In the case of high CPU usage, the switch can automatically cut off some low-priority processes to ensure continuity.
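A threshold check of this kind can be sketched in a few lines. The metric names, limits and process priorities below are invented for illustration, and the functions are hypothetical; an actual switch would read these values from its sensors and process table and raise the alert through its own notification mechanism.

```python
from typing import Dict, List


def check_thresholds(readings: Dict[str, float],
                     limits: Dict[str, float]) -> List[str]:
    """Return one alert message per reading that exceeds its limit."""
    alerts = []
    for name in sorted(readings):
        limit = limits.get(name)
        if limit is not None and readings[name] > limit:
            alerts.append(f"ALERT {name}: {readings[name]} exceeds {limit}")
    return alerts


def processes_to_stop(priorities: Dict[str, int], minimum: int) -> List[str]:
    """Names of processes whose priority is below `minimum`.

    A higher number means a more important process in this sketch.
    """
    return sorted(name for name, prio in priorities.items() if prio < minimum)


if __name__ == "__main__":
    limits = {"cpu_percent": 90.0, "temperature_c": 70.0}
    readings = {"cpu_percent": 97.0, "temperature_c": 45.0}
    for alert in check_thresholds(readings, limits):
        print(alert)
    # Under CPU pressure, shed everything below priority 5.
    print(processes_to_stop({"routing": 10, "stats": 1, "web_ui": 2}, minimum=5))
```

Here only the CPU reading triggers an alert, and the routing process is kept while the two low-priority processes are selected for shutdown.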

excITingIP.com
