4 Questions Network/System Architects Should Answer About Scheduling

Scheduling in a switch/router is a subject about which much has been written over the last two-plus decades. In fact, ever since the days of ATM networks in the early ’90s, switch scheduling has kept many a network engineer engaged in heated discussion! Initially at the ATM Forum and later at the IETF, network and systems engineers have spent a good deal of time debating its fine points in Working Groups such as Diffserv, Traffic Engineering, and MPLS, to name just a few.

What’s With Scheduling?

Yet scheduling is still front-and-center in many discussions of “Quality-of-Service” or “QoS” today, and is more important than ever in a world of real-time video, multi-player gaming apps, HD video conferencing, and high-frequency trading. 🙂 (Case in point: HotNets 2015 had a paper titled “Universal Packet Scheduling” by researchers at UC Berkeley that examines whether there is a scheduling discipline that can emulate, to within some reasonable approximation, several other scheduling disciplines.)

After all, if your traffic isn’t scheduled quite right at the switch/router level, getting good performance at the network level is nearly impossible.

What’s Implemented Out There?

Thus, it was interesting when this subject surfaced again in the Carrier Ethernet Group earlier this year http://bit.ly/1JZP6nX (thanks Shaam Naragund!). This led to some very interesting discussions and even more questions and knowledge-sharing, which inspired me to make another one of my “Network Architecture & Design Series” videos, where we specifically answer the following 4 questions:

  1. Which scheduling mechanisms do operators use most often?
  2. Which scheduling mechanisms are widely implemented in switching/routing systems?
  3. Why are different mechanisms used in different parts of the network?
  4. When is queueing & scheduling even relevant?

What Does Scheduling Depend On?

It turns out that the scheduling mechanism used depends on 3 factors: the network segment under consideration, the capabilities of the gear deployed, and the nature of the applications running on the network. As such, different mechanisms are required (and thus implemented) in different segments of the network: the access, the metro/edge, and the core.

Which Mechanisms Are Most Common in Switch/Routers?

To appreciate why that is, we first need to list the mechanisms most commonly found in devices. These include:

  1. FIFO (First-In, First-Out)
  2. RR/WRR (Round Robin, Weighted Round Robin)
  3. WFQ/CB-WFQ (Weighted Fair Queueing, Class-Based Weighted Fair Queueing)
  4. DRR/WDRR (Deficit Round Robin, Weighted Deficit Round Robin)
  5. PQ/SP/LLQ + RR/WRR/WFQ/DRR/WDRR (Priority Queueing/Strict Priority/Low Latency Queueing + any of the earlier mechanisms)
  6. P-WFQ (Priority-Weighted Fair Queueing) – a set of priority queues, with WFQ or a variant used between subsets of queues at the same priority.
  7. Hierarchical Scheduling: where there are multiple levels of schedulers, apportioning the link bandwidth at successive levels of the hierarchy.
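
To make one of the mechanisms above concrete, here is a minimal Python sketch of Deficit Round Robin (item 4). It is an illustration under simplifying assumptions, not how any particular switch/router implements it: the function name drr_schedule, the flow names, the packet sizes, and the 500-byte quantum are all made up for the example, and every packet is assumed to be queued before scheduling starts.

```python
from collections import deque


def drr_schedule(queues, quanta):
    """Return the order in which packets leave under Deficit Round Robin.

    queues: dict mapping a flow name to a list of packet sizes in bytes
    quanta: dict mapping a flow name to its per-round quantum in bytes
            (equal quanta give plain DRR; unequal quanta give weighted DRR)
    """
    if any(quanta[name] <= 0 for name in queues):
        raise ValueError("every quantum must be positive")

    backlog = {name: deque(sizes) for name, sizes in queues.items()}
    deficit = {name: 0 for name in queues}
    order = []

    # Keep visiting the queues in round-robin order until all are empty.
    while any(backlog.values()):
        for name, q in backlog.items():
            if not q:
                continue  # an empty queue earns no credit
            deficit[name] += quanta[name]
            # Send head-of-line packets while the accumulated credit covers them.
            while q and q[0] <= deficit[name]:
                size = q.popleft()
                deficit[name] -= size
                order.append((name, size))
            if not q:
                deficit[name] = 0  # unused credit is not carried by an idle queue
    return order


# Hypothetical flows and sizes, chosen only for illustration.
example_queues = {"A": [300, 300], "B": [500], "C": [200, 200, 200]}
example_quanta = {"A": 500, "B": 500, "C": 500}
for flow, size in drr_schedule(example_queues, example_quanta):
    print(flow, size)
```

With these made-up inputs, flow A can send only one 300-byte packet in the first round (its remaining 200 bytes of credit do not cover the second packet), while flow C sends two 200-byte packets. The leftover credit carries into the next round, which is how DRR stays fair in bytes rather than in packets.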

In this video http://bit.ly/1UndK4e, we first go over the significance of each of these mechanisms. (The actual operation of these schedulers is explained in a companion video http://bit.ly/1R9qerK, which uses a single simple example to illustrate each of the schedulers above; we have not seen it explained in quite this way before.)

We then present our take, based on our investigations, on which mechanisms are most widely implemented in switches/routers, and we highlight which mechanisms are implemented in different network segments and why.

Of course, in our video we come at the issue from a pragmatic, network practitioner’s perspective, asking (and answering) what is actually (mostly) done today, and attempting to explain why that is so.

I would encourage you to watch the video for the details; it is hard to spill all the beans of an involved video here! 🙂

When is Scheduling Even Needed?

Finally, it behooves us to ask: when is scheduling relevant or needed?

Now, if the service provider offers SLAs that include delay and delay variation (jitter) for different classes of traffic, scheduling algorithms become important, because they are the means used to regulate delay in the system. However, if the SLAs only pertain to frame loss or availability, then queueing and scheduling have relatively little impact, so the service provider does not need to be discerning in its choice of scheduler and could go with the default scheduler of the switch/router line card or interface.
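
To see why the scheduler matters once delay is part of the SLA, here is a small Python sketch comparing FIFO with strict priority on a single link. Everything in it is an assumption made for illustration: the function name departure_times, the 10 Mb/s link rate, the packet sizes, and the simplification that all packets are already queued at time 0 and are never preempted.

```python
def departure_times(packets, link_bps, strict_priority=False):
    """Return the time (in seconds) at which each packet finishes transmission.

    packets: list of (name, size_bytes, priority) in arrival order, all queued
             at time 0; a lower priority number is served first under
             strict priority
    link_bps: link rate in bits per second
    """
    order = list(packets)
    if strict_priority:
        # sorted() is stable, so arrival order is kept within a priority level.
        order = sorted(packets, key=lambda p: p[2])
    clock = 0.0
    done = {}
    for name, size_bytes, _priority in order:
        clock += size_bytes * 8 / link_bps  # serialization time of this packet
        done[name] = clock
    return done


# Hypothetical backlog: three 1500-byte data frames queued ahead of one
# 200-byte voice frame on a 10 Mb/s link.
backlog = [
    ("data1", 1500, 1),
    ("data2", 1500, 1),
    ("data3", 1500, 1),
    ("voice", 200, 0),
]
fifo = departure_times(backlog, 10_000_000)
prio = departure_times(backlog, 10_000_000, strict_priority=True)
print(f"voice under FIFO:            {fifo['voice'] * 1000:.2f} ms")
print(f"voice under strict priority: {prio['voice'] * 1000:.2f} ms")
```

In this toy case the voice frame leaves after about 3.76 ms under FIFO but after about 0.16 ms under strict priority, while the last data frame finishes at the same time either way. The scheduler changes who waits, not how much total work the link does, which is why it matters for delay and jitter SLAs but much less for loss or availability.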

What is Your Experience?

What is your experience with schedulers and scheduling mechanisms? As a system architect, a network architect, a data center architect, or even an ASIC/chip designer, what scheduling algorithms have you implemented? Or, which ones have you analyzed? Which schedulers do you think are most commonly implemented? Why?

I would love to hear from you in the comments below, or in response to the original discussion on the Carrier Ethernet Group http://bit.ly/1JZP6nX!

Best,

-Vishal