Friday, December 9, 2011

Latency Control Is Good, Jitter Control Is Better

Minimizing Latency
There are good reasons to minimize latency and improve reaction time in control systems. Latency is the "deadliest" performance limiter for higher protocol layers on top of Ethernet, such as TCP/IP, which acknowledges frame reception and retransmits failed frame sequences. It is very useful to have a predictable maximum latency in a system that is designed for predictable performance. Latency control plays a significant role in time-sensitive systems. In the IEEE 802 Ethernet community, latency control is frequently taken as a prerequisite for the predictable timing behavior of different traffic classes - streaming data, audio and video (e.g. IEEE AVB). Other, conceptually similar Ethernet-based standards such as ARINC664 (AFDX) provide maximum latency guarantees for every Virtual Link (VL) in the network. In others, such as IEEE DCB, the minimized (but not bounded!) latency is related to the highest-priority traffic, managed using priority-based flow control with separate buffers for different traffic priorities.
System designers and architects using Ethernet are obsessed with latency control, because Ethernet, as an asynchronous technology, was not designed to control jitter, so latency is seen as an addressable network parameter for communication QoS enhancements.

Jitter in Packet-Switched Ethernet Networks
In asynchronous networks, we use "statistical" multiplexing to allow sharing of bandwidth among different functions or users.
Jitter is the variation in packet latency - basically a stochastic process which depends mainly on bandwidth use statistics, packet prioritization and scheduling schemes, and the traffic shaping and policing methodology. Even when traffic sources deliver packets periodically, as a continuous stream, other higher-priority, equal-priority and even lower-priority (head-of-line blocking!) network traffic can interfere with and delay packets in a reasonably shaped periodic stream. In this case, the interference with traffic from other input ports causes latency jitter.
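To make the distinction concrete, here is a minimal sketch of how jitter can be quantified from per-packet latency measurements. The latency samples are purely illustrative values, and both the peak-to-peak and standard-deviation measures shown are common conventions, not definitions from any particular standard:

```python
# Illustrative sketch: jitter as the variation of per-packet latency.
# The latency samples (ms) for a periodic stream are hypothetical values.
import statistics

latencies_ms = [0.52, 0.49, 0.88, 0.51, 1.23, 0.50, 0.67, 0.95]

avg = statistics.mean(latencies_ms)
worst = max(latencies_ms)
best = min(latencies_ms)
peak_to_peak_jitter = worst - best               # simple peak-to-peak measure
stdev_jitter = statistics.pstdev(latencies_ms)   # statistical spread

print(f"average latency    : {avg:.2f} ms")
print(f"worst-case latency : {worst:.2f} ms")
print(f"peak-to-peak jitter: {peak_to_peak_jitter:.2f} ms")
```

Note that the worst-case sample (1.23 ms) is far above the average - exactly the burst-induced spread discussed below.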

With excessive jitter or latency swings, timing constraints can be violated for one stream, and this can in turn influence the behavior of other streams. Jitter is therefore hard to predict and makes deterministic communication in open systems virtually impossible.
VLANs and prioritization do not offer reliable control of timing and bandwidth use - hence the term "more deterministic" Ethernet communication.

In order to guarantee a predictable maximum latency for time-sensitive traffic in a multi-hop network, traffic sources should send packets periodically with predefined bandwidth use (traffic shaping!), switches should be architected to have constant store-and-forward latency and to shape and/or police incoming packets, and all traffic within the time-sensitive traffic class should have higher priority than best-effort traffic.

Latency Bounds Do Not Support Jitter Control

Depending on the approach to latency control in time-sensitive applications, the worst-case latency can be several times higher than the average latency, because all NICs can send data at the same time and create traffic bursts.

Having only a maximum bound for latency, we would in fact accept a lot of jitter (latency variability). A maximum latency of e.g. 1.5 ms in a 100 Mbit/s, 5-hop Ethernet network means that the message jitter can be anywhere between tens of microseconds (high overprovisioning, low bandwidth use) and more than a millisecond (a lot of other traffic in the network).
The difference between worst-case and average latency also varies with the traffic shaping approach (e.g. compare IEEE 802.1Qav and ARINC664-P7).
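The arithmetic behind the 5-hop example can be laid out explicitly. Assuming full-size 1500-byte frames and pure store-and-forward switches with no queuing (both simplifying assumptions), the serialization delay alone fixes the best case, and everything between that and the 1.5 ms bound is the potential jitter window:

```python
# Back-of-the-envelope numbers for the 5-hop, 100 Mbit/s example above.
# Frame size and the 1.5 ms bound are illustrative assumptions.

link_rate_bps = 100e6
hops = 5
frame_bits = 1500 * 8          # full-size Ethernet frame

# Serialization (store-and-forward) delay per hop, with no queuing at all:
per_hop_s = frame_bits / link_rate_bps          # 120 microseconds
best_case_ms = hops * per_hop_s * 1e3           # 0.6 ms end to end

max_latency_ms = 1.5                            # the guaranteed bound
jitter_window_ms = max_latency_ms - best_case_ms  # up to ~0.9 ms of jitter

print(f"best case     : {best_case_ms:.2f} ms")
print(f"jitter window : up to {jitter_window_ms:.2f} ms")
```

So even under these idealized assumptions, a latency-only guarantee leaves a jitter window of the same order of magnitude as the bound itself.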

It is true that jitter can be reduced under special conditions in closed architectures, but any change in the operation or bandwidth use of integrated functions will alter the network performance and jitter. Simulation examples of automotive Ethernet networks with jitter of +/- 30% of latency have been demonstrated. IMHO, the ultimate challenge in those cases (based on aerospace industry experience) is to prove that the simulated/calculated and the real network behavior are equivalent under different operation scenarios.

Preemption Does Not Ensure Jitter Control

The preemption mechanisms currently discussed in IEEE 802.1 cannot ensure jitter control in more complex networks where different functions use the highest-priority hard real-time traffic class, as there is no way to exercise preemption between two or more streams of equal priority. Preemption would reliably minimize jitter only if a single hard real-time stream existed, but this is too hard a constraint for real networks.

Jitter Control at Data Link Layer (Layer 2)
Controlling jitter means entering the domain of synchronous communication and TDMA bandwidth partitioning - the domain of lossless time-critical traffic with hard real-time behavior, µs-level jitter and fixed latency.
Clearly, Ethernet is an asynchronous technology, so jitter control can be accomplished only by emulating synchronous communication with asynchronous network traffic at the data link layer (Layer 2).
Jitter control becomes possible by synchronizing all network devices (switches, NICs) and by scheduling traffic using these synchronized devices - so we can better control QoS for different traffic classes at OSI Layer 2 (see SAE AS6802).
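The scheduling idea can be sketched as follows: once all devices share a synchronized clock, each time-critical stream is assigned a fixed transmit offset within a repeating cluster cycle, so streams sharing a link never contend and jitter shrinks to the clock-synchronization precision. The cycle length, stream names and offsets below are hypothetical, not taken from SAE AS6802:

```python
# Sketch of time-triggered (TDMA-style) transmission, assuming all
# devices share a synchronized clock. Each stream gets a fixed offset
# within a repeating cluster cycle, so slots never overlap on a link.

CLUSTER_CYCLE_US = 1000  # hypothetical 1 ms cluster cycle

# stream id -> transmit offset within the cycle (microseconds),
# chosen so that streams sharing a link never collide
schedule = {"streamA": 0, "streamB": 250, "streamC": 500}

def next_tx_time_us(stream: str, now_us: int) -> int:
    """Next synchronized transmit instant for `stream` at or after now_us."""
    offset = schedule[stream]
    # Round up to the next cycle in which the stream's slot has not passed.
    cycles_done = (now_us - offset + CLUSTER_CYCLE_US - 1) // CLUSTER_CYCLE_US
    return max(cycles_done, 0) * CLUSTER_CYCLE_US + offset

# streamB asking at t = 1300 us must wait for its slot at t = 2250 us.
t = next_tx_time_us("streamB", 1300)
```

Because the transmit instant is a pure function of the schedule and the synchronized time, the residual jitter is bounded by the precision of the clock synchronization rather than by the traffic load.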

Her Majesty "Jitter" In Real-Time Systems
In time-critical systems and fast control loops, jitter control is more important than latency, as it determines the control loop quality, the accuracy of data sampling (periodicity), and the efficiency of computing and networking resource use (!) in integrated systems.
Jitter makes up a large portion of the maximum latency defined for faster control loops, so by controlling jitter we can also reduce the latency for critical packet streams.
In addition, the efficiency of resource use is tied to jitter control and translates directly into the efficiency of embedded system virtualization and into system hardware costs. This means we can host more distributed functions with the same amount of computing and networking resources in the system.

Jitter control is essential for the design of fault-tolerant and high-integrity systems, and it simplifies complexity management in fault-tolerant and integrated modular architectures. By controlling jitter, we can simplify the design of robust embedded platforms and critical infrastructure applications.

Furthermore, jitter management in packet-switched Ethernet networks is essential and unavoidable for the integration of different traffic classes (time-critical, time-sensitive and best-effort) into a deterministic unified Ethernet networking capability.