Consensus-Free Convergence

dSDN combines well-known techniques (flooding, source routing, TE, etc.) into a novel synthesis. As a result, convergence in dSDN plays out differently from both cSDN and traditional protocol-based networks. How a network converges after an event (e.g., a failure or a change in link capacity or traffic demand) affects its performance. Hence, as a precursor to our evaluation in §5, we briefly review the convergence behaviors of cSDN and dSDN.

A network's convergence time, \(T_{conv}\), consists of three components that manifest differently in cSDN vs. dSDN:

(1) Propagation Time (\(T_{prop}\)) is the time from when an event occurs to when the TE controller in question learns about the event. In cSDN, link state traverses the cSDN control infrastructure, which consists of both hardware (CPN, servers) and software (operating system components and services such as topology discovery), up to a single centralized controller, taking time \(T_{prop}\). In dSDN, NSUs traverse the data plane, and each router \(R_i\) learns of the changed state after its own \(T_{prop}(i)\).

(2) Computation Time (\(T_{comp}\)) is the time it takes a controller to run a TE computation with the new event information and generate updated paths. In cSDN, a single central controller runs this computation over time \(T_{comp}\). In dSDN, each router runs the TE computation over time \(T_{comp}(i)\) per router \(R_i\), with a start time dependent on \(T_{prop}(i)\). We generally expect similar \(T_{comp}(i)\) across routers.

(3) Programming Time (\(T_{prog}\)) is the time to install computed paths at all routers. cSDN implementations typically do this by programming forwarding rules at each router [11, 25, 27]. This requires care in programming order: a naive approach can create loops or dead ends when some routers on a path have been updated while others have not. The commonly deployed solution is a two-phase programming process that makes each new path before breaking the old one. All paths are programmed in parallel. For each path of length \(n\), (a) all \(n - 1\) transit routers for the path are first programmed in parallel with their next hop. As they finish programming, (b) they send acknowledgements back to the cSDN server. Upon receiving all \(n - 1\) acknowledgements, (c) the cSDN server sends a command to the head-end router to enable the new path and disable the old one. As mentioned in §2.2, each step in this process often requires going through a hierarchy of layers. By contrast, in dSDN, programming is an entirely local process because each controller programs only the paths it originates, and \(T_{prog}(i)\) is the time until all paths originating from router \(R_i\) are established.
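The two-phase process in steps (a) to (c) can be sketched as a small, sequential simulation. This is only an illustration under stated assumptions, not the implementation of any cSDN system: the `Router` class, `two_phase_program`, and the flow and path identifiers are all hypothetical, a path of length \(n\) is read as \(n\) hops (so \(n + 1\) routers, of which \(n - 1\) are transit), and the parallel programming of transit routers is modeled as a plain loop.

```python
from dataclasses import dataclass, field


@dataclass
class Router:
    """Hypothetical router: per-path next hops, plus the active path per flow at a head-end."""
    name: str
    next_hop: dict = field(default_factory=dict)     # path_id -> name of next router
    active_path: dict = field(default_factory=dict)  # flow -> path_id (head-end only)

    def program_transit(self, path_id, nxt):
        # Step (a): install state for the new path; no traffic uses it yet.
        self.next_hop[path_id] = nxt
        # Step (b): the return value stands in for the acknowledgement.
        return (self.name, path_id)


def two_phase_program(routers, flow, path_id, path):
    """Install `path` (router names, head-end first) make-before-break.

    Returns the number of acknowledgements collected from transit routers.
    """
    head, transit = path[0], path[1:-1]

    # Step (a): program all transit routers (in parallel in practice).
    acks = set()
    for i, name in enumerate(transit, start=1):
        acks.add(routers[name].program_transit(path_id, path[i + 1]))

    # Step (b): the controller waits until every transit router has acknowledged.
    expected = {(name, path_id) for name in transit}
    if acks != expected:
        raise RuntimeError("missing acknowledgements; old path stays active")

    # Step (c): only now switch the head-end from the old path to the new one.
    routers[head].next_hop[path_id] = path[1]
    routers[head].active_path[flow] = path_id
    return len(acks)


# Usage: a 3-hop path A -> B -> C -> D has two transit routers (B and C).
routers = {n: Router(n) for n in "ABCD"}
routers["A"].active_path["f1"] = "old"
acks = two_phase_program(routers, "f1", "new", ["A", "B", "C", "D"])
print(acks, routers["A"].active_path["f1"])
```

Because the head-end is switched last, traffic stays on the old path until every transit router already holds forwarding state for the new one, which is what avoids the transient loops and dead ends described above.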

We see the combined effect in Figure 7, which shows the sequence of messages sent and processed for a single path during convergence after a network event in cSDN vs. dSDN. The full view of convergence would show the controller programming all paths in cSDN, and every router in dSDN receiving the event notification and recomputing and reprogramming the paths it originates accordingly.
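One way to write down how the three components combine, assuming the phases run strictly one after another with no overlap (a simplification; the text above only says that \(T_{conv}\) consists of these components), is:

\[
T_{conv}^{cSDN} = T_{prop} + T_{comp} + T_{prog}, \qquad
T_{conv}^{dSDN}(i) = T_{prop}(i) + T_{comp}(i) + T_{prog}(i)
\]

Under the same assumption, the whole dSDN network has converged once its slowest router has finished, i.e., after \(\max_i T_{conv}^{dSDN}(i)\). The superscripts are notation introduced here for illustration only.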

Some final observations. First, since we run the same TE algorithm in both cSDN and dSDN, their routes after convergence are identical. Second, both cSDN and dSDN experience incremental convergence across paths; that is, neither cSDN nor dSDN enjoys simultaneous convergence. Rather, in both, different routers learn their new paths at different times. The reason for this “drift” differs between cSDN and dSDN. In cSDN, it is due to distributed programming, as the controller programs all paths in parallel. In dSDN, it is because head-ends (and hence the paths they compute) converge independently. In this sense, dSDN’s architecture does not introduce a fundamentally different convergence behavior.
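The “drift” in both architectures can be illustrated with a toy calculation. The sketch below assumes the three components simply add up, and every number in it is invented for illustration (integer milliseconds); none of them is a measurement, and the function names are hypothetical.

```python
def csdn_convergence(t_prop, t_comp, t_prog_per_path):
    """Per-path convergence times in cSDN.

    One controller learns of the event (t_prop) and recomputes (t_comp) once;
    paths are then programmed in parallel and finish at different times.
    """
    start = t_prop + t_comp
    return {path: start + t for path, t in t_prog_per_path.items()}


def dsdn_convergence(t_prop, t_comp, t_prog):
    """Per-head-end convergence times in dSDN; each dict is keyed by router."""
    return {r: t_prop[r] + t_comp[r] + t_prog[r] for r in t_prop}


def drift(times):
    """Gap between the first and the last path (or head-end) to converge."""
    return max(times.values()) - min(times.values())


# Made-up inputs, in milliseconds.
csdn = csdn_convergence(500, 2000, {"p1": 300, "p2": 900})
dsdn = dsdn_convergence(
    t_prop={"R1": 10, "R2": 40},
    t_comp={"R1": 2000, "R2": 2000},
    t_prog={"R1": 5, "R2": 5},
)
print(csdn, drift(csdn))
print(dsdn, drift(dsdn))
```

In the cSDN case the drift comes entirely from the per-path programming times, while in the dSDN case it comes from the per-router propagation, computation, and programming times, which mirrors the two causes named above.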

What cSDN and dSDN have in common:
  1. Their routes after convergence are identical, since both run the same TE algorithm.
  2. Neither cSDN nor dSDN enjoys simultaneous convergence: different routers learn their new paths at different times.

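The cause of this path-convergence “drift” differs between cSDN and dSDN:

  • cSDN: programming is distributed, with the controller programming all paths in parallel, so different paths may finish updating at different times.
  • dSDN: each head-end converges independently, so path computation and updates also proceed independently.

So dSDN's architecture does not introduce a fundamentally different convergence behavior: the implementation details differ, but both architectures exhibit drift, with different routers learning their new paths at different times.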