StarCure: Achieving Resilient and Performance-Guaranteed Routing in Space-Terrestrial Integrated Networks

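(1) Background and Problem

  • Background: space-terrestrial integrated networks (STINs) are built around low earth orbit (LEO) satellite constellations and operate in a space environment that changes frequently and is prone to failures
  • Challenges:

    • High dynamics: satellites move at high speed, so the topology changes frequently (e.g., links going up and down)
    • Frequent failures: environmental risks such as space debris collisions and radiation, plus the high failure rate of small satellites themselves
  • Limitations of existing schemes:

    • Reactive routing: recomputes routes after a failure occurs; convergence is slow, which leads to poor network reachability
    • Proactive routing: pre-computes backup routes, but the space of failure scenarios in a STIN is so large that the computation and storage overhead becomes excessive, making it impractical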

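(2) Core Solution: StarCure

The paper proposes StarCure, a new resilient routing mechanism that aims to restore routing quickly while guaranteeing low-latency, high-bandwidth service. It consists of two key techniques:

  • Topology-Stabilizing Model (TSM):

    • Core idea: remove topology uncertainty by introducing the notion of a "logical topology"
    • How it works:
      • Topology changes caused by failures (whether predictable satellite motion or sudden link disruptions) are converted into traffic changes on a stable logical topology
      • For example, a link failure is treated as burst traffic that fills that link's entire bandwidth
  • Adaptive Hybrid Routing Scheme:

    • Basic Routing: for predictable failures (e.g., periodic satellite motion), a constrained optimizer pre-computes routing tables offline, targeting globally optimal performance
    • Protection Routing: for unpredictable sudden failures, a location-based fast re-routing strategy uses local information to restore connectivity quickly, until basic routing finishes recomputing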

Introduction

Thanks to emerging innovations in the aerospace industry, in the past few years we have witnessed the rapid evolution and deployment of satellite Internet constellations (SIC) in low earth orbits (LEO), such as SpaceX’s Starlink [1] and Amazon’s Project Kuiper [2]. Such broadband constellations facilitate the construction of space-terrestrial integrated networks (STINs), regarded as an important direction of the next generation of Internet, promising to realize pervasive, high-throughput, low-latency network services for terrestrial customers [3], [4], [5].

Towards the goals above, network routing plays a critical role in the service quality of STINs, since it not only determines the reachability between any two communication ends in the network, but also affects the achievable network performance perceived by customers. Ideally, a STIN routing mechanism is expected to simultaneously: (i) maintain high network reachability for geo-distributed customers during any period of operation; and (ii) provide low latency and high throughput paths for delivering various Internet traffic over the STIN.

However, due to a series of unique characteristics of LEO satellites, achieving highly available and performant routing is still challenging in STINs. First, the backbone network of a STIN is exposed in outer space, suffering from risks such as debris collisions [6], [7] and radiation hazards [8] etc. Second, LEO satellites are constantly moving at a high velocity in their orbits, and such continuous dynamics can lead to frequent inter-visibility changes and link disruptions. Finally, emerging mega-constellations are constructed by shorter-lifespan small satellites, which significantly reduce the production cost, but are inherently more brittle and prone to failures [9]. All factors above can result in node or link failures in a STIN. How should STIN service providers cope with various network failures effectively in such a failure-prone, intermittent environment?

Today’s widely deployed Internet routing protocols, such as OSPF and ISIS, deal with failures in a reactive manner, relying on global link state advertisements to discover network topology changes and compute correct routing tables. However, reactive solutions have to experience a convergence period jeopardizing routing stability. Since network failures occur frequently and constantly in STINs, directly applying such reactive solutions could lead to incessant routing convergence, resulting in very poor network reachability in STINs.

As alternative solutions, many existing works propose to tackle network failures in a proactive manner and accomplish convergence-free routing [10], [11], [12], [13], [14], [15], [16]. The underlying idea of these efforts is to pre-compute the correct routing tables for possible failure scenarios in advance, and then perform fast re-routing once a real failure happens. However, it is also difficult to directly apply such proactive methods to a STIN environment for two main reasons. First, the combination of huge constellation scale, constant LEO dynamics and the complex error-prone environment jointly create a significantly large amount of possible failure scenarios and topology variations. Pre-computing decisions for all these possible scenarios can involve prohibitive computation overhead. Second, satellites are resource-constrained, and storing too many backup routing tables for all failures at each node can easily overwhelm the storage system of satellites.

In this paper, we present StarCure, a novel resilient routing mechanism for emerging STINs. StarCure aims to achieve fast and efficient routing restoration while maintaining the low-latency, high-bandwidth service capabilities of STINs in error-prone, constantly dynamic, resource-constrained space environments. Specifically, StarCure incorporates two key techniques to cope with various network failures effectively.

First, StarCure adopts a new network model, called the topology-stabilizing model (TSM) (§IV-B), to convert the topology variations under various failure scenarios into traffic variations, and formulates the resilient space routing problem upon a stable network topology. TSM exploits two important insights obtained from STINs: (i) a long-duration traffic demand affected by a predictable failure can be modeled as a series of consecutive demands issued by different source-destination pairs; and (ii) an unexpected link failure can be viewed as a burst of traffic fully exhausting that link. Therefore, with TSM, StarCure converts the original resilient routing problem, which requires pre-calculating routing decisions for a (nearly) infinite number of topology variations, into a dynamic routing scheduling problem upon a stable logical network topology.
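Insight (i) can be illustrated with a small sketch. It assumes that a predictable access schedule (which satellite serves each end of a ground-to-ground flow, and when) is available from orbit data; the `Demand` type, the `split_by_handover` helper, and the schedule format are hypothetical names chosen for illustration, not the paper's notation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Demand:
    src: str          # source node
    dst: str          # destination node
    start: float      # start time in seconds
    end: float        # end time in seconds
    rate_mbps: float  # requested rate


def split_by_handover(user_demand, access_schedule):
    """Split one long demand into consecutive satellite-level demands.

    access_schedule is a list of (start, end, ingress_sat, egress_sat)
    intervals (an assumed format). Each interval that overlaps the user
    demand yields one demand between a different source-destination pair.
    """
    pieces = []
    for start, end, ingress, egress in access_schedule:
        lo = max(start, user_demand.start)
        hi = min(end, user_demand.end)
        if lo < hi:  # keep only intervals that overlap the demand
            pieces.append(Demand(ingress, egress, lo, hi, user_demand.rate_mbps))
    return pieces


flow = Demand("GS-A", "GS-B", 0.0, 100.0, 50.0)
schedule = [(0.0, 40.0, "S1", "S7"), (40.0, 90.0, "S2", "S7"), (90.0, 150.0, "S2", "S8")]
for piece in split_by_handover(flow, schedule):
    print(piece)
```

Seen this way, a handover does not remove anything from the topology: the same 100-second flow simply becomes three back-to-back demands between different satellite pairs.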

Second, to solve the above dynamic routing problem in an efficient and practical manner, StarCure incorporates an adaptive hybrid routing scheme (§IV-C). While TSM enables us to solve the resilient routing problem upon a stable topology, there still remain two practical issues that have to be addressed. On one hand, the conversion by TSM significantly increases the traffic variations and makes it challenging to use standard linear programming to solve the problem efficiently. On the other hand, iterating over all possible traffic demands generated by unexpected failures in advance for pre-calculation can still involve significant computation overhead. To overcome these practical problems, our hybrid routing scheme combines a dynamic-tolerant basic routing, which efficiently adapts to and handles predictable failures with guaranteed network performance, with a location-guided protection routing, which quickly deals with unexpected failures and maintains routing continuity.

To validate the feasibility and effectiveness of StarCure, we implement a StarCure prototype, and build a hardware-in-the-loop testbed which can create a large-scale simulated STIN environment to load real routing software and network traffic for experimentation (§V). Extensive evaluations based on realistic constellation information, traffic patterns and various failure events demonstrate that, as compared to other resilience solutions, StarCure can protect routing against both common, predictable failures and rare, unexpected failures for different constellation topologies, achieving close-to-100% network reachability and better performance after the restoration.

In summary, the contributions of this paper are as follows: (i) we formulate the resilient and performant space routing problem, and quantitatively highlight the technical challenges caused by the combination of LEO dynamics and failure uncertainty (§III); (ii) we present StarCure, a novel routing mechanism that incorporates TSM and an adaptive hybrid routing scheme to achieve resilient and performance-guaranteed routing in failure-prone STINs (§IV); (iii) we implement a StarCure prototype (§V) and conduct extensive evaluations driven by realistic constellation information to demonstrate the feasibility and effectiveness of our solutions (§VI).

Preliminaries for STINs

A. STINs Quick Primer

[Figure 1: A typical STIN architecture]

Figure 1 illustrates a typical architecture of STINs in brief, which integrates two major components to provide Internet service from space for terrestrial users: (i) a space backbone network, which consists of hundreds to thousands of LEO broadband satellites (i.e., satellite routers). These satellites can be equipped with high-speed ground-satellite links (GSL, e.g., Ka/Ku/V-band radio links) and inter-satellite links (ISL, e.g., laser communication links [17]) to construct a high-capacity backbone network to forward data in space; (ii) a ground facility network, including a large number of geo-distributed ground stations (and satellite terminals) to enable terrestrial users and content providers to access the space backbone. The entire STIN runs certain space routing mechanisms (e.g., [3], [14], [18]) to forward network traffic and establish end-to-end communications. Collectively, integrating wide-coverage satellite constellations and high-speed communication links, emerging STINs promise to provide pervasive, high-throughput and low-latency Internet services globally [4], [19].

B. Failure-Prone STIN Environments

Unlike conventional terrestrial networks where the backbone is deployed in a sealed, protected circumstance, the space backbone of a STIN is exposed in a public, uncontrollable environment. Network failures, including node and link failures, are prone to happen due to a series of unique characteristics.

LEO dynamics. LEO satellites fly in low orbital altitude (e.g., 500-1200km [1], [2]) to enable low propagation latency for space-ground communication. These satellites move at a high orbital velocity relative to the earth surface. Due to the LEO dynamics, ground-satellite links in a STIN can experience frequent disruptions and re-associations, resulting in frequent network-wide topology fluctuations. A recent investigation [5] has quantitatively shown that the average space-ground link churn interval could be as low as tens of seconds in Starlink.
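To put numbers on "high orbital velocity", here is a back-of-the-envelope calculation for the 500-1200 km altitude range quoted above. It assumes a circular orbit, a spherical earth with mean radius 6371 km, and the standard gravitational parameter of the earth; the computed speed is the inertial orbital speed, so the speed of the ground track relative to the rotating surface differs somewhat.

```python
import math

MU_EARTH_KM3_S2 = 398600.4418  # standard gravitational parameter of the earth
R_EARTH_KM = 6371.0            # mean earth radius (spherical-earth assumption)


def circular_orbit(altitude_km):
    """Return (speed in km/s, period in s) for a circular orbit at this altitude."""
    radius = R_EARTH_KM + altitude_km
    speed = math.sqrt(MU_EARTH_KM3_S2 / radius)
    period = 2 * math.pi * radius / speed
    return speed, period


for altitude in (500, 1200):
    speed, period = circular_orbit(altitude)
    print(f"{altitude} km: {speed:.2f} km/s, {period / 60:.1f} min per orbit")
```

Under these assumptions the speed is roughly 7.3 to 7.6 km/s and one orbit takes roughly 95 to 110 minutes, which is why any given ground station only sees a satellite for a short window.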

Environmental risks in complex outer space. Satellites working in outer space suffer from a number of environmental risks such as debris collisions and radiation hazards. For example, Kessler Syndrome [6] is a phenomenon in which the amount of junk in orbit around earth reaches a threshold where it creates more and more space debris, causing serious failures for satellites. On February 10, 2009, an inactive Russian communications satellite collided with an active commercial communications satellite operated by Iridium [7]. More recently, a geomagnetic storm doomed 40 Starlink Internet satellites [8].

Small satellite vulnerability. Emerging STINs are built upon small satellites. They involve much lower cost compared to traditional monolithic satellites [20], require a shorter manufacturing period, and can use available commercial-off-the-shelf (COTS) technologies to quickly build their on-board systems. On one hand, such a “build it as cheap as possible” principle indeed accelerates the construction and deployment of STINs. But on the other hand, due to the reduced cost, small satellite systems are more vulnerable to failures [9], and typically have a shorter lifespan and weaker radiation resistance. For example, after its first launch in 2019, SpaceX’s Starlink had already launched over 3,300 small satellites to space as of December 2022, but 353 (∼11%) of the deployed satellites were decaying or had been deorbited by then [21].

In a nutshell, STINs operate in error-prone, constantly dynamic environments. All factors above can lead to network failures. Note that failures caused by predictable LEO dynamics are common, frequent and transient. Other failures due to unexpected factors such as collisions are rare but can lead to permanent, outright errors. We denote the above two classes of failures as predictable and unexpected failures respectively.

Understanding the Problem

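Bad news: road conditions are extremely complex (Failure Model)

The paper splits "the road is broken" in this maze into two very different cases:

  1. "Clocking off on schedule" (predictable failures / Type-I)
    • Cause: satellite motion. For example, a satellite flies to the far side of the earth, or leaves a ground station's field of view
    • Characteristic: the road does break, but fully predictably. Like bus drivers changing shifts, the timetable tells us exactly when a given road will close
  2. "Sudden car crash" (unpredictable failures / Type-II)
    • Cause: accidents. For example, a satellite is damaged by space debris, a solar storm crashes the equipment, or there is simply a software bug
    • Characteristic: completely random; we know neither when nor where it will happen

Goal of the game: be fast and be stable (Problem Formulation)

As the "chief dispatcher", your algorithm has to strike a "perfect balance" (maximizing the utility U in the paper's formulation):

  1. Reachability: however the maze changes and whichever roads break, the parcel must be delivered and must not be lost
  2. Performance: pick the shortest road (low latency), and do not pick a road that is already jammed (high bandwidth / no congestion)
  3. Constraints: every road has a limited load (bandwidth limit), and every station must send out as many parcels as it takes in (flow conservation)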
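One generic way to write such a problem down is as a multi-commodity flow program. The symbols below are illustrative assumptions, not the paper's notation: $G=(V,E)$ is the network, $c_e$ is the capacity of link $e$, demand $k$ asks for rate $D_k$ from $s_k$ to $t_k$, $r_k$ is the rate actually served, $f_k(e)$ is the flow of demand $k$ on link $e$, and $U$ is left abstract as a utility that rewards reachability and low-latency, uncongested paths.

```latex
\begin{aligned}
\max_{f,\,r}\quad & U(f, r) \\
\text{s.t.}\quad
& \sum_{k} f_k(e) \le c_e
  && \forall e \in E \\
& \sum_{e \in \mathrm{out}(v)} f_k(e) - \sum_{e \in \mathrm{in}(v)} f_k(e) =
  \begin{cases}
    r_k  & v = s_k \\
    -r_k & v = t_k \\
    0    & \text{otherwise}
  \end{cases}
  && \forall v \in V,\ \forall k \\
& 0 \le r_k \le D_k, \quad f_k(e) \ge 0
  && \forall k,\ \forall e \in E
\end{aligned}
```

The first constraint is the "limited load per road" rule, and the second is the "parcels in equals parcels out" rule at every station other than the sender and the receiver.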

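The ultimate challenge: why don't today's approaches work?

This part explains why the paper was written: the two mainstream approaches both fall apart in this "space maze":

  • Challenge A: too many maps to store (pre-computation)

    • Suppose we compute every maze map that could appear in the future, and write down a response plan for every "car crash" that could happen
    • Problem: satellites move every second, and with random crashes on top, the number of possible situations is astronomical. The modest memory on a satellite cannot hold that many "backup maps"
  • Challenge B: too slow to react (real-time computation)

    • What if we store no maps and compute a route only after something goes wrong?
    • Problem: the satellite network changes too fast. By the time you notice a broken road, broadcast the news to the whole network, and have everyone compute new routes together, several seconds have passed, and the maze may have changed again. The network ends up permanently "still computing", and parcels cannot get out at all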
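Challenge A is easy to check with a quick count. The sketch below counts how many distinct sets of simultaneously failed links exist when at most `max_failures` links fail at once; the link count of 3168 is a hypothetical figure (for instance 1584 satellites with 4 inter-satellite links each), not a number taken from the paper.

```python
import math


def failure_scenarios(num_links, max_failures):
    """Number of distinct sets of 1..max_failures simultaneously failed links."""
    return sum(math.comb(num_links, k) for k in range(1, max_failures + 1))


NUM_LINKS = 3168  # hypothetical: 1584 satellites x 4 links / 2 endpoints per link
for max_failures in (1, 2, 3):
    count = failure_scenarios(NUM_LINKS, max_failures)
    print(f"up to {max_failures} failed link(s): {count:,} scenarios")
```

Allowing just two concurrent link failures already gives millions of scenarios, and three gives billions. Each scenario would need its own backup table, and that count still has to be multiplied by the number of topology snapshots produced by satellite motion.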

The StarCure Design

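This section (IV. THE STARCURE DESIGN) is the heart of the paper. If the previous section described how hard the maze is, this one is StarCure's walkthrough for beating it

The authors use a good deal of mathematics (linear programming, dual problems) to make the argument rigorous, but behind the symbols the core idea is quite clever and comes down to two moves: a "sleight of hand" and a "combination punch"

Move 1: "sleight of hand" -- the Topology-Stabilizing Model

Core pain point: the satellite network keeps changing (roads keep breaking and reconnecting). Redrawing the map and recomputing navigation on every change is more than the computers can handle.

StarCure's answer: pretend the map has not changed

StarCure introduces TSM (Topology-Stabilizing Model), which turns a physical "structural change" into a logical "traffic change"

  • The old view (physical perspective):

    • "Oh no! The road (link) ahead is broken! The road is gone! Recompute the routes!"
    • Result: the system panics and recomputes globally, which costs time and effort
  • StarCure's view (logical perspective, TSM):

    • It builds a "virtual, always-present, fully connected map"
    • Handling satellite motion (predictable disconnections):
      • Think of a relay race. Satellite A, which was carrying the data, flies away and satellite B takes over. To TSM this is not "the road broke"; the original long-haul delivery has simply been cut into several "relay legs". The road is still there, only the runner has changed
    • Handling sudden failures (unpredictable disconnections):
      • When a link really is destroyed, StarCure does not delete that road from the map. It tells the routing algorithm: "Hey, this road is not broken, but an invisible super-convoy just showed up and jammed it completely (its bandwidth is fully occupied)"
    • Result: the routing algorithm sees "the road is still there, just jammed solid", and naturally looks for another path

Summary: with this sleight of hand, StarCure turns countless changing maps into one fixed map plus changing congestion. For the algorithm, computing "how to route around congestion" is much faster and easier than computing "how to rebuild the map"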
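The "failed link as a fully jammed road" idea can be sketched in a few lines. This is only an illustration of the modeling trick, not the paper's algorithm: a breadth-first search over links with enough spare capacity stands in for the real optimizer, and the five-node topology, the capacities, and the helper names `mark_failed` and `route` are all made up.

```python
from collections import deque


def mark_failed(load, capacity, link):
    """Keep the link on the map, but fill it with burst traffic equal to its capacity."""
    new_load = dict(load)
    new_load[link] = capacity[link]
    return new_load


def route(capacity, load, src, dst, rate):
    """Fewest-hop path that uses only links with at least `rate` spare capacity."""
    adjacency = {}
    for (u, v), cap in capacity.items():
        if cap - load.get((u, v), 0.0) >= rate:
            adjacency.setdefault(u, []).append(v)
            adjacency.setdefault(v, []).append(u)
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:  # walk back to the source
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in adjacency.get(node, []):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None  # no path with enough spare capacity


# Fixed logical topology: undirected links with capacities (hypothetical values).
capacity = {("A", "B"): 10.0, ("B", "D"): 10.0, ("A", "C"): 10.0,
            ("C", "E"): 10.0, ("E", "D"): 10.0}
load = {}
print(route(capacity, load, "A", "D", rate=1.0))  # ['A', 'B', 'D']

load = mark_failed(load, capacity, ("B", "D"))    # the "invisible convoy" arrives
print(route(capacity, load, "A", "D", rate=1.0))  # ['A', 'C', 'E', 'D']
```

Note that `capacity` never changes between the two calls. Only `load` does, which is exactly the shift from "the map changed" to "the traffic changed".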

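Move 2: "combination punch" -- Adaptive Hybrid Routing

With the "fixed map" above in hand, how do we actually navigate? StarCure finds that a single method is not enough; it needs both a planner and a fighter

(1) The "civil official": Basic Routing -- handles the daily routine, computes precisely

  • Target: predictable changes (e.g., handovers caused by satellites following their orbits)
  • How it works:
    • Since satellite trajectories are written in a timetable, we can compute a "perfect schedule" ahead of time, on the ground or during idle periods, using a powerful algorithm (the linear-programming optimization described in the paper)
    • It works like a city bus system. Road conditions differ between rush hours, but the bus lines are planned in advance for the highest efficiency, so that everyone arrives on time
  • Advantages: globally optimal, low latency, high bandwidth utilization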
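The "compute the timetable ahead of time" idea can be sketched as one next-hop table per time slot. This is a simplified illustration under stated assumptions: Dijkstra's lowest-latency path stands in for the paper's constrained optimizer (so it ignores bandwidth), and the snapshot format, node names, and latencies are hypothetical.

```python
import heapq


def next_hops(graph, src):
    """Dijkstra from src; returns {destination: first hop on a lowest-latency path}."""
    dist = {src: 0.0}
    first_hop = {}
    heap = [(0.0, src)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry
        for nxt, latency in graph.get(node, {}).items():
            nd = d + latency
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                first_hop[nxt] = nxt if node == src else first_hop[node]
                heapq.heappush(heap, (nd, nxt))
    return first_hop


def precompute_tables(snapshots, src):
    """Offline step: one next-hop table per predicted time slot."""
    return {slot: next_hops(graph, src) for slot, graph in snapshots.items()}


# Predicted topology per time slot: {node: {neighbor: latency_ms}} (made-up values).
snapshots = {
    0: {"G": {"S1": 2.0, "S2": 5.0}, "S1": {"D": 3.0}, "S2": {"D": 3.0}},
    1: {"G": {"S2": 5.0}, "S2": {"D": 3.0}},  # S1 has moved out of view of G
}
tables = precompute_tables(snapshots, "G")
print(tables[0]["D"], tables[1]["D"])  # S1 S2
```

At run time a node only needs to look up the table for the current slot, so the predictable handover from S1 to S2 triggers no recomputation and no convergence period.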

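(2) The "general": Protection Routing -- handles emergencies, reacts fast

  • Target: unpredictable sudden failures (e.g., a device breaks down or is hit by debris)
  • How it works:
    • Once a sudden failure occurs, the bus route computed by the "civil official" is blocked, and we cannot stop to replan the bus lines (too slow)
    • StarCure switches to "general" mode immediately. Using the satellites' geographic position (GPS coordinates), it picks a nearby detour that brings the packet closer to the destination
    • It works like a veteran taxi driver. The road ahead is suddenly closed, and the driver does not report to headquarters at all: one glance at the map, a turn of the wheel, and the cab cuts through a side street
  • Advantages: no network-wide broadcast, no waiting, millisecond-level reaction; only the local area is affected while everything else keeps running as usual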
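The "taxi driver" behavior resembles greedy geographic forwarding, sketched below. This is a generic textbook technique used here for illustration, not the paper's exact protection routing: positions are made-up 2D coordinates, the four-node topology is hypothetical, and there is no recovery when greedy forwarding gets stuck in a local minimum.

```python
import math


def greedy_next_hop(node, dst, position, neighbors, failed_links):
    """Pick the usable neighbor that is strictly closer to dst than `node` is."""
    best, best_dist = None, math.dist(position[node], position[dst])
    for nbr in neighbors[node]:
        if frozenset((node, nbr)) in failed_links:
            continue  # only locally known failures are needed
        d = math.dist(position[nbr], position[dst])
        if d < best_dist:
            best, best_dist = nbr, d
    return best


def forward(src, dst, position, neighbors, failed_links=frozenset()):
    """Follow greedy next hops; return the path, or None if forwarding gets stuck."""
    path = [src]
    while path[-1] != dst:
        nxt = greedy_next_hop(path[-1], dst, position, neighbors, failed_links)
        if nxt is None:
            return None  # local minimum: no neighbor makes progress
        path.append(nxt)
    return path


position = {"A": (0, 0), "B": (2, 0), "C": (1, 1), "D": (4, 0)}
neighbors = {"A": ["B", "C"], "B": ["A", "C", "D"],
             "C": ["A", "B", "D"], "D": ["B", "C"]}

print(forward("A", "D", position, neighbors))                            # ['A', 'B', 'D']
print(forward("A", "D", position, neighbors, {frozenset(("A", "B"))}))   # ['A', 'C', 'D']
```

Each decision uses only the node's own neighbor list and positions, which is why no network-wide broadcast is needed. The loop always terminates because every hop strictly reduces the distance to the destination.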
