RESEARCH AGENDA
A network such as Starlink raises many research questions, both for the network itself and for the traffic traversing it. For legacy Internet traffic, reordering must be avoided. Delay-based congestion control such as BBR [3] may not perform well over such a network. The network must be resilient to failures, and it must be capable of routing with low delay even when traffic levels are high enough to saturate the best paths. We briefly discuss some of these questions.
Reordering. Reordering on such a network differs from that seen on a terrestrial network. So long as queues are not allowed to build in satellites, reordering is completely predictable, as all routes are known several hundred milliseconds in advance. One solution is to maintain a reorder buffer at the receiving groundstation. Packets that arrive over a lower-delay path are simply queued until their one-way delay matches that of the higher-delay paths. Even with this added delay, the RTT on the 20th best path is still approximately 74 ms, less than the current Internet RTT.
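The delay-equalizing buffer described above can be sketched as follows. This is only an illustration under stated assumptions: the class and parameter names (`DelayEqualizer`, `max_path_delay_ms`, `path_delay_ms`) are hypothetical, delays are expressed in milliseconds, and the receiving groundstation is assumed to know the one-way delay of the path each packet took, which the predictability of routes makes plausible.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class HeldPacket:
    release_ms: float                     # when the packet may leave the buffer
    seq: int                              # tie-breaker: lower sequence number first
    payload: bytes = field(compare=False, default=b"")


class DelayEqualizer:
    """Pad every packet's one-way delay up to that of the slowest path in use."""

    def __init__(self, max_path_delay_ms):
        self.max_path_delay_ms = max_path_delay_ms   # assumed known in advance
        self._heap = []

    def on_arrival(self, seq, arrival_ms, path_delay_ms, payload=b""):
        # A packet from a faster path waits for the delay it "saved",
        # so all packets appear to have used the slowest path.
        wait_ms = self.max_path_delay_ms - path_delay_ms
        heapq.heappush(self._heap, HeldPacket(arrival_ms + wait_ms, seq, payload))

    def release(self, now_ms):
        """Return sequence numbers of all packets due at or before now_ms."""
        out = []
        while self._heap and self._heap[0].release_ms <= now_ms:
            out.append(heapq.heappop(self._heap).seq)
        return out
```

The cost of this simple scheme is that every packet pays the delay of the slowest path, which is why the text next considers a scheme that only holds packets around a path switch.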
We can do even better if the sending groundstation annotates packets with a sequence number, a path ID, and the time $t_{last}$ since it sent the last packet on the previous path. When the sending groundstation switches from a higher-delay path to a lower-delay one, reordering may occur. The receiving groundstation identifies the first packet to arrive on the new path from the change of path ID. Suppose the known difference in path delays is $t_{diff}$. If any preceding packets are missing, the receiving groundstation queues all packets arriving on the new path until either all preceding packets have arrived, or a time equal to $t_{diff} - t_{last}$ has elapsed. After this, all packets sent on the old path should have arrived.
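A minimal sketch of the receiver-side logic for a path switch is given below. The names (`Packet`, `SwitchReorderer`, `on_packet`, `flush`) are hypothetical, times are in seconds, and the sketch assumes `t_diff` is supplied by whatever component knows the path delays. It also assumes, as in the text, that no queues build in satellites, so a gap seen without a path switch is treated as loss rather than reordering.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    seq: int        # sender-assigned sequence number
    path_id: int    # path the packet was sent on
    t_last: float   # time since the sender's last packet on the previous path (s)


class SwitchReorderer:
    """Hold packets from a new, faster path until the old path has drained."""

    def __init__(self):
        self.next_seq = 0       # next sequence number owed to the application
        self.highest_seq = -1   # highest sequence number seen so far
        self.path_id = None     # path of the highest-sequence packet seen
        self.deadline = None    # after this, old-path packets are presumed lost
        self.held = {}          # seq -> Packet awaiting in-order delivery

    def on_packet(self, pkt, now, t_diff):
        """Handle one arrival; t_diff is old-path delay minus new-path delay."""
        if pkt.seq < self.next_seq:
            return []           # duplicate, or arrived after we gave up on it
        if pkt.seq > self.highest_seq:
            switched = self.path_id is not None and pkt.path_id != self.path_id
            if switched and pkt.seq > self.next_seq:
                # First packet on the new path with predecessors still missing.
                self.deadline = now + max(0.0, t_diff - pkt.t_last)
            self.path_id = pkt.path_id
            self.highest_seq = pkt.seq
        self.held[pkt.seq] = pkt
        return self.flush(now)

    def flush(self, now):
        """Deliver in-order packets; skip gaps when no hold is active or it expired."""
        out = []
        while self.held:
            if self.next_seq in self.held:
                out.append(self.held.pop(self.next_seq).seq)
                self.next_seq += 1
            elif self.deadline is None or now >= self.deadline:
                self.next_seq = min(self.held)   # treat the gap as loss
            else:
                break                            # still waiting for the old path
        if not self.held:
            self.deadline = None
        return out
```

In use, `flush(now)` would also need to be called from a timer so that held packets are released when the deadline passes without further arrivals.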
Finally, as the sending groundstation knows future path latency, if a queue has built up there that is longer than the difference in path delays, it may take packets from this queue out of order, sending them over paths of different latency so that they arrive in order at the receiving groundstation. For high-priority latency-sensitive traffic, we would hope that no such queue ever exists, but we expect that a large volume of lower-priority traffic will also be present and fill in around the high-priority traffic. It is this traffic that might use a 20th best path, and it too must not suffer excessive reordering.
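One way to picture the sender-side idea is to treat each (send time, path) pair as a transmission slot with a known arrival time, and hand the queued packets to slots in order of arrival. The sketch below is a toy model, not a description of any real scheduler: time is divided into ticks, each path is assumed to carry one packet per tick, and the function name `plan_sends` and its parameters are hypothetical.

```python
def plan_sends(queue, path_delays, ticks):
    """Map queued packets onto (tick, path) slots so arrival order equals queue order.

    queue       -- sequence numbers in the order they must arrive
    path_delays -- {path_id: one-way delay in ticks}, assumed known in advance
    ticks       -- number of send ticks to plan; one packet per path per tick
    Returns a list of (send_tick, path_id, seq) sorted by send time.
    """
    slots = [(tick + delay, tick, path)          # (arrival tick, send tick, path)
             for tick in range(ticks)
             for path, delay in path_delays.items()]
    slots.sort()                                 # earliest-arriving slot first
    # The i-th packet in the queue takes the i-th earliest-arriving slot.
    plan = [(tick, path, seq) for (_, tick, path), seq in zip(slots, queue)]
    return sorted(plan)
```

With a fast path of 3 ticks and a slow path of 5 ticks, the first tick sends packet 0 on the fast path and packet 2 on the slow path: the sender reaches two packets deep into its queue, matching the two-tick delay difference, yet both paths deliver in sequence order. Packets landing in the same arrival tick still need a sequence-number tie-break at the receiver.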
Failures. Such a network is inherently resilient to failures. If an RF transceiver fails, that satellite can still relay through traffic; there are many other satellites within range of a groundstation, so the impact on coverage is minimal. However, all groundstations need to be informed of any failure, so they can factor it into their routing considerations. If the five transceivers on a satellite are interchangeable, then if one fails, the constellation continues to perform almost unchanged so long as the four remaining transceivers are used for the links along the orbital plane and for the side links. The link between NE-bound and SE-bound satellites is less critical because latency-based routing will often try to avoid such paths (see the latency spike in Figure 7), and other similar-latency paths will normally be available. Again, everyone needs to know about the failure to factor it into routing.
SpaceX have stated that they will have on-orbit spare satellites for each orbital plane: it uses very little fuel to adjust position along an orbital plane, but requires excessive fuel to perform a plane change. However, even without spares, the network has very good redundancy. Gaps in coverage can be routed around; for example, Path 2 in Figure 11 shows the latency achieved between London and New York if all the satellites on Path 1 were unavailable. The same is likely not true, though, for extreme latitudes, where coverage is much sparser. We note that SpaceX propose 75 satellites per orbital plane for the higher inclination orbits, rather than 50 in other orbits; we speculate that this closer spacing may allow laser links to bypass one failed satellite to reach the next.
Load-Dependent Routing. All the simulations above assume that no significant queuing happens in the satellites themselves. For high-priority (likely high-cost) traffic, this can be ensured by admission control, so long as it forms a minority of the traffic. This is a similar model to that used on the terrestrial microwave links used for high-frequency trading. Regular Internet traffic will not get such priority treatment, so a LEO constellation operator needs to perform active traffic engineering to avoid creating hotspots in the network. Others have demonstrated that shortest-path routing on mesh networks is particularly susceptible to creating hotspots [6].
In terrestrial networks, centralized load-dependent routing schemes such as B4 [9] and LDR [7] can proactively route so as to achieve low latency without causing congestion. These schemes, however, make routing decisions on a minute-by-minute basis, which is too slow for routing on dense LEO constellations. It is an open question whether such schemes can be extended for this use, or if the latency between the controller and groundstations will always be too high.
We postulate that a hybrid solution may work well. High-priority low-latency traffic always gets priority: admission control limits its volume, preventing it from causing congestion, and explicit routing ensures it sees minimum latency. For the remaining traffic, satellites monitor link load; this is broadcast to all groundstations globally, so everyone is aware of hotspots. Because of the nature of a LEO constellation, these hotspots tend to be geographic rather than topological. Groundstations then randomize their path choice across slightly less favorable paths to load-balance traffic away from hotspots. In a traditional topology, this would likely lead to instability, where traffic flip-flops between the best path and a worse alternate. As our simulations show, dense LEO constellations have very many paths available, and many of them are of similar latency. This allows groundstations to be much more conservative about when they move traffic back to the lowest-delay path, using timescales much longer than the latency of the broadcast load reports, and so avoiding instability. We believe this is an interesting direction for future routing work on dense LEO constellations.
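The groundstation side of this hybrid scheme might look something like the sketch below. It is purely illustrative: the function and class names, the utilisation threshold `hot_threshold`, the latency tolerance `slack_ms`, and the hold time `hold_s` are all assumptions introduced here, not values from the simulations. The sketch shows the two ingredients the text relies on: random choice among near-best paths that avoid reported hotspots, and a deliberately slow return to the lowest-delay path.

```python
import random


def choose_path(paths, link_load, hot_threshold=0.8, slack_ms=5.0, rng=random):
    """Pick at random among near-best paths that avoid reported hotspots.

    paths     -- list of (latency_ms, [link_id, ...]) candidates to one destination
    link_load -- {link_id: utilisation in [0, 1]} from the latest broadcast reports
    """
    cool = [p for p in paths
            if all(link_load.get(link, 0.0) < hot_threshold for link in p[1])]
    candidates = cool or paths                   # fall back if every path is hot
    best = min(latency for latency, _ in candidates)
    near_best = [p for p in candidates if p[0] <= best + slack_ms]
    return rng.choice(near_best)


class ReturnTimer:
    """Allow a move back to the lowest-delay path only after a long quiet period."""

    def __init__(self, hold_s):
        self.hold_s = hold_s        # meant to far exceed the load-report latency
        self.cool_since = None      # when the best path was first seen cool again

    def may_return(self, best_path_hot, now):
        if best_path_hot:
            self.cool_since = None  # any hot report restarts the wait
            return False
        if self.cool_since is None:
            self.cool_since = now
        return now - self.cool_since >= self.hold_s
```

Randomizing across several similar-latency paths spreads the displaced traffic rather than moving it all to one alternate, and the long hold time keeps a single stale or optimistic load report from pulling traffic straight back onto the hotspot.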