ROUTING

How well does the network above provide low-latency routes? The simplest way to route is for each groundstation to connect to the satellite that is most directly overhead. This has the advantage of providing the best RF signal strength for uplinks and downlinks. We can then run Dijkstra’s algorithm[5] over the satellite network using link latencies as metrics to provide the lowest latency paths.
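
As a rough illustration of the last step, the sketch below runs Dijkstra over a small graph whose edge weights are one-way link latencies in milliseconds. The node names and latency values are invented for the example; only the idea of using link latency as the metric comes from the text.

```python
import heapq

def lowest_latency_path(graph, src, dst):
    """Dijkstra over {node: {neighbor: latency_ms}}; returns (latency, path)."""
    best = {src: 0.0}
    prev = {}
    queue = [(0.0, src)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == dst:
            break
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for nbr, latency in graph.get(node, {}).items():
            cand = dist + latency
            if cand < best.get(nbr, float("inf")):
                best[nbr] = cand
                prev[nbr] = node
                heapq.heappush(queue, (cand, nbr))
    if dst not in best:
        return float("inf"), []
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return best[dst], path[::-1]

# Hypothetical topology: two groundstations and three satellites.
links = {
    "gs_A": {"sat1": 4.0},
    "sat1": {"gs_A": 4.0, "sat2": 6.0, "sat3": 9.0},
    "sat2": {"sat1": 6.0, "sat3": 2.0, "gs_B": 5.0},
    "sat3": {"sat1": 9.0, "sat2": 2.0, "gs_B": 4.5},
    "gs_B": {"sat2": 5.0, "sat3": 4.5},
}
latency, path = lowest_latency_path(links, "gs_A", "gs_B")
print(latency, path)
```

In this toy graph each groundstation has a fixed uplink; the later discussion of co-routing RF and laser links corresponds to adding every candidate uplink and downlink as an edge.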

Of course, the network is not static; the satellite most directly overhead changes frequently, the laser links between NE- and SE-bound satellites change frequently, and link latencies for links that are up change constantly. We can, however, run Dijkstra on this topology for all traffic sourced by a groundstation to all destinations, and do so every 10 ms with no difficulty, even on laptop-grade CPUs. In addition, all the link changes are completely predictable. If we run Dijkstra every 50 ms, for the network as it will be 200 ms in the future, and cache the results, we can then see whether packets we send will traverse a link that will no longer be there when the packets arrive. In this way, each sending groundstation can source-route traffic that will always find links up by the time the packet arrives at the relevant satellite.
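
The look-ahead check can be sketched as follows, assuming each link's predicted availability is known as an interval and per-hop latencies are given. The `Link` tuple and all schedule values are hypothetical; the text does not specify a data structure.

```python
from typing import NamedTuple

class Link(NamedTuple):
    latency_ms: float   # one-way propagation delay of this hop
    up_from_ms: float   # predicted time the link comes up
    up_until_ms: float  # predicted time the link goes down

def route_stays_up(route, send_time_ms):
    """True if every hop is predicted to be up while the packet is on it."""
    t = send_time_ms
    for link in route:
        # The packet enters the hop at time t and leaves it at t + latency.
        if not (link.up_from_ms <= t and t + link.latency_ms <= link.up_until_ms):
            return False
        t += link.latency_ms
    return True

route = [
    Link(latency_ms=4.0, up_from_ms=0.0, up_until_ms=1000.0),  # RF uplink
    Link(latency_ms=6.0, up_from_ms=0.0, up_until_ms=208.0),   # laser link about to drop
    Link(latency_ms=5.0, up_from_ms=0.0, up_until_ms=1000.0),  # RF downlink
]
print(route_stays_up(route, send_time_ms=190.0))  # packet leaves the laser link at 200 ms
print(route_stays_up(route, send_time_ms=200.0))  # packet would still be on it at 210 ms
```

A sending groundstation could run a check like this against its cached future routes and fall back to the next cached route when the check fails.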

How, then, does the latency change as the network evolves? Figure 7 shows how the RTT from New York to London evolves over three minutes. Discontinuities are due to route changes within the satellite network, or a change of the satellite overhead at the source or destination city. For comparison, the minimum possible RTT via optical fiber that follows a great circle path is 55 ms, while the actual Internet RTT between two well connected sites in these cities is 76 ms. The satellite RTT is, on average, fairly low. It certainly beats the current Internet RTT, and that 55 ms great-circle RTT is not realistic as it is not possible to lay fiber continuously on the most direct path. However, the large delay spike between 70 and 95 seconds is certainly undesirable.
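
The great-circle fiber bound can be reproduced approximately with a few lines of arithmetic. This is a sketch under stated assumptions: the city coordinates, the mean Earth radius, and a fiber refractive index of 1.47 are values chosen here, not taken from the text. With them, the result comes out close to the 55 ms quoted above.

```python
import math

C_VACUUM_KM_S = 299_792.458
FIBER_INDEX = 1.47        # assumed refractive index of glass fiber
EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def fiber_rtt_ms(distance_km):
    """Round-trip time over fiber laid exactly along the given distance."""
    return 2 * distance_km * FIBER_INDEX / C_VACUUM_KM_S * 1000

# Approximate city-centre coordinates (assumed).
nyc_london_km = great_circle_km(40.71, -74.01, 51.51, -0.13)
rtt = fiber_rtt_ms(nyc_london_km)
print(f"{nyc_london_km:.0f} km, {rtt:.1f} ms")
```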

Further analysis shows that these spikes are caused when the satellites directly overhead the two cities are on different parts of the constellation - either one is on a NE-bound satellite and the other on a SE-bound satellite, or vice versa. Although the fifth laser link on each satellite connects the two parts of the constellation, the path is not always very direct, and these links do not stay up for long as the satellites move.

Even if both satellites are on the same part of the constellation, routing vertically upwards to a satellite then horizontally then vertically downwards takes a longer path than necessary. Lower latency can be achieved by using a satellite lower in the sky in the direction of the destination. This is, of course, at the expense of 3dB lower RF signal strength[12], likely resulting in lower achievable bitrate.

Routing Both RF and Lasers. To achieve the lowest delay, we need to include all possible RF up and down links into the network map that we run Dijkstra over. In this way, we always choose the best matched satellite pair for the uplink and downlink, and we use satellites that are in the correct direction. This usually results in using satellites that are fairly close to 40° from the vertical.
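
To see why a satellite well off the vertical helps, the geometry can be sketched as below. The 1,150 km altitude and the mean Earth radius are assumptions made for this example, and the angle is taken to be measured at the groundstation. The sketch compares the length of the RF hop with how far along the ground it carries the traffic.

```python
import math

EARTH_RADIUS_KM = 6371.0
ALTITUDE_KM = 1150.0  # assumed satellite altitude

def slant_range_km(zenith_deg, altitude_km=ALTITUDE_KM):
    """Distance from a groundstation to a satellite seen zenith_deg from the vertical."""
    r, theta = EARTH_RADIUS_KM, math.radians(zenith_deg)
    orbit_r = r + altitude_km
    return math.sqrt(orbit_r ** 2 - (r * math.sin(theta)) ** 2) - r * math.cos(theta)

def ground_progress_km(zenith_deg, altitude_km=ALTITUDE_KM):
    """Ground distance from the groundstation to the point directly below the satellite."""
    d = slant_range_km(zenith_deg, altitude_km)
    sin_central = d * math.sin(math.radians(zenith_deg)) / (EARTH_RADIUS_KM + altitude_km)
    return EARTH_RADIUS_KM * math.asin(sin_central)

for angle in (0, 40):
    print(angle, round(slant_range_km(angle)), round(ground_progress_km(angle)))
```

Under these assumptions, the hop to a satellite 40° from the vertical is a few hundred kilometres longer than the hop straight up, but it already covers several hundred kilometres of ground toward the destination, whereas the overhead hop covers none.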

Figure 8 shows how the latency between New York and London, San Francisco and London, and London and Singapore varies over three minutes when RF and laser links are co-routed in this manner. The y-axis shows the latency via satellite for that city pair, normalized by the latency via optical fiber laid tight along the great circle route. A value of one therefore shows an unattainable lower bound for optical fiber communication for that city pair. In all three cases, the satellite RTT is significantly less than this lower bound. For comparison, latencies of current Internet paths between well-connected sites in these cities are also shown.

We deliberately optimized laser paths for East-West traffic. The latitudes of San Francisco, New York, London, and Singapore are 37.7°N, 40.8°N, 51.5°N and 1.4°N, so although paths between them do not directly travel East-West, there is a large East-West component. What about North-South routes?

The red curve in Figure 9 shows the London-Johannesburg route. The satellite path has almost half the 182 ms latency of the best Internet path via fiber off the west coast of Africa. However, the satellite path is nowhere near optimal, as it has to zig-zag via SW and SE links. Can we do better?

Phase 2 Routing. SpaceX’s proposals for phase 2 include another 1,600 satellites in 53.8° inclination orbits. These orbits closely parallel the 53° orbits of the phase 1 satellites, but they are 40 km lower, so they complete an orbit in 53 seconds less than phase 1 (a complete orbit takes ≈107 minutes). As with the phase 1 satellites, it makes most sense to use the first laser pair to connect along the orbital plane. We now have a choice of how to use the remaining lasers. We experimented with connecting adjacent 53° and 53.8° satellites, but the velocity difference makes this problematic: the direct East-West routing paths slowly become zig-zag before eventually the satellites switch to the next neighbor, and this adversely affects latency. To avoid this drift problem, we conclude that 53.8° satellites should connect to 53.8° satellites in the next orbital plane, even though they are more distant.
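
The period difference can be checked roughly with Kepler's third law for circular orbits. The 1,150 km altitude for the 53° shell and the Earth constants are assumptions made here; only the 40 km altitude difference comes from the text. The result lands within a couple of seconds of the quoted 53 s, with the exact figure depending on the radius and altitudes assumed.

```python
import math

MU_EARTH = 398_600.4418   # km^3/s^2, Earth's standard gravitational parameter
EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def orbital_period_s(altitude_km):
    """Period of a circular orbit at the given altitude (Kepler's third law)."""
    a = EARTH_RADIUS_KM + altitude_km
    return 2 * math.pi * math.sqrt(a ** 3 / MU_EARTH)

phase1_alt = 1150.0             # assumed altitude of the 53° shell
phase2_alt = phase1_alt - 40.0  # "40 km lower", per the text
t1, t2 = orbital_period_s(phase1_alt), orbital_period_s(phase2_alt)
print(f"{t1 / 60:.1f} min, {t2 / 60:.1f} min, difference {t1 - t2:.0f} s")
```

Because the two shells drift relative to each other by this amount on every orbit, a laser link between a 53° satellite and its 53.8° neighbor cannot stay aligned for long.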

As figure 1 shows, the best phase offset between neighboring planes is 17/32. This larger phase offset changes the options for the orientation of paths created by connecting to the neighboring orbital plane. We already have good NW-SE, NE-SW, and East-West connectivity from phase 1, and routing along the phase 2 orbital planes will increase the NW-SE and NE-SW capacity. Using the remaining lasers to improve the North-South direction is an attractive option. To do this, we offset the lasers by 2, connecting satellite n in plane p to satellite n−2 in plane p−1 and n+2 in plane p+1. Figure 10 shows just the side laser links of 53.8° satellites using this offset. We cannot achieve perfect N-S orientation, but the paths are very good at higher latitudes.
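
The offset-by-2 rule can be written down as simple index arithmetic. In this sketch the shell is assumed to be 32 planes of 50 satellites (one way to arrange 1,600 satellites), and the wraparound between the last and first plane is simplified: a real constellation would also need to account for the accumulated phase offset at that seam.

```python
def side_link_peers(plane, sat, num_planes=32, sats_per_plane=50, offset=2):
    """Return the (plane, satellite) peers for the two side lasers of one satellite."""
    prev_peer = ((plane - 1) % num_planes, (sat - offset) % sats_per_plane)
    next_peer = ((plane + 1) % num_planes, (sat + offset) % sats_per_plane)
    return prev_peer, next_peer

print(side_link_peers(5, 10))  # satellite 10 in plane 5
print(side_link_peers(0, 0))   # indices wrap around at the edges
```

The rule is symmetric: if satellite n in plane p points at satellite n+2 in plane p+1, that satellite's own rule points back at n in plane p, so each side laser has a matching partner.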

These N-S paths are complemented by the satellites in higher inclination orbits. For these there are only a few orbital planes, too far apart to allow connections between neighboring planes except near the poles. We use their remaining three lasers less methodically, allowing them to opportunistically connect to each other or to 53° and 53.8° orbits as they come close. This provides reasonable polar coverage while allowing them to be used for N↔S traffic at lower latitudes.

The blue curve in Figure 9 shows that adding the phase 2 satellites has improved the London-Johannesburg latency by about 20% due to the more direct routing. The purple curve shows the second best path, calculated by removing all links used by the best path, and re-running Dijkstra on the remaining graph. This indicates that latency in such a network is not critically dependent on any one satellite or link.

Multipath. While the biggest advantage to a dense LEO constellation is likely to be very low latency, the bandwidth of a single satellite path is likely to be insufficient to impact the business of long distance fiber networks. However, a LEO constellation can provide many paths between the same city pair; with Starlink there may be 60 satellites within coverage range for latitudes close to 50°N. How does the latency of these additional paths compare with the best path?

Generally, the longer the distance, the more paths will be available that have lower latency than the best theoretical fiber path. New York and London are relatively close, so the potential satellite gain is lower. Both are major financial centers, so there is a great deal of latency-sensitive traffic. In Figure 11 we show the RTT over three minutes for the best 20 disjoint paths between them. This is calculated iteratively; first we run Dijkstra to calculate the best path, then we remove all the RF uplinks and laser links used by that path from the network graph. We then re-run Dijkstra to find the next best path, eliminate those links, and iterate. With this formulation, no satellite overhead either city can provide more than one up or downlink, and no intermediate satellite can be used by more than two paths. This implicitly assumes that laser links and RF links have the same capacity - this is unlikely in reality; whichever turns out to be the bottleneck, a real network will allow more paths than this, so the figure effectively shows an upper bound on path latency.
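
The iterative procedure described above can be sketched as follows. The graph, node names and latencies are invented for the example. The approach is greedy (best path first, then best of what remains), so it is not guaranteed to find the set of disjoint paths with the lowest total latency.

```python
import heapq

def dijkstra(graph, src, dst):
    """Return (latency, path) for the lowest-latency path, or (inf, []) if none exists."""
    best, prev, queue = {src: 0.0}, {}, [(0.0, src)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == dst:
            path = [dst]
            while path[-1] != src:
                path.append(prev[path[-1]])
            return dist, path[::-1]
        if dist > best[node]:
            continue  # stale queue entry
        for nbr, w in graph[node].items():
            if dist + w < best.get(nbr, float("inf")):
                best[nbr], prev[nbr] = dist + w, node
                heapq.heappush(queue, (dist + w, nbr))
    return float("inf"), []

def disjoint_paths(graph, src, dst, k):
    """Find the best path, delete its links, and repeat up to k times."""
    g = {u: dict(nbrs) for u, nbrs in graph.items()}  # work on a copy
    found = []
    for _ in range(k):
        latency, path = dijkstra(g, src, dst)
        if not path:
            break
        found.append((latency, path))
        for u, v in zip(path, path[1:]):
            g[u].pop(v, None)  # remove the link in both directions
            g[v].pop(u, None)
    return found

# Hypothetical topology: each groundstation can see two satellites.
graph = {
    "gs_A": {"sat1": 4.0, "sat2": 5.0},
    "sat1": {"gs_A": 4.0, "sat3": 6.0},
    "sat2": {"gs_A": 5.0, "sat4": 7.0},
    "sat3": {"sat1": 6.0, "sat4": 1.0, "gs_B": 4.0},
    "sat4": {"sat2": 7.0, "sat3": 1.0, "gs_B": 5.0},
    "gs_B": {"sat3": 4.0, "sat4": 5.0},
}
paths = disjoint_paths(graph, "gs_A", "gs_B", k=3)
for latency, path in paths:
    print(latency, path)
```

Only two paths come back even though three were requested: once both uplinks from `gs_A` are consumed, no further link-disjoint path exists.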

single path data

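This figure shows the per-path RTT for each of the 20 paths; all of them are shorter than the RTT of the conventional Internet path.

In practice the advantage would be even greater, because the paths are used together: traffic is carried over multiple paths at once.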

There are five paths that have lower latency than the great-circle fiber path, and all 20 paths have lower latency than the current Internet path. However, latency variability increases as the path gets worse: path 20 has much more variable latency than path 1, as it has fewer options available. In figure 12 we see the one-way latency of path 20 in more detail. 10% variability is likely insufficient to trigger spurious TCP timeouts, and increases in RTT are also unlikely to impact TCP. However, when latency decreases rapidly, reordering will occur, causing TCP to incorrectly assume a loss has occurred and triggering a fast retransmit.
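
The asymmetry between latency increases and decreases can be seen with a few made-up numbers: packets are sent at a fixed interval, and the one-way latency changes partway through the burst. The values below are illustrative only.

```python
def arrival_order(send_times_ms, latencies_ms):
    """Return packet indices sorted by arrival time."""
    arrivals = [s + l for s, l in zip(send_times_ms, latencies_ms)]
    return sorted(range(len(arrivals)), key=lambda i: arrivals[i])

send = [0.0, 1.0, 2.0, 3.0]  # four packets sent 1 ms apart

# Latency rises by 3.5 ms after packet 1: packets spread out but stay in order.
rising = arrival_order(send, [36.5, 36.5, 40.0, 40.0])

# Latency falls by 3.5 ms after packet 1: later packets overtake earlier ones.
falling = arrival_order(send, [40.0, 40.0, 36.5, 36.5])

print(rising, falling)
```

When the drop in latency exceeds the gap between consecutive packets, the receiver sees packets out of order and sends duplicate ACKs, which is what the sender can mistake for loss.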

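Latency variability

Latency variability is the degree to which latency (the time a packet takes to travel from source to destination) fluctuates from one moment to the next. It reflects how stable and consistent the latency is.

  1. Fluctuation: if the latency of a path varies widely from one packet to the next, the path has high latency variability; if the latency stays relatively steady, variability is low.
  2. Contributing factors: latency variability can be affected by network congestion, changes in link quality, route changes, and packet loss and retransmission. Under poor network conditions, some paths may see large latency swings.
  3. Effect on applications: in real-time applications (video calls, online games and so on), high latency variability can degrade the user experience, because packets arriving at uneven intervals leave audio or video out of sync. In TCP, high latency variability can trigger spurious timeouts and hence unnecessary retransmissions.

Example

Path 20 in the text has much more variable latency than path 1. Packets sent on path 20 at different times therefore see more widely differing delays, with some taking noticeably longer than others. In particular, when the latency drops rapidly, later packets can overtake earlier ones; TCP then misreads the reordering as loss and triggers a fast retransmit, which hurts overall performance.

10% variability is likely insufficient to trigger spurious TCP timeouts, and increases in RTT are also unlikely to impact TCP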

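Background (RTT measurement)

RTT measurement: TCP continuously measures the RTT, maintains a weighted average of the samples (usually called the smoothed RTT), and sets its retransmission timeout from that value. This adaptive mechanism lets TCP track changes in network delay.

Why 10% variability is probably not enough to trigger spurious TCP timeouts

  • Relative stability: an RTT that fluctuates by 10% from one transmission to the next is generally considered fairly stable. In most cases such small swings do not affect TCP's timeout decision, because the timeout includes a margin derived from the measured RTT variation.
  • Tolerance: TCP was designed with network delay and its variability in mind, so it tolerates a certain amount of delay fluctuation without immediately retransmitting. A 10% variation is usually within that range and does not cause a misjudgment.

Why an increase in RTT has little effect on TCP

  • Smoothed RTT calculation: when the RTT increases, TCP recomputes the smoothed RTT from the new samples and adjusts the timeout accordingly. As long as the increase is moderate, TCP adapts to it.
  • Congestion control: TCP also has congestion control, which lowers the sending rate when it detects that network conditions have deteriorated. This further improves TCP's robustness to delay changes.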
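
The timeout behaviour described in these notes can be sketched with an estimator in the style of RFC 6298 (smoothing gains 1/8 and 1/4, timeout = smoothed RTT + 4 × RTT variation). This is a simplified illustration: the clock-granularity term and the minimum timeout that the RFC also specifies are left out, and the 70 ms base RTT with ±10% swings is a made-up input.

```python
class RtoEstimator:
    """Smoothed RTT and retransmission timeout in the style of RFC 6298."""
    ALPHA, BETA, K = 1 / 8, 1 / 4, 4

    def __init__(self):
        self.srtt = None
        self.rttvar = None

    def update(self, sample_ms):
        """Feed one RTT sample and return the new timeout."""
        if self.srtt is None:  # first measurement
            self.srtt = sample_ms
            self.rttvar = sample_ms / 2
        else:
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - sample_ms)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * sample_ms
        return self.rto_ms()

    def rto_ms(self):
        # Granularity term and minimum RTO omitted for clarity.
        return self.srtt + self.K * self.rttvar

est = RtoEstimator()
samples = [70.0 * (1.1 if i % 2 else 0.9) for i in range(50)]  # RTT alternating -10% / +10%
rto = est.update(samples[0])
spurious = 0
for s in samples[1:]:
    if s > rto:  # the ACK would have arrived after the timer fired
        spurious += 1
    rto = est.update(s)
print(spurious, round(est.srtt, 1), round(rto, 1))
```

With these inputs the timeout always stays above the next RTT sample, so no spurious timeout fires. A real stack that enforces a minimum timeout would leave an even larger margin.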