
xeoverse: Real-Time Simulation Framework for LEO Satellite Networks

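  • Background:
    • Low Earth Orbit (LEO) satellite networks such as Starlink are both very large and highly dynamic (satellites move fast and links switch frequently)
    • Existing Internet protocols were designed for static terrestrial networks and do not fully carry over, and testing on a real constellation is expensive and difficult
  • Open problem:
    • Existing simulators (e.g., Hypatia, StarryNet) trade off scalability, responsiveness, and fidelity when simulating large constellations, which makes large-scale real-time simulation on a single machine hard to achieve

Xeoverse is a scalable, high-fidelity, real-time network simulator intended to support research on large-scale LEO satellite networks. It is built on Mininet and simulates the complete Starlink Shell 1 (1584 satellites) in real time on a single machine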

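(1) Core design ideas

  1. Pre-computing the topology:
    • Exploits the predictability of satellite orbital motion (Observation I) to pre-compute the network topology, link characteristics, and routing tables for every time point, reducing the computation needed while the simulation runs
  2. Updating only changes:
    • Although satellites move fast, only a few inter-satellite links (ISLs) change over a short time period (Observation II)
    • Xeoverse updates only the links that changed
  3. Focusing on relevance:
    • In a given simulation scenario, not every satellite takes part in data transfer
    • Xeoverse updates only the nodes and links relevant to the simulated traffic (Observation III), greatly reducing computational overhead

(2) Architecture and workflow

  • Back Stage: processes satellite orbit data (TLE) and generates the constellation geometry, topology connectivity matrix, link characteristics (delay, capacity, SNR, weather impact), and routing tables
  • Main Stage: runs on Mininet; a background process updates the network topology and routes in real time from the data pre-computed by the Back Stage, and standard Linux networking tools (e.g., iperf, ping) can be run on the nodes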
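
To make the "delay" link characteristic concrete, here is a minimal sketch of a one-way propagation delay computation for a free-space link. It is not taken from xeoverse: the function name is hypothetical, and the 550 km altitude is an assumed value for Starlink Shell 1 (the notes only give the 500 to 2,000 km LEO range and the 35,000 km GEO figure).

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in vacuum


def propagation_delay_ms(distance_km: float) -> float:
    """One-way propagation delay in milliseconds over a free-space link."""
    if distance_km < 0:
        raise ValueError("distance must be non-negative")
    return distance_km / SPEED_OF_LIGHT_KM_S * 1000.0


# Satellite directly overhead: assumed 550 km LEO altitude vs. the 35,000 km GEO figure.
leo_delay_ms = propagation_delay_ms(550.0)
geo_delay_ms = propagation_delay_ms(35_000.0)
print(f"LEO: {leo_delay_ms:.2f} ms, GEO: {geo_delay_ms:.2f} ms")
```

The two orders of magnitude between the results are the latency gap that motivates LEO constellations in the first place.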

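(3) Evaluation results

  • Speed: in the simulation update phase, Xeoverse is 2.3x faster than Hypatia and 40x faster than StarryNet. It achieves real-time simulation (1 second of simulation time = 1 second of wall-clock time)
  • Scalability: a single machine can simulate thousands of satellites (e.g., Starlink's 5,442 satellites), whereas StarryNet on a single machine is capped by the Docker bridge limit (about 1023 nodes)
  • Fidelity: adds a model of how weather conditions (rain/snow) affect links. In the experiments, Xeoverse's predicted throughput is within 2% of the real Starlink network in clear weather and within 16% in rain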
Danger

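This paper was never published at a proper venue, and its code repository is not open source (for a simulator paper in particular, not releasing the code strongly suggests the authors lack confidence in it)

Absurdly bogus, a garbage paper; I suggest skipping it...

It is a bit ridiculous to think that people working on LEO emulation will still have to cite it...

Here is a quote from LEOCraft (ATC'25) that takes a mild jab at it: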

"Another LEO network emulation platform xeoverse [63] is built upon a lightweight process-based network emulator platform Mininet [20] and claims to outperform Hypatia and StarryNet by large margins; however, their source code is not public for community use."

Introduction

The rapid advancement in satellite and space technology, especially with the development of reusable rockets and the extensive deployment of Low-Earth Orbit (LEO) satellites, is transforming the way we access the Internet from space. Unlike the traditional geostationary (GEO) satellites stationed 35,000 km above Earth, LEO satellites orbit at much lower altitudes, between 500 and 2,000 km [1], [2], reducing end-to-end latency significantly. This new space race, led by Starlink, OneWeb, and Kuiper, aims to provide low-latency, high-bandwidth Internet coverage across the globe, including the 85% of Earth’s surface that currently lacks reliable connectivity (a drawback which has impeded adoption of traditional broadband applications even in developed countries [3]).

However, the dynamic nature of LEO satellite networks, with their high-speed movement and frequent changes in Inter-Satellite Links (ISLs), poses a challenge to the existing terrestrial Internet protocols and algorithms, which were designed for a static infrastructure. For instance, measurements have shown the traditional congestion control protocols do not perform optimally on Starlink [4].

This creates a dire need for simulation platforms that allow end-to-end control over all aspects of LEO mega-constellations-based architectures. It is challenging to create a high-fidelity platform that (i) faithfully replicates details such as satellite movements at 27,000 KM/hr, the associated frequent link changes, the detailed link representation in terms of RF parameters and signal-to-noise ratio (SNR) and the effect of local weather conditions on links across the globe, (ii) can do this whilst scaling to the thousands of satellites which are already deployed on constellations such as Starlink and the tens of thousands of possible links, while at the same time being (iii) responsive – enabling fast simulations that do not slow down with the growing scale of the constellations and the fidelity of the simulation. Indeed, we ideally want a simulator that operates in real-time, i.e., one second of simulation time is executed in one second of wall-clock time, without compromising on the fidelity or the scalability. To be accessible, we also want (iv) a low footprint simulator, that does not require too many machines and high compute power.

Our main contribution in this paper is the design and implementation of xeoverse, a high-fidelity real-time simulator that can simulate the entire current Starlink constellation (5,442 satellites) on a single 26-core, 64 GB machine. Although we focus on Starlink because it is currently the largest constellation available, our simulator can be adapted to any other constellation where the positions of the satellites can be described using the standard Two-Line Element (TLE) Set [5].

xeoverse relies on three key observations to create a scalable yet responsive architecture without compromising on simulation fidelity: The first observation is that while the movement of satellites causes significant changes in the global topology of LEO constellations across different time points, these movements are predictable since the satellites are in fixed orbits. Thus, we can pre-compute the topology for each time point of interest in the simulation, ahead of running the actual simulation itself. Then, as long as the topology for a given time point can be instantiated in real-time, we can ensure the whole simulation proceeds in real-time.

Our second and third observations help to efficiently manage the topology changes from one time point to another: The second observation is that although there is constant link churn due to satellite movement, most links stay the same over small time periods. Thus link changes can be manageably small in number from one time point to the very next time point in high fidelity simulations that require high time resolution (we support time changes down to 1ms, although FCC filings suggest that Starlink satellite handovers happen on the order of 15 seconds or more [6]). xeoverse scales by computing the subset of links which change from one time point to the next. The third observation goes further by noticing that in many simulation use cases, the users are only interested in a subset of the entire constellation (e.g., a train operator using the simulator to understand whether Starlink or OneWeb offers better coverage over a train route in France may not need all details of satellites over North America to be updated). Given a traffic matrix of flows of interest, xeoverse computes which elements (user terminals, satellites, and gateways) of the entire constellation carry this traffic of interest, and applies dynamic topology updates only for those elements, greatly enhancing scalability and decreasing compute footprint.

We compare xeoverse with two state-of-the-art alternatives: Hypatia [7] and StarryNet [8]. We focus on a low footprint setting, using only one machine. Our evaluations show that when simulating one flow over the entirety of Starlink Shell 1, xeoverse is more responsive than both: it is 40x faster than StarryNet, and 2.3x faster than Hypatia. xeoverse's scalability advantages become more apparent as the number of flows is increased: in our single machine setting, StarryNet fails to scale beyond around 1023 satellites, thus failing to simulate even Shell 1 of Starlink. Hypatia slows down, taking 5x more time than xeoverse in the simulation updates phase. Furthermore, xeoverse adds more link-level details than both Hypatia and StarryNet, allowing us to show differences between different weather conditions, similar to that observed on a real node. xeoverse's predicted throughput is within 16% (2%) of observed throughput during rainy (clear weather) conditions. Over a long distance link (London–San Francisco), xeoverse is able to better reflect the changes in throughput due to temporary disruptions of Inter-Satellite Links as compared to Hypatia (StarryNet is unable to scale to this scenario due to the number of satellites involved).

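1. Introduction

Rapid advances in satellite and space technology, in particular reusable rockets and the extensive deployment of Low-Earth Orbit (LEO) satellites, are transforming how we access the Internet from space. Unlike traditional geostationary (GEO) satellites stationed 35,000 km above Earth, LEO satellites orbit much lower, typically between 500 and 2,000 km [1], [2], which significantly reduces end-to-end latency. This new space race, led by Starlink, OneWeb, and Kuiper, aims to provide low-latency, high-bandwidth Internet coverage worldwide, including the 85% of Earth's surface that currently lacks reliable connectivity (a shortcoming that has held back adoption of traditional broadband applications even in developed countries [3]).

However, the highly dynamic nature of LEO satellite networks, with high-speed movement and frequent changes in Inter-Satellite Links (ISLs), challenges existing terrestrial Internet protocols and algorithms, which were designed for static infrastructure. For example, measurements show that traditional congestion control protocols do not perform optimally on Starlink [4].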

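This creates a pressing need for a simulation platform that gives end-to-end control over every aspect of architectures based on LEO mega-constellations. Building a high-fidelity platform is challenging because it has to:

(i) faithfully reproduce details such as satellites moving at 27,000 km/h, the resulting frequent link changes, a detailed link representation in terms of RF parameters and signal-to-noise ratio (SNR), and the effect of local weather conditions on links across the globe

(ii) scale to the thousands of satellites already deployed in constellations such as Starlink and the tens of thousands of possible links

(iii) be responsive: simulations should be fast and should not slow down as the constellation grows and simulation fidelity increases

Ideally, the simulator runs in real time, i.e., 1 second of simulation time takes 1 second of wall-clock time, without sacrificing fidelity or scalability.

To be easy to deploy, we also want (iv) a low-footprint simulator that does not need many machines or a lot of compute power.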

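The main contribution of this paper is the design and implementation of xeoverse, a high-fidelity real-time simulator that can simulate the entire current Starlink constellation (5,442 satellites) on a single 26-core, 64 GB machine. Although we focus on Starlink because it is currently the largest available constellation, the simulator can be adapted to any other constellation whose satellite positions can be described with the standard Two-Line Element (TLE) set [5].

Xeoverse relies on three key observations to build an architecture that is both scalable and responsive without sacrificing simulation fidelity:

The first observation is that, although satellite movement significantly changes the global topology of a LEO constellation from one time point to another, the movement is predictable because the satellites are in fixed orbits. We can therefore pre-compute the topology for every time point of interest before the actual simulation runs. As long as the topology for a given time point can be instantiated in real time, the whole simulation proceeds in real time.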

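The second and third observations help manage topology changes from one time point to the next efficiently:

The second observation is that, despite constant link churn caused by satellite movement, most links stay the same over short time periods. So in high-fidelity simulations that need high time resolution (we support time steps down to 1 ms, although FCC filings suggest Starlink satellite handovers happen on the order of 15 seconds or more [6]), the number of link changes from one time point to the very next is manageably small. Xeoverse scales by computing the subset of links that change from one time point to the next.

The third observation goes further: in many simulation use cases, users only care about a subset of the whole constellation (e.g., a train operator using the simulator to assess whether Starlink or OneWeb offers better coverage over a train route in France may not need every detail of the satellites over North America to be updated). Given a traffic matrix of flows of interest, Xeoverse computes which elements of the constellation (user terminals, satellites, and gateways) carry that traffic and applies dynamic topology updates only to those elements, which greatly improves scalability and lowers the compute footprint.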

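We compare xeoverse with two state-of-the-art alternatives: Hypatia [7] and StarryNet [8]. We focus on a low-footprint setting with a single machine. The evaluation shows that when simulating one flow over the whole of Starlink Shell 1, xeoverse is more responsive than both: 40x faster than StarryNet and 2.3x faster than Hypatia. Its scalability advantage becomes clearer as the number of flows grows: in our single-machine setting, StarryNet cannot scale beyond about 1023 satellites and so cannot even simulate Shell 1 of Starlink, while Hypatia slows down and takes 5x as long as xeoverse in the simulation update phase. In addition, xeoverse adds more link-level detail than Hypatia and StarryNet, which lets us show differences between weather conditions similar to those observed on a real node. Xeoverse's predicted throughput is within 16% (2%) of the observed throughput in rainy (clear) conditions. Over a long-distance link (London-San Francisco), xeoverse reflects throughput changes caused by temporary ISL disruptions better than Hypatia (StarryNet cannot scale to this scenario because of the number of satellites involved).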

Network simulators such as xeoverse serve as essential tools for examining network performance, testing new protocols, and applications on a large scale within a controlled, realistic environment before real-world deployment. NS-3, and its predecessor NS-2, are examples of such tools, which have enabled a large amount of networking research. NS-3 [9], a discrete-event network simulator has numerous modules (e.g., WiFi, 4G, 5G) developed atop it to expand its utility. Mininet [10], designed with a focus on Software-Defined Networking (SDN), represents another simulation platform. However, both NS-3 and Mininet are general-purpose network simulators and are not specifically tailored for simulating LEO mega-constellations of satellites.

Over recent years, the interest in LEO satellite networks has surged, with efforts broadly categorized into three areas based on realism: real-world constellation deployment, measurement campaigns, and simulation platforms.

Real-world constellation development. The Tiansuan constellation [11] adopts the stance that researchers need their own real-world constellation deployment for experimentation and full control over all aspects of the network. Despite the high-fidelity of this approach, it also comes with significantly high cost. To the best of our knowledge, only two Tiansuan satellites have been launched so far (http://www.tiansuan.org.cn/morenews.html, accessed 10 May 2024), and the plan is to launch up to 6 satellites in the first phase. It remains hard to match this to the scale of commercial closed mega-constellations e.g., Starlink, which has thousands of satellites. Thus, various important questions such as ISLs are difficult to study in this system.

Real-world measurement campaigns. Unfortunately, researchers do not have direct access to the satellites of commercially deployed mega constellations such as Starlink. Therefore, given the barriers to setting up a parallel real-world research constellation, the research community has made notable strides in real-world measurement studies on live and operational commercial LEO satellite networks [4], [12]–[16]. Such studies focus mainly on Starlink, with other constellations (e.g., OneWeb, Telesat) being less explored. These efforts, in addition, are limited to data from end devices (i.e., user terminals), and lack direct insights into the network’s inner workings, such as gateways or satellites. This limitation highlights the persistent need for simulation software. While these measurement efforts treat the constellation as a black box, simulations allow in-depth exploration of design decisions within the constellation itself, such as satellite interconnectivity and queueing disciplines, etc. These simulations are crucial for deeply understanding the complexities of LEO satellite mega-constellations.

LEO Simulation Tools. Most of the early work on LEO networks such as [17], [18] developed custom-built simulators to address specific research questions. More recently, general-purpose LEO simulators such as Hypatia [7], Celestial [19], and StarryNet [8] have been developed. Hypatia, built on top of NS-3, provides packet-level analysis but faces scalability challenges due to computational overhead. Celestial, developed using SILLEO-SCNS, is exclusively focused on edge computing within LEO networks. StarryNet utilizes Docker virtualization to simulate satellite and ground networks; however, it encounters scalability limits due to Docker’s constraints. StarryNet and Hypatia align most closely with our approach, aiming to simulate entire mega-constellations comprehensively. We discuss these approaches in detail in §III and evaluate our simulator against Hypatia and StarryNet in §V.

Other Simulation Tools. High-fidelity link simulators, such as MATLAB SIMULINK [20] and Ansys STK [21], provide detailed simulations of ground-to-satellite communication links. These tools consider various factors affecting communication quality, including signal loss, the Doppler effect, modulation techniques, antenna types, beams and radio frequency (RF) parameters. While effective for modeling individual links, these simulators are not designed to manage the complexity of entire mega-constellations. They focus on the simulation of communication links rather than the overall network performance or the dynamic interactions within large LEO satellite networks. Consequently, while useful for detailed link analysis, these tools do not offer insights into the broader network characteristics and performance of satellite constellations.

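The paper reviews existing work along five dimensions and points out the limitations of each, which motivates the development of Xeoverse.

  1. General Simulators

    • Status: tools such as NS-3 and Mininet are widely used in general networking research
    • Limitation: they are not designed for LEO mega-constellations and lack optimizations for satellite network characteristics
  2. Real-world Constellation Development

    • Status: efforts such as the Tiansuan constellation launch research satellites to obtain a high-fidelity experimental environment
    • Limitation: very costly and hard to scale (only a few satellites launched so far); it cannot match commercial mega-constellations such as Starlink with thousands of satellites, so complex issues such as inter-satellite links (ISLs) are hard to study
  3. Real-world Measurement Campaigns

    • Status: researchers run measurement studies on live networks such as Starlink
    • Limitation: a commercial network is a black box; researchers only get data from user terminals and cannot see the network's inner workings (e.g., gateways, satellite queueing disciplines), so simulation software is still needed to study the internals
  4. Existing LEO Simulation Tools

    • Hypatia: built on NS-3, provides packet-level analysis, but has high computational overhead and limited scalability
    • StarryNet: built on Docker; Docker's constraints also limit its scalability
    • Celestial: focuses exclusively on LEO edge computing, so its scope is narrow
  5. Other Simulation Tools / Link Simulators

    • Status: e.g., MATLAB SIMULINK and Ansys STK
    • Limitation: these tools are good at simulating the physical details of a single link (e.g., RF parameters, Doppler effect, signal loss) but cannot handle the complex network interactions and overall performance of an entire mega-constellation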

xeoverse's Design Approach

A. Desiderata

In developing a simulator for LEO mega-constellation networks, it is imperative to outline key design requirements that ensure its effectiveness. These are foundational to replicating the complexity and dynamics of satellite networks. We start first by defining these desiderata, and then discuss how we achieved these characteristics in xeoverse :

D1. Scalability: The need for extreme scalability distinguishes LEO networks from Geostationary satellite networks, as LEO requires thousands of satellites for global coverage compared to GEO – Starlink Shell 1 has 1584 satellites, and there are 5400+ satellites overall. This mega constellation size presents a significant computational challenge for simulations, as accurately representing the complex dynamics of LEO networks requires modeling of numerous satellites and their interactions.

D2. Responsiveness: As the network grows and the simulation becomes more computationally demanding, there is a risk that the simulator could slow down, affecting its ability to run in real-time or near-real-time. This is particularly concerning for testing production-grade applications, where any discrepancy in timing (clocking) between the simulation and real-world operation can lead to inaccurate results. Slow simulations also make non-urgent simulations less agile, since it is common to rerun simulations multiple times for statistical validity, or with slightly changed assumptions to test different “what if” scenarios. We desire a simulator which runs in real time, i.e., 1 second of simulation requires 1 second of wall-clock time.

D3. Fidelity: The cornerstone of any simulator is its ability to realistically mimic the real-world environment of the mega-constellation satellite networks. This encompasses accurately simulating link characteristics, antenna and RF parameters, and the impact of weather conditions. Furthermore, having the interfaces to allow incorporating production-grade transport and application protocols is essential for high-fidelity simulations.

D4. Low Footprint: Finally, we require that the simulator can achieve its functionality with low computational footprint (in terms of CPU and memory usage, as well as in terms of the number of machines or virtual machines required).

B. Previous approaches

There are different design approaches to build network simulators for LEO satellite mega-constellations. We review two prominent approaches, and compare them with our desiderata.

Hypatia is a network simulator designed for LEO networks, as an add-on module on top of the well-known NS-3 simulator. The choice to build upon NS-3 offers significant advantages, particularly in the ease of developing and testing new transport and application layer protocols for LEO networks, given the extensive tooling offered by NS-3 as well as implementations of several major protocols, including experimental proposals from various research papers. Furthermore, NS-3 protocols are implemented in user-space, which is significantly simpler than in-kernel implementations.

However, this approach is not without its limitations. The reliance on NS-3, which is a discrete event network simulator, introduces challenges in meeting the above design requirements of scalability and responsiveness. The primary issue stems from the large volume of events Hypatia must process in a short time to accurately simulate the interactions and behaviors of dynamic LEO networks. This event-driven simulation model, while detailed, poses scalability (D1) and responsiveness (D2) challenges due to the computational demands of processing a large number of events, especially in scenarios featuring high traffic volumes and large number of nodes as highlighted in §V.

StarryNet adopted a different approach to simulate LEO mega-constellation networks, utilizing Docker container technology to create this simulator. Each network element, including user terminals, ground stations and satellites, is represented by a Docker container, with virtual network interfaces established between them to simulate ISLs and GSLs.

This overcomes some of the scalability problems of the discrete event simulation approach of Hypatia, but there are still certain challenges, especially regarding scalability, responsiveness and footprint: Docker’s bridge interface, which is essential for StarryNet’s simulation to emulate the ISLs and GSLs, imposes a limit on the number of containers that can be attached to a single virtual bridge interface. Specifically, the maximum is capped at 1023 containers, effectively restricting the simulator to representing no more than 1023 ground segments plus satellites on a single machine (impacting D1). To address larger constellations, StarryNet distributes the simulation across multiple machines, increasing setup complexity due to the intricate configuration of inter-container links, and affecting D4, the footprint required for the simulation. Furthermore, StarryNet’s choice of implementation introduces performance bottlenecks. In particular, the way in which StarryNet employs threading encounters issues with the Global Interpreter Lock (GIL) in Python (detailed in §V). StarryNet spawns one thread per link to update these link characteristics, resulting in a total number of threads equal to the number of links in the mega-constellation. In §V we show how the GIL, which prevents multiple native threads from executing at the same time, significantly slows down the StarryNet simulation as the size of the constellation increases.

C. xeoverse Overview

Fig.1 presents a high-level schematic of xeoverse , developed to satisfy the above desiderata. With scalability (D1) in mind, we based xeoverse on Mininet, a network emulation platform that relies on lightweight process-level virtualization technology to emulate thousands (up to 4096) of network nodes on commodity machines, allowing a small footprint for the entire constellation (D4).

Mininet represents each network node (or host) as a lightweight virtual process (D4) that is connected to one or many virtual interfaces using a Linux Kernel feature called network namespace. This feature allows each network node to have isolated network environments and routing tables.

The fact that Mininet is developed over the standard Linux Kernel and each network node is a virtualized process allows many Linux-based network testing applications, e.g., iperf, ping, and traceroute, to be easily deployed on these nodes, helping fidelity (D3). This also facilitates testing the performance of the state-of-the-art network/application/transport-layer protocols (e.g., BBR, CUBIC, HTTP/3) that have been already developed or have reference implementations in standard Linux (D3). In addition, xeoverse achieves better fidelity (D3) by modeling link characteristics to include both ground segment and satellite RF parameters and other influencing factors, such as weather conditions, which are not addressed by Hypatia and StarryNet.
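
As an illustration of how SNR and weather could feed into a link's capacity, here is a minimal sketch based on the Shannon bound $C = B \log_2(1 + \mathrm{SNR})$. The excerpt does not state which capacity or attenuation model xeoverse uses, so this is an assumption for illustration only; the function names, the 250 MHz bandwidth, the 10 dB clear-sky SNR, and the 3 dB rain loss are all hypothetical values.

```python
import math


def shannon_capacity_mbps(bandwidth_mhz: float, snr_db: float) -> float:
    """Shannon upper bound C = B * log2(1 + SNR); B in MHz gives Mbit/s."""
    snr_linear = 10.0 ** (snr_db / 10.0)
    return bandwidth_mhz * math.log2(1.0 + snr_linear)


def link_capacity_mbps(bandwidth_mhz: float, clear_sky_snr_db: float,
                       weather_loss_db: float = 0.0) -> float:
    """Capacity after subtracting an assumed weather attenuation (in dB) from the SNR."""
    return shannon_capacity_mbps(bandwidth_mhz, clear_sky_snr_db - weather_loss_db)


# Hypothetical link: 250 MHz channel, 10 dB clear-sky SNR, 3 dB extra loss in rain.
clear_mbps = link_capacity_mbps(250.0, 10.0)
rainy_mbps = link_capacity_mbps(250.0, 10.0, weather_loss_db=3.0)
print(f"clear: {clear_mbps:.0f} Mbit/s, rain: {rainy_mbps:.0f} Mbit/s")
```

Because attenuation enters in dB while capacity depends on the linear SNR, a fixed rain loss costs proportionally more capacity on a link that already has a low SNR.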

Each satellite, user terminal and gateway is represented as a separate Mininet node. These are connected together to form the global satellite constellation for the simulation scenario under consideration. The challenge lies in the fact that there are thousands of nodes in LEO mega constellations, and connections are made and broken as the satellites move over the user terminals and gateways on earth, creating constant topology changes. Our approach to responsively (D2) managing this link churn inherent to LEO mega constellations is informed by three key observations:

I. Satellite movements are predictable. Satellites orbit the Earth in predefined paths (orbits) at constant speeds. This predictability allows us to pre-compute, for any given time point, the position of all satellites in a constellation, their link characteristics, as well as the entire network topology inter-connectivity among satellites. This significantly reduces the need for real-time calculations during the simulation.
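
The predictability argument can be checked with a little two-body arithmetic. The sketch below assumes an idealized circular orbit at 550 km (an assumed Shell 1 altitude, not given in the excerpt) and uses hypothetical function names; the paper works from TLE data, for which a real propagator such as SGP4 would normally be used instead of this simplification.

```python
import math

MU_EARTH_KM3_S2 = 398_600.4418  # Earth's standard gravitational parameter
EARTH_RADIUS_KM = 6_371.0       # mean Earth radius


def circular_orbit(altitude_km: float):
    """Return (period_s, speed_km_h) for a circular orbit at the given altitude."""
    a = EARTH_RADIUS_KM + altitude_km  # orbit radius
    period_s = 2.0 * math.pi * math.sqrt(a ** 3 / MU_EARTH_KM3_S2)
    speed_km_h = math.sqrt(MU_EARTH_KM3_S2 / a) * 3600.0
    return period_s, speed_km_h


def precompute_angles(altitude_km: float, phase0_rad: float,
                      duration_s: int, step_s: int = 1):
    """In-plane angular position (radians) at every time step, computed ahead of time."""
    period_s, _ = circular_orbit(altitude_km)
    rate = 2.0 * math.pi / period_s  # constant angular rate
    return [(phase0_rad + rate * t) % (2.0 * math.pi)
            for t in range(0, duration_s, step_s)]


period_s, speed_km_h = circular_orbit(550.0)
angles = precompute_angles(550.0, 0.0, duration_s=60)
print(f"period: {period_s / 60:.1f} min, speed: {speed_km_h:.0f} km/h")
```

Under these assumptions the speed comes out near the 27,000 km/h quoted in the text, and every future position follows from the start phase alone, which is what makes pre-computation possible.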

Furthermore, we do this pre-computation phase only once for a given scenario, and then the main simulation uses this pre-computed data to rerun multiple times. This approach is particularly beneficial for various purposes e.g., ensuring statistical validity, testing different variants of the same protocol, or conducting comparisons of a new proposal against various baseline implementations. For instance, a new congestion control protocol can be compared against common TCP variants, CUBIC, BBR, etc., within the same simulation scenario, with each protocol tested using data generated by a one-time precomputation run. By separating the simulation into a precomputation phase and the main simulation, we significantly decrease the overall time required for the simulation. Statistical validity may require that the same experiment be run multiple times, which again benefits from reusing the pre-compute phase across the different runs.

II. Only few Inter-Satellite Links change over short time periods. Despite the fast mobility of LEO satellites (approximately 27,000 kilometers per hour), the frequency of link changes is relatively low. This observation is demonstrated in Fig.2, which depicts the number of ISL changes (on the y-axis) per minute over the entire Starlink Shell 1, consisting of 1584 satellites, over a span of 25 hours, depicted on the x-axis as minute 0 to minute 1500 (25*60). We use the grid topology to build the ISLs, and the total number of ISLs is 5,346. The figure uses colored dots to indicate the distribution of these ISL changes within each one-minute interval. Specifically, a red dot signifies that the ISL changes occurred in three separate instances within the minute, an orange dot represents ISL changes happening twice within the minute, and a green dot indicates a single occurrence of ISL changes in the minute. For example, in the upper-left part of the figure (highlighted with the blue circle), orange dots reflect 16 ISL changes per minute (i.e., a mere 0.3% of the 5,346 total ISLs), with these changes evenly split into two instances (hence orange), each comprising 8 ISL changes within the one-minute window. This observation suggests that it is unnecessary to update all satellite links continuously; instead, focusing on a small subset of changing links can substantially reduce computational demands.
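
Observation II amounts to diffing two consecutive topology snapshots. The following is a minimal sketch, assuming links are stored as undirected node pairs; the helper names and satellite identifiers are hypothetical, and the final line simply re-derives the 0.3% figure from the numbers quoted above.

```python
def normalize(link):
    """Order the endpoints so that (a, b) and (b, a) denote the same undirected link."""
    a, b = link
    return (a, b) if a <= b else (b, a)


def diff_links(previous, current):
    """Return (added, removed) links between two topology snapshots."""
    prev = {normalize(link) for link in previous}
    curr = {normalize(link) for link in current}
    return curr - prev, prev - curr


prev_links = [("sat1", "sat2"), ("sat2", "sat3"), ("sat3", "sat4")]
curr_links = [("sat2", "sat1"), ("sat2", "sat3"), ("sat4", "sat5")]
added, removed = diff_links(prev_links, curr_links)
print("added:", added, "removed:", removed)

# Share of ISLs that change in the example from the text: 16 of 5,346 links.
changed_fraction = 16 / 5346
print(f"{changed_fraction:.1%}")
```

Only the links in `added` and `removed` would need to be rewired; everything else carries over unchanged from the previous time step.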

III. Many links may not matter in a simulation scenario The necessity to update links can be further minimized by concentrating solely on those links that are crucial to the simulation’s current focus – namely, links that connect endpoints involved in the simulated scenario. For example, if a simulation runs flows on the North-South axis between end hosts in London and Johannesburg, with East-West cross traffic running between different European cities (e.g., Brest, France to Berlin, Germany; Zagreb, Croatia to Zaragoza, Spain, etc.), parts of the constellation over Asia or the Americas may not need to be simulated. By prioritizing updates to links of interest, we can further minimize computational demands.
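
One way to picture Observation III is to route each flow of interest over the topology and keep only the nodes and links on those routes. The sketch below assumes hop-count shortest paths via breadth-first search on a toy graph; the excerpt does not specify xeoverse's routing algorithm, and all node names and helpers here are hypothetical.

```python
from collections import deque


def shortest_path(adjacency, src, dst):
    """Breadth-first search; returns a hop-count shortest path or None."""
    parents = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:  # walk back to the source
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbour in adjacency.get(node, ()):
            if neighbour not in parents:
                parents[neighbour] = node
                queue.append(neighbour)
    return None


def relevant_elements(adjacency, flows):
    """Collect the nodes and undirected links that carry at least one flow."""
    nodes, links = set(), set()
    for src, dst in flows:
        path = shortest_path(adjacency, src, dst)
        if path is None:
            continue  # unreachable flow contributes nothing
        nodes.update(path)
        links.update(tuple(sorted(hop)) for hop in zip(path, path[1:]))
    return nodes, links


adjacency = {
    "ut_london": ["sat_a"],
    "sat_a": ["ut_london", "sat_b", "sat_x"],
    "sat_b": ["sat_a", "gw_johannesburg"],
    "gw_johannesburg": ["sat_b"],
    "sat_x": ["sat_a", "sat_y"],  # part of the constellation that carries no flow
    "sat_y": ["sat_x"],
}
nodes, links = relevant_elements(adjacency, [("ut_london", "gw_johannesburg")])
print(sorted(nodes))
```

In this toy topology `sat_x` and `sat_y` never appear in the result, so their links could be left untouched during the run.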

Driven by Observation I, xeoverse is architecturally divided into two primary components: xeoverse Back Stage and xeoverse Main Stage. xeoverse Back Stage is a set of modules designed to leverage the predictable nature of LEO satellites. This segment of xeoverse is responsible for the preliminary computations of the mega-constellation’s topology, routing paths, and link characteristics. By capitalizing on the predictability of satellite movements, xeoverse Back Stage enables xeoverse to perform simulations in real-time. The output of these modules is then fed to the xeoverse Main Stage (i.e., the Mininet emulation) to play the emulated scenario, and run the required applications.

Driven by Observations II & III, xeoverse Back Stage precomputes the topology changes on a second-by-second basis during the whole simulation timeframe. This information is then used to manage in real-time the connectivity between different nodes of all kinds (satellites, user terminals and gateways), rewiring only those links that have changed in the past second. Collectively, I, II and III thus help ensure D2.
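
The split between pre-computation and playback can be pictured as a replay loop that applies one second's worth of pre-computed changes and then waits for the next wall-clock deadline. This is only a sketch under stated assumptions: `replay`, the schedule layout, and the `apply_change` callback are hypothetical, and in a real setup the callback would reconfigure the emulated links rather than append to a list. The clock and sleep functions are injectable so the pacing logic can be exercised without actually waiting.

```python
import time


def replay(schedule, apply_change, clock=time.monotonic, sleep=time.sleep, step_s=1.0):
    """Apply pre-computed per-step changes, pacing each step to wall-clock time.

    Returns the number of steps that overran their time budget.
    """
    start = clock()
    overruns = 0
    for step, changes in enumerate(schedule):
        for change in changes:  # only the links that differ from the previous step
            apply_change(change)
        deadline = start + (step + 1) * step_s
        remaining = deadline - clock()
        if remaining > 0:
            sleep(remaining)    # finished early: wait for the next tick
        else:
            overruns += 1       # too slow: real-time guarantee was missed
    return overruns


# Usage with a fake clock so the example finishes instantly.
fake_now = [0.0]


def fake_clock():
    return fake_now[0]


def fake_sleep(seconds):
    fake_now[0] += seconds


applied = []
schedule = [
    [("add", "sat1", "sat2")],
    [],                                                  # nothing changed this second
    [("remove", "sat1", "sat2"), ("add", "sat1", "sat3")],
]
overruns = replay(schedule, applied.append, clock=fake_clock, sleep=fake_sleep)
print(overruns, len(applied), fake_now[0])
```

A non-zero `overruns` count is the signal that the simulation has fallen behind real time, which is the failure mode D2 is meant to rule out.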

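The key points of the "xeoverse's Design Approach" section are summarized below:

(1) Desiderata

For a LEO mega-constellation network simulator, the authors set four key design goals:

  • D1 Scalability: must simulate mega-constellations with thousands of satellites (e.g., the 1584 of Starlink Shell 1) and tens of thousands of links
  • D2 Responsiveness: must run in real time, i.e., 1 second of simulation time takes 1 second of wall-clock time, without slowing down under heavy computation, so that timing stays accurate and experiments stay efficient
  • D3 Fidelity: must reproduce the real environment with high fidelity, including RF parameters, antenna characteristics, and weather effects, and support running production-grade transport and application protocols
  • D4 Low Footprint: must run with few computing resources (e.g., a single machine) rather than a large cluster

(2) Existing approaches and their limitations

The authors analyze two prominent approaches and point out that neither easily meets all of the requirements above:

  • Hypatia (built on NS-3):

    • Pros: rich protocol support; easy to develop new protocols
    • Cons: as a discrete-event simulator it struggles when processing a large number of events, so scalability (D1) and responsiveness (D2) are poor
  • StarryNet (built on Docker):

    • Pros: container technology removes some of the bottlenecks of discrete-event simulation
    • Cons:
      • Docker's bridge interface caps a single machine at 1023 nodes, restricting single-machine scalability (D1)
      • Multi-machine deployment adds complexity and resource usage (D4)
      • Because of Python's Global Interpreter Lock (GIL), its multi-threaded link update mechanism becomes a performance bottleneck and hurts responsiveness (D2)

(3) Xeoverse's core design (Xeoverse Overview)

To meet these requirements, Xeoverse adopts the following core design strategies:

Base architecture:

  • Built on Mininet
  • Mininet uses lightweight process-level virtualization and can emulate thousands of nodes on one machine, satisfying low footprint (D4) and scalability (D1)
  • Each node has its own isolated network namespace and can run standard Linux networking tools (iperf, ping) and protocols (BBR, CUBIC), ensuring fidelity (D3)

Optimizations based on the three observations (targeting responsiveness, D2):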


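  1. Observation I: Satellite movements are predictable:

    • Strategy: split the simulation into a Back Stage and a Main Stage. The Back Stage uses TLE data to pre-compute the topology, link characteristics, and routes for every time point
    • Benefit: at run time the simulator only replays pre-computed data, which greatly reduces real-time computation; the pre-computed data can be reused across runs
  2. Observation II: Only few Inter-Satellite Links change over short time periods:

    • Strategy: update only the links that changed instead of rebuilding the whole topology
    • Evidence: the data show that in the Starlink constellation only a tiny fraction of links (about 0.3%) change even within 1 minute
  3. Observation III: Many links may not matter in a simulation scenario:

    • Strategy: update only the nodes and links that carry the currently simulated traffic
    • Benefit: further cuts unnecessary computational overhead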