
Emulating Space Computing Networks with RHONE

The rapid advancement in satellite technology with the adoption of commercial off-the-shelf (COTS) devices and satellite constellation networking has given rise to Space Computing Networks (SCNs). While SCN research is typically conducted on experimental platforms due to high operational costs, the unique challenges of SCNs, such as the harsh space environment (e.g., power and thermal constraints) and dynamic constellation networks, require special consideration. Existing platforms cannot fully replicate the SCN operating environment with high scalability. This paper introduces Rhone, an emulator that bridges these gaps by achieving both satellite- and constellation-level fidelity (the accurate replication of satellite and constellation states, including power, thermal, and network conditions, as well as application performance characteristics) while ensuring usability. Rhone adopts a two-phase emulation approach: i) an offline phase builds power, thermal, orbit, network, and computation models using real satellite telemetry data and hardware-in-the-loop chip mirroring, and ii) an online phase executes container-based emulation integrated with these models. Two key components, the satellite COTS aligner and the satellite network aligner, dynamically align the containers with real satellite conditions. Evaluation shows Rhone’s scalability to 700 satellites on a single node, with power and computation model errors under 5% and thermal model errors within 1.3–2.5 °C. Two case studies, a satellite network energy drain attack and a real-time earth observation application, demonstrate Rhone’s capability to emulate satellite- and constellation-level dynamics.

卫星技术随着商用现成设备(COTS)和卫星星座网络的快速发展,催生了空间计算网络(SCN)。由于运营成本高昂,SCN研究通常在实验平台上进行,但SCN所面临的独特挑战如严酷的空间环境(例如功率和热量限制)及动态的星座网络,需要特殊考虑。现有平台无法在高可扩展性条件下完全复制SCN的运行环境。

本文提出了Rhone,一种能够实现卫星级和星座级高保真度(精确复制卫星及星座状态,包括功率、热量、网络条件以及应用性能特征)且具有良好可用性的仿真器。

Rhone采用两阶段仿真方法:i)离线阶段利用真实卫星遥测数据和硬件在环芯片镜像建立功率、热量、轨道、网络及计算模型;ii)在线阶段结合上述模型执行基于容器的仿真。

关键组件“卫星COTS对齐器”和“卫星网络对齐器”动态将容器与真实卫星状态对齐。评估显示,Rhone在单节点可扩展至700颗卫星,功率和计算模型误差低于5%,热量模型误差为1.3–2.5°C。两项案例研究——卫星网络能耗攻击与实时地球观测应用,验证了Rhone仿真卫星和星座动态的能力。

Introduction

Low Earth orbit (LEO) satellites are evolving from large, singular, and expensive nodes focused solely on communication into networked computing systems in space, which we term Space Computing Networks (SCNs). Behind this are two technological trends: i) Adoption of COTS devices. Conventional low-performance computing components specialized for satellites (e.g., 50 MHz CPU with 64 MB RAM [1]) are being replaced by commercial off-the-shelf (COTS) hardware (e.g., Raspberry Pi [2], FPGA [3], GPU [4]), which offers better performance at lower cost. ii) Satellite constellation networking. Satellites are no longer limited to low-data-rate communication; they now form high-bandwidth constellations providing global coverage [5, 6, 7, 8, 9, 10]. Starlink [5], with over 6,000 LEO satellites, provides global connectivity with approximately 100 Mbps downlink and 10 Mbps uplink speeds [7]. The Planet Dove constellation, comprising 150 satellites, specializes in Earth observation tasks [11].

SCNs have revolutionized two categories of satellite applications: i) Earth observation (EO). SCNs employ COTS devices for in-orbit image processing, such as compression, filtering, and detection [12, 13, 14, 15, 16], reducing the volume of data transmitted. Satellite constellation networking eliminates the need to wait for satellites to pass over ground stations for data transmission. Together, these enable near-real-time EO in both industry [11, 17, 18] and academia [19, 15, 16] with a SOTA capture-to-insight latency of 6 minutes [15], while conventional approaches take tens of hours to months, as reported by [15, 18]. Academia is actively exploring EO with higher efficiency and lower latency [19, 20], jointly involving onboard processing and data transmission for captured images. ii) Satellite-backboned global networking. Satellites are providing global network services, where on-satellite COTS devices enable the potential of satellites as in-orbit computing nodes to perform network functions. For example, satellites can serve as meet-up servers processing and routing video streams for cross-continent video conferencing [21]. Satellites can also integrate cellular core functionalities [22, 23, 24, 25, 26], enabling the processing of heavy user traffic (e.g., User Plane Function (UPF) in the 5G core) using COTS chips. Satellite network attacks and countermeasures also raise concerns [27, 28], as satellites are more susceptible to malicious threats (e.g., DDoS energy drain attacks) due to their harsh operating environments and predictable orbital trajectories.

低轨道卫星正从大型、单一且昂贵的通信节点转变为空间中的网络计算系统,即空间计算网络(SCN)。其背后有两个技术趋势:

一是采用商用现成设备(COTS),取代传统低性能但专用的卫星计算部件,如使用树莓派、FPGA、GPU等性能更优且成本更低的组件

二是卫星星座网络,实现高带宽、全球覆盖的网络通信

Starlink星座拥有6000多颗低轨卫星,提供约100 Mbps下行和10 Mbps上行速率;Planet Dove星座则专注于地球观测任务。

SCN改变了两类卫星应用:

一是地球观测,实现轨道内图像处理如压缩、滤波和识别,减少数据传输量,星座联网消除地面站传输等待,实现行业和学术界近实时观测(延迟仅6分钟)

二是卫星骨干全球网络,COTS设备使卫星可作为轨道计算节点,执行视频流处理、路由和移动通信核心功能。卫星网络攻击(如DDoS能耗攻击)与防御问题也日益突出,因卫星操作环境恶劣且轨道可预测

However, SCN applications are hard to design, build and evaluate. We identify two unique characteristics of SCN that distinguish it from ground-based computing and networking: i) satellite COTS chips, which are not originally designed for the space environment, face energy and thermal constraints. Overlooking these constraints results in low application performance or even serious harm to the satellite, such as overheating or power failure [2, 29, 30, 31]; ii) the dynamic nature of constellation networks (e.g., frequent topology changes [32, 33, 22] and node failures [2, 31, 34, 35]) leads to poor network stability and performance, and should be properly addressed.

Therefore, SCN applications must be designed, built and evaluated with careful consideration of the space environment to ensure proper operation and high performance. However, the high costs of operating or accessing a real constellation (e.g., $150,000 to launch a small satellite [36]) and increasing constellation sizes make it impractical to use real satellites for SCN research. Consequently, SCN research relies on practical experimental platforms that accurately replicate the space environment, i.e., achieving i) satellite-level fidelity, involving heterogeneous COTS devices onboard and environmental context (i.e., energy and thermal) they face; ii) constellation-level fidelity, involving complexity and dynamics of constellation networks, in terms of both topology changes and node failures; iii) high usability, scaling to encompass entire constellations while being cost-effective, flexible, and user-friendly. However, existing tools fall short of meeting all these requirements (§2.3): i) Physical setups [5, 37] offer high fidelity but are difficult to access and lack scalability; ii) Simulators [38, 39, 40, 41, 42, 13] use analytical modeling to simulate the space environment but lack system/network stacks and support for real system functionalities or interactive network traffic [43]; iii) Emulators [44, 45, 43] incorporate constellation network models with virtualized environments (e.g., containers or virtual machines). However, they overlook heterogeneous satellite COTS computing devices facing space environment, especially energy and thermal limitations, and therefore cannot mimic COTS chips’ performance, as our measurement study with real in-space satellites shows.

不过,SCN应用设计构建评估困难。其两大特点区别于地面计算网络:一是卫星COTS芯片非为空间环境设计,受能量和热量限制,忽视会导致性能下降甚至设备损害;二是星座网络动态变化频繁,拓扑波动和节点故障导致网络不稳定需合理处理。

因此,SCN研究需在充分考虑空间环境前提下进行,但高昂的卫星发射与运行成本及日益庞大的星座规模使得使用真实卫星不现实。

现有实验平台需满足卫星级高保真(异构COTS设备及其环境功率、热量仿真)、星座级高保真(网络拓扑及节点动态)、高可用性(可扩展、成本效益、用户友好)三方面要求。

然而,现有平台难以兼顾:

  • 实体设备高保真但难以扩展
  • 仿真软件架构简单缺乏真实系统功能与交互
  • 现有仿真器忽略了COTS设备在空间环境下的计算性能影响

In this paper, we present Rhone, an SCN emulator that meets all the requirements above to support SCN research. The challenges in building Rhone stem from a conflict: because the computation performance of onboard chips is influenced by various factors such as operating environment, hardware capacity, and application workload, accurate emulation of an individual satellite requires large system overhead (e.g., CPU and memory) or high cost (e.g., hardware device expenses). When scaling to the constellation level, however, the total overhead or cost of emulating an SCN constellation exceeds what scalability and affordability allow. To address this, Rhone adopts a two-phased emulation process: it builds models based on real data collected from in-orbit satellites and onboard COTS devices, and then scales the models to the constellation level via a lightweight container-based architecture with the models installed. i) Offline model building phase: we analyze telemetry data from in-space satellites and onboard COTS chips using (a) an open-source dataset [2] and (b) processed flight data which we collected from the Tiansuan constellation. We have released the processed version of the latter on GitHub, containing over 800,000 timestamped entries spanning December 23, 2021 to July 14, 2022. We build power, thermal, orbit, and network models based on such data to replicate the space environment, and the computation model of COTS chips using hardware-in-the-loop chip mirroring; ii) Online emulation phase: we use container-based emulation integrating the models built in the first phase to mirror the space environment. We design the satellite COTS aligner to mirror COTS computing devices through profiling and dynamic resource tuning to ensure satellite-level fidelity. We also integrate the satellite network aligner to manage the container network to mirror constellation-level network dynamics. Finally, we provide a set of easy-to-use monitor and command APIs for users to run unmodified applications.

We evaluate the scalability and fidelity of Rhone. Our scalability evaluation shows that Rhone scales to 700 satellites on a single physical machine, matching the size of Starlink shells and comparable to the SOTA emulator [43]. In our fidelity evaluation, Rhone shows < 5% error in its power and computation models, and 1.3–2.5 °C average error in its temperature model, compared with telemetry data we collected from in-orbit satellites as ground truth.

To illustrate how Rhone can advance SCN research, we present two use cases of SCN with Rhone: i) Satellite network energy drain attack. We emulate an energy drain attack [27, 28] in Rhone, where a victim satellite is attacked by malicious heavy traffic load, leading to intensive power usage, rapid battery drain, and degraded network performance. Rhone emulates both satellite-level power consumption and constellation-level network characteristics. ii) Real-time earth observation. We emulate a real-time EO application, comparing four processing strategies (direct transmission, compression, onboard filtering, and onboard inference) in terms of power usage, data transmission volume and end-to-end latency. Rhone emulates both satellite-level in-orbit edge computing for image processing and constellation-level network for data transmission.

本文提出的Rhone应对这一矛盾。由于芯片性能受环境、硬件和负载影响,个体卫星准确仿真需高系统资源或成本,规模放大后无法满足扩展经济性。

Rhone以两阶段仿真流程应对:

  1. 首先分析轨道真实卫星和COTS设备数据,建立 功率、热量、轨道、网络模型及硬件镜像的计算模型
  2. 然后用轻量级容器架构与模型结合,在星座级进行 在线动态仿真

卫星COTS对齐器通过性能轮廓与资源动态调优,确保卫星级一致性;卫星网络对齐器管理容器网络以再现星座级网络动态。提供便捷监控与指令API可运行未修改应用。

评估表明,Rhone单机可扩展至700颗卫星,功率和计算模型误差<5%,温度误差平均1.3-2.5°C,数据以轨道卫星遥测为基准。应用案例涵盖能耗攻击模拟和实时地球观测四种图像处理策略,展示Rhone在电力消耗、数据传输和延迟方面的仿真能力。

Contributions.

• We present Rhone, a full-fledged emulator dedicated to future SCN research (§3);

• We build models of satellites’ operating environment and COTS computing performance by collecting and analyzing real satellite telemetry data (§4), and integrate models with online emulation (§5);

• We conduct experiments to evaluate Rhone’s scalability and fidelity (§7), and demonstrate its ability to advance SCN research by two use cases (§8).

This study does not raise any ethical issue.

我们的贡献:

  • 提出面向未来SCN研究的完整仿真器Rhone;
  • 基于真实遥测数据构建卫星功率、热量、轨道及计算性能模型,并整合至在线仿真;
  • 评估其规模扩展性及仿真准确性,并展示两典型SCN应用场景。

本研究无伦理问题。

Background and Motivation

2.1 Research Challenges and Opportunities

While the rise of SCN offers numerous opportunities for the research community, the unique characteristics of satellite platforms and their operating environments present significant challenges for researchers in exploring the architecture and optimization of emerging applications.

尽管空间计算网络(Space Computing Networks, SCN)的兴起为研究界带来了众多机遇,但卫星平台及其运行环境的独特性,为研究人员在探索新兴应用的架构和优化方面带来了巨大挑战。

Energy and thermal dynamics and constraints. Satellites in space operate under harsh environmental conditions, leading to significant performance degradation compared to those on the ground. Regarding energy, as both energy harvesting capability is influenced by sunlit regions and onboard battery storage is limited, real satellite measurements [2] indicate that the instability in energy supply and limitations on battery discharge depth affect the power available to onboard computing devices, thereby influencing their availability and efficiency. In terms of thermal management, as satellites generally lack active cooling mechanisms to reduce costs, chips can automatically reduce their clock speeds due to thermal throttling. Short-term overheating can reduce the satellite’s computing throughput and prolong task execution latency. Long-term exposure to high temperatures can diminish the reliability and lifespan of COTS devices. As a result, there are opportunities for SCN researchers to consider energy and thermal dynamics and constraints when designing and optimizing SCN applications (e.g., allocate more tasks in sunlit regions and fewer in shadows).

能源与热动态及约束

空间中的卫星在恶劣的环境条件下运行,导致其性能相较于地面设备显著下降。在能源方面,由于能量收集能力受光照区域影响且星上电池存储有限,真实的卫星测量数据 [2] 表明,能源供应的不稳定性及电池放电深度的限制会影响星上计算设备的可用功率,从而影响其可用性和效率。

在热管理方面,由于卫星为降低成本通常缺乏主动冷却机制,芯片会因热节流(thermal throttling)而自动降低其时钟频率:

  • 短期过热会降低卫星的计算吞吐量并延长任务执行延迟
  • 长期暴露在高温下会降低商用现成(COTS)设备的可靠性和寿命

因此,SCN研究人员在设计和优化SCN应用时,有机会考虑能源和热动态及约束(例如,在光照区分配更多任务,在阴影区分配更少任务)。

Constellation network dynamics. Different from terrestrial networks, constellation networks in space are highly dynamic. These dynamics manifest in two ways: i) Topology dynamics. The high velocities of satellites relative to each other and to fixed ground stations lead to highly dynamic connectivity and frequent topology changes. As a result, SCN researchers must develop more robust routing schemes [46], topology design [47, 48] and satellite-ground collaboration strategies [19]; ii) Node failure. COTS devices in space are more vulnerable than those on Earth. In addition to cosmic radiation [34], their performance can also be affected by energy and thermal issues, as mentioned above. Recent studies [27, 28] show that malicious network traffic can trigger energy drain attacks on satellites, leading to node failures. Consequently, SCN researchers must address these node failures and their impact on constellation networks.

星座网络动态性

与地面网络不同,空间中的星座网络具有高度动态性。这种动态性体现在两个方面:

i) 拓扑动态性

卫星之间以及卫星与地面固定站之间的高速相对运动,导致了连接关系的高度动态和拓扑的频繁变化。因此,SCN研究人员必须开发更鲁棒的路由方案 [46]、拓扑设计 [47, 48] 和星地协同策略 [19]

ii) 节点故障

空间中的COTS设备比地面上的更脆弱。除了宇宙辐射 [34],其性能还会受到前述能源和热问题的影响。近期研究 [27, 28] 表明,恶意网络流量可能引发对卫星的能量耗尽攻击,从而导致节点故障。因此,SCN研究人员必须解决这些节点故障及其对星座网络的影响

2.2 SCN Emulator Essential Elements

The complex and dynamic operating environment of SCN, combined with the extremely high costs associated with developing a mega-constellation in space, underscores the urgent need for a comprehensive SCN emulator for the research community to optimize constellation-level applications that integrate computing and networking on real satellites. Based on the preceding discussion, a comprehensive emulator should meet the following requirements to mimic the unique characteristics of SCN:

Satellite-level fidelity. A comprehensive emulator should accurately model the operating environment, computational capability, and application performance of individual satellites. This modeling must encompass both the environment and the devices: i) Contextual environmental awareness. The emulator should account for the effects of real satellite operational contexts on applications. For example, energy availability and thermal control should influence computing capacity and efficiency, affecting end-to-end latency in real-time EO applications. ii) Replication of realistic heterogeneous computing devices. The emulator should precisely replicate the computation, storage, and communication capacities of satellite COTS hardware, such as Nvidia’s TX1 and TX2 (CPU+GPU) [49], Xilinx’s Zynq (CPU+FPGA) [3], and Huawei’s Atlas 200DK (CPU+NPU) [2].

Constellation-level fidelity. To emulate constellation networking, the emulator must replicate the complexity and dynamics of entire constellations. This includes the impact of topology changes, satellite-ground communication handoffs, ISL bandwidth, and node failures on network performance.

High usability. The emulator must be highly usable in three aspects: i) scalability to support simulations of extensive satellite constellations containing thousands of satellites while maintaining accurate modeling of each satellite’s performance and inter-satellite networking; ii) flexibility to run real applications and network stack provided by users; iii) accessibility with easy-to-use interfaces and low cost.

SCN复杂且动态的运行环境,加上在太空中开发巨型星座的极高成本,凸显了研究界对于一个综合性SCN仿真器的迫切需求,以便在真实卫星上优化集成了计算与网络的星座级应用。基于前述讨论,一个综合性仿真器应满足以下要求,以模拟SCN的独特性:

(1) 卫星级保真度

一个综合性仿真器应能精确建模单个卫星的运行环境、计算能力和应用性能。这种建模必须同时涵盖环境和设备两个方面:

i) 上下文环境感知: 仿真器应能考虑真实卫星运行上下文对应用的影响。例如,能源可用性和热控制应能影响计算能力和效率,进而影响实时对地观测(EO)应用的端到端延迟

ii) 真实异构计算设备的复现: 仿真器应能精确复现卫星COTS硬件的计算、存储和通信能力,例如Nvidia的TX1和TX2 (CPU+GPU) [49]、Xilinx的Zynq (CPU+FPGA) [3] 以及华为的Atlas 200DK (CPU+NPU) [2]

(2) 星座级保真度

为模拟星座网络,仿真器必须复现整个星座的复杂性和动态性。这包括拓扑变化、星地通信切换、星间链路(ISL)带宽和节点故障对网络性能的影响。

(3) 高可用性

仿真器必须在三个方面具有高可用性:

i) 可扩展性:支持包含数千颗卫星的大规模星座仿真,同时保持对每颗卫星性能和星间网络的精确建模

ii) 灵活性:能够运行用户提供的真实应用程序和网络协议栈

iii) 可及性:具备易于使用的接口且成本低廉

2.3 Limitations of Existing SCN Emulators

Although many simulators/emulators have been proposed in recent years [38, 39, 40, 41, 42, 13, 44, 45, 43], as shown in Tab. 1, they fall short of meeting all these requirements, particularly in accurately modeling the operational context’s impact on networking and computing performance.

Physical setups. Commercial satellite constellations [5] and open-sourced experimental satellite platforms [37] with physical setups offer the highest fidelity for both individual satellites and constellations. However, commercial platforms do not provide researchers with experimental interfaces. Open-source platforms allow some experimentation, but lack support for freely running user-defined applications and face scalability limitations tied to constellation size.

Simulators. To address the issues of scalability, flexibility, and accessibility, the research community has turned to the development of analytical simulators [38, 39, 40, 41, 42, 13]. These simulators rely on theoretical modeling without real software and network stacks, limiting their ability to simulate the actual behavior of computing or networking devices. Additionally, most of them also do not allow users to run custom applications or algorithms.

Emulators. This widening gap highlights the growing need for emulators [44, 45, 43] that incorporate virtual environments (e.g., containers or virtual machines), thus enabling real system/network stacks to maintain system fidelity. However, due to the lack of real in-orbit data from COTS devices on satellites, even the current state-of-the-art (SOTA) solutions struggle to accurately replicate the actual environmental context, including energy and thermal influences, as well as the characteristics of onboard computing devices.

尽管近年来已提出了许多模拟器(simulators)/仿真器(emulators)[38, 39, 40, 41, 42, 13, 44, 45, 43],如表1所示,但它们仍未能满足所有这些要求,特别是在精确建模运行上下文对网络和计算性能的影响方面。


物理平台

带有物理设备的商业卫星星座 [5] 和开源实验卫星平台 [37] 为单个卫星和星座提供了最高的保真度。然而,商业平台不为研究人员提供实验接口。开源平台允许进行一些实验,但缺乏对自由运行用户自定义应用的支持,并面临与星座规模相关的可扩展性限制。

模拟器 (Simulators)

为解决可扩展性、灵活性和可及性问题,研究界转向开发分析性模拟器 [38, 39, 40, 41, 42, 13]。这些模拟器依赖于理论建模,没有真实的软件和网络协议栈,这限制了它们模拟计算或网络设备实际行为的能力。此外,它们中的大多数也不允许用户运行自定义的应用程序或算法。

仿真器 (Emulators)

这一日益扩大的差距凸显了对仿真器 [44, 45, 43] 日益增长的需求,这些仿真器集成了虚拟环境(例如,容器或虚拟机),从而能够运行真实的系统/网络协议栈以保持系统保真度。然而,由于缺乏来自卫星上COTS设备的真实在轨数据,即使是当前最先进的(SOTA)解决方案也难以精确复现实际的环境上下文,包括能源和热影响,以及星上计算设备的特性。


Motivating examples: low emulation fidelity. We analyze the power, thermal, and computing capacity dynamics of COTS devices onboard with the open-source dataset from an in-space satellite BUPT-1 [2] running DNN inference tasks on a Raspberry Pi. The dataset includes telemetry data of power, temperature, and onboard tasks. With this ground-truth data, we highlight two motivating examples that illustrate the limitations of existing SOTA emulators in achieving high-fidelity emulation of individual satellites:

启发性案例:低仿真保真度

我们利用来自在轨卫星“北邮一号”(BUPT-1)[2] 的开源数据集,分析了其星上COTS设备的功率、热和计算能力动态,该卫星在树莓派(Raspberry Pi)上运行DNN推理任务。数据集包括功率、温度和星上任务的遥测数据。

基于这些真值数据(ground-truth data),我们强调了两个启发性案例,这两个案例说明了现有SOTA仿真器在实现单个卫星高保真度仿真方面的局限性:

i) Inaccurate power model. Cote [13], the SOTA satellite computing simulator, provides a power model for individual satellites. We compare it with the actual satellite harvested power trace, as shown in Fig. 1a. The power model in Cote assumes the harvested power in the sunlit region to be constant. In reality, the harvested power varies with the satellite’s position, attitude, placement of solar panels, and the satellite’s orientation. An inaccurate power model fails to simulate the effects of power fluctuations on computing or networking devices, such as reduced availability during power shortages, which Cote does not accurately represent.

ii) Inaccurate computation capacity. StarryNet [43], the SOTA satellite networks emulator, uses Docker containers to emulate computing devices on individual satellites. We replicate the onboard DNN inference experiments within a container of StarryNet and compare the inference latency with those obtained from Raspberry Pi on BUPT-1. As shown in Fig. 1b, the execution latency of an application in its containers deviates significantly from that of an in-orbit COTS device. The YOLOv3 and YOLOv5 execution latency fluctuations of onboard chips are 330% and 160% larger than emulated, respectively. The application’s average latency in the container differs by up to 15% compared to the real in-orbit COTS devices. Moreover, StarryNet fails to simulate the increase in inference latency caused by chip throttling due to overheating (initial 25 min in Fig. 1b). These inaccuracies arise because StarryNet simulates devices’ capabilities by adjusting CPU core counts and frequencies. However, this method does not capture the inherent characteristics of COTS devices, which are often heterogeneous (e.g., Jetson TX2’s CPU + GPU). Additionally, COTS devices and containers operate on different computing architectures, leading to different performance.


i) 不准确的功率模型

Cote [13],作为SOTA的卫星计算模拟器,为单个卫星提供了一个功率模型。我们将其与真实的卫星能量收集轨迹进行比较,如图1a所示。

Cote中的功率模型假设在光照区的收集功率是恒定的。 实际上,收集到的功率随卫星的位置、姿态、太阳能电池板的布局以及卫星的朝向而变化。 一个不准确的功率模型无法模拟功率波动对计算或网络设备的影响,例如在电力短缺期间可用性的降低,而Cote未能准确表示这一点。

ii) 不准确的计算能力

StarryNet [43],作为SOTA的卫星网络仿真器,使用Docker容器来模拟单个卫星上的计算设备。我们在StarryNet的容器内复现了星上DNN推理实验,并将其推理延迟与从“北邮一号”上的树莓派获得的数据进行比较。

如图1b所示,应用程序在其容器中的执行延迟与在轨COTS设备的执行延迟存在显著差异。星上芯片运行YOLOv3和YOLOv5的执行延迟波动分别比仿真结果大330%和160%。容器中应用的平均延迟与真实在轨COTS设备相比,差异高达15%。

此外,StarryNet未能模拟出因过热导致芯片降频而引起的推理延迟增加(如图1b中最初的25分钟所示)。

这些不准确性源于StarryNet通过调整CPU核心数和频率来模拟设备能力。然而,这种方法无法捕捉COTS设备固有的异构特性(例如,Jetson TX2的CPU + GPU)。此外,COTS设备和容器在不同的计算架构上运行,导致了性能差异。

Rhone Overview

We present Rhone, a comprehensive SCN emulator meeting all requirements discussed in §2.2. As shown in Fig. 2, the emulation process of Rhone consists of two phases: Offline Model Building (§4) and Online Emulation (§5). In the model building phase, Rhone initially constructs power, thermal, computation, orbit, and network models offline using telemetry and experimental data collected from real satellites and onboard COTS devices. These models replicate the complex and dynamic SCN environment, effectively mirroring the behavior of onboard chips. In the emulation phase, Rhone deploys these pre-built models into software containers to emulate each satellite node and their interactions. This two-phased design ensures high-fidelity emulation of individual satellites while scaling to the constellation level at a low cost. Users can interact with the system via monitoring and control APIs to run custom applications.

我们提出了Rhone,一个满足2.2节所述全部要求的综合性SCN仿真器。

如图2所示,Rhone的仿真过程包括两个阶段:离线模型构建(第4节)和在线仿真(第5节)

alt text

(1) 模型构建阶段:

Rhone首先利用从真实卫星和星上COTS设备收集的遥测与实验数据,离线构建功率、热、计算、轨道和网络模型。

这些模型复现了复杂且动态的SCN环境,从而有效地模拟了星上芯片的行为。

(2) 仿真阶段:

Rhone将这些预构建的模型部署到软件容器中,以仿真每个卫星节点及其相互作用。

(3) TL;DR:

这种两阶段设计在确保对单个卫星进行高保真度仿真的同时,能够以低成本将仿真规模扩展至星座级别。用户可以通过监控与控制API与系统交互,以运行自定义应用程序。

Online Emulation

During the online emulation phase, Rhone leverages the models built in the offline phase (§4) to manage system resources and replicate the SCN environment.

在 "在线仿真" 阶段, Rhone利用在前一节 "离线阶段" 建立的模型, 管理系统资源以及复现SCN环境

5.1 Emulated Satellite Nodes

In Rhone, each satellite node is represented by a state manager and a set of Docker containers, each representing a COTS computing chip on the satellite.

State manager. The state manager is responsible for managing the satellite’s emulation states, serving as the central point for interactions. A context simulation agent interfaces with pre-established models (§4) that capture the satellite’s physical and operational characteristics. It continuously fetches the generated traces from the data store: energy harvesting and consumption, temperature variation, network connectivity and properties from the orbit and network models, and COTS hardware performance data from the computation model. Rhone coordinates the nested interaction of the models iteratively (Fig. 9). For example, the power model reads the current chip temperature \(T_{hottest}\) and outputs the chip power value, which the thermal model then uses to derive the new \(T_{hottest}\) for the next time step. Additionally, the agent handles incoming commands and monitoring requests through the user API, executing commands to alter the satellite’s states and providing real-time status updates as needed. By maintaining an accurate representation of the satellite’s current state, the state manager relays critical information to the COTS chip emulation containers, ensuring that the emulation remains consistent with the satellite’s operational context.
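The iterative coupling between the power and thermal models can be illustrated with a minimal sketch. The model forms and every coefficient below (`power_model`, `thermal_model`, idle power, heat capacity, ambient temperature) are hypothetical placeholders chosen for illustration; they are not Rhone's fitted models, which are built from telemetry in §4.

```python
from dataclasses import dataclass


@dataclass
class ChipState:
    temperature_c: float  # T_hottest at the current time step
    power_w: float = 0.0  # chip power computed for this step


def power_model(temperature_c: float, load: float) -> float:
    """Hypothetical power model: idle power, load-dependent power,
    and a small temperature-dependent leakage term."""
    idle_w, max_dynamic_w, leakage_w_per_c = 2.0, 4.0, 0.01
    return idle_w + max_dynamic_w * load + leakage_w_per_c * temperature_c


def thermal_model(temperature_c: float, power_w: float, dt_s: float) -> float:
    """Hypothetical first-order thermal model: the chip heats with dissipated
    power and passively loses heat toward an ambient temperature."""
    ambient_c, heat_capacity_j_per_c, loss_w_per_c = 10.0, 50.0, 0.2
    net_heat_w = power_w - loss_w_per_c * (temperature_c - ambient_c)
    return temperature_c + net_heat_w / heat_capacity_j_per_c * dt_s


def step(state: ChipState, load: float, dt_s: float = 1.0) -> ChipState:
    # 1) The power model reads the current T_hottest and outputs chip power.
    power_w = power_model(state.temperature_c, load)
    # 2) The thermal model uses that power to derive T_hottest for the next step.
    next_temp_c = thermal_model(state.temperature_c, power_w, dt_s)
    return ChipState(temperature_c=next_temp_c, power_w=power_w)


def run(initial_temp_c: float, loads: list[float], dt_s: float = 1.0) -> list[ChipState]:
    """Advance the coupled models once per entry in `loads` (load in [0, 1])."""
    state = ChipState(initial_temp_c)
    trace = []
    for load in loads:
        state = step(state, load, dt_s)
        trace.append(state)
    return trace


if __name__ == "__main__":
    for s in run(10.0, [1.0] * 5):
        print(f"T_hottest={s.temperature_c:.3f} C, power={s.power_w:.3f} W")
```

Each step feeds the temperature produced by the previous step back into the power model, which is the nesting the paragraph above describes.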

COTS chip emulation containers. These containers are designed for emulating COTS computing hardware typically equipped on modern satellites, such as Raspberry Pi [2], Jetson [66, 14], or NPUs [2]. Each container reflects the performance of its corresponding chip under various conditions by operating with computing resources dynamically adjusted by the satellite COTS aligner (§5.2) based on the current physical state. For instance, an ARM CPU is represented by a CPU container, while a GPU container models hardware like NVIDIA Jetson GPU or Huawei Atlas NPU. This structure, using containers with dynamic computation resources instead of hardware-based emulation, offers flexibility and scalability with low cost. Additionally, the accuracy is ensured through precise modeling of chip performance and careful handling of resource constraints.

在Rhone中,每个卫星节点由一个状态管理器和一组Docker容器表示,每个容器代表卫星上的一个COTS计算芯片。

状态管理器: 状态管理器负责管理卫星的仿真状态,作为交互的中心点。

一个上下文模拟代理与预先建立的模型(第4节)交互,这些模型捕捉了卫星的物理和运行特性。它从数据存储中持续获取生成的轨迹数据,包括从轨道和网络模型中获取的能量收集与消耗、温度条件变化、网络连接性与属性,以及从计算模型中获取的COTS硬件性能数据。Rhone以迭代方式协调模型的嵌套交互(图9)。

alt text

例如,功率模型读取当前芯片最高温度 \(T_{hottest}\),并输出芯片功率值,热模型随后使用该值来推导下一时间步的新芯片温度值 \(T_{hottest}\)。此外,该代理通过用户API处理传入的命令和监控请求,执行命令以改变卫星状态,并根据需要提供实时状态更新。通过维持对卫星当前状态的精确表示,状态管理器将关键信息传递给COTS芯片仿真容器,确保仿真与卫星的运行上下文保持一致。

COTS芯片仿真容器: 这些容器旨在仿真现代卫星上通常配备的COTS计算硬件,如树莓派 [2]、Jetson [66, 14] 或 NPU [2]。

每个容器通过运行由卫星COTS对齐器(§5.2)根据当前物理状态动态调整的计算资源,来反映其对应芯片在各种条件下的性能。例如,一个ARM CPU由一个CPU容器表示,而一个GPU容器则模拟像NVIDIA Jetson GPU或华为Atlas NPU这样的硬件。这种使用动态计算资源的容器而非基于硬件的仿真结构,以低成本提供了灵活性和可扩展性。此外,通过对芯片性能的精确建模和对资源约束的谨慎处理,确保了仿真的准确性。

5.2 Satellite COTS Aligner

We design the satellite COTS aligner (SCA) to dynamically align the performance of emulated COTS hardware under emulated satellite conditions, based on offline modeling of real satellite COTS chips (§4.3) and the current temperature given by the thermal model (§4.2). It adjusts resource allocations in real time to ensure the emulated environment accurately reflects the satellite hardware’s behavior.

Resource allocation. Upon invocation, SCA uses the outputs from pre-built models to map the current chip temperature to application execution latency. As illustrated in Fig. 2, SCA first retrieves the current chip temperature, calculated by the thermal model (e.g., 15 °C). It then finds the corresponding execution latency in the satPerfTable, as determined by the computation model (e.g., 3.0 s). Using this expected latency, SCA queries the emuPerfTable to determine the required resource allocation (e.g., 90%) in the container, ensuring accurate performance replication.
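The two-table lookup can be sketched as follows. Only the table names `satPerfTable` and `emuPerfTable` and the example triple (15 °C, 3.0 s, 90%) come from the text; the other table entries, the linear interpolation between entries, and the function names are assumptions made for illustration.

```python
from bisect import bisect_left

# Hypothetical satPerfTable: (chip temperature in deg C, expected latency in s),
# sorted by temperature. Latency rises once the chip starts to throttle.
SAT_PERF_TABLE = [(0.0, 3.0), (45.0, 3.0), (60.0, 3.6), (75.0, 4.5)]

# Hypothetical emuPerfTable: (latency in s measured in the container,
# CPU fraction that produced it), sorted by latency.
EMU_PERF_TABLE = [(2.7, 1.00), (3.0, 0.90), (3.6, 0.75), (4.5, 0.60), (5.4, 0.50)]


def interpolate(table: list[tuple[float, float]], x: float) -> float:
    """Piecewise-linear lookup in a table sorted by its first column,
    clamped to the first and last entries outside the profiled range."""
    xs = [row[0] for row in table]
    if x <= xs[0]:
        return table[0][1]
    if x >= xs[-1]:
        return table[-1][1]
    i = bisect_left(xs, x)  # xs[i - 1] < x <= xs[i]
    (x0, y0), (x1, y1) = table[i - 1], table[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def resource_allocation(chip_temperature_c: float) -> float:
    """Map chip temperature -> expected satellite latency -> container CPU fraction."""
    expected_latency_s = interpolate(SAT_PERF_TABLE, chip_temperature_c)
    return interpolate(EMU_PERF_TABLE, expected_latency_s)


if __name__ == "__main__":
    for temp_c in (15.0, 60.0, 70.0):
        print(f"{temp_c:.0f} C -> CPU fraction {resource_allocation(temp_c):.2f}")
```

Interpolating between profiled points is one possible design choice; a nearest-entry lookup would work equally well if the tables are densely profiled.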

Dynamic resource tuning. With the determined resource allocation scheme, SCA adjusts the container’s CPU/GPU share to match the satellite’s expected performance. This adjustment is based on the assumption that the computational power of satellite COTS hardware is weaker than that of ground servers (e.g., the satellite’s CPU is weaker than a server’s CPU, and its NPU/GPU is weaker than a server’s GPU). Therefore, we can slow down the ground server’s CPU and GPU by limiting their resources, bringing their performance closer to that of the satellite’s COTS hardware.

For CPU resource tuning, SCA works by adding the process to a certain cgroup, and periodically adjusting the cgroup’s configured CPU share value, which artificially and accurately controls latency. For GPU adjustment, SCA dynamically manages and limits GPU performance by intercepting CUDA driver API calls and monitoring activity. Rhone initializes a rate limit manager that reads GPU performance thresholds from a configuration file. During each CUDA call, a rate limiter checks GPU usage against these thresholds and applies delays if the usage exceeds the limit.
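A rough sketch of both tuning paths is given below. It assumes a cgroup v2 hierarchy, where a hard CPU cap is written to the `cpu.max` file as `"<quota_us> <period_us>"`; the text only says that a "CPU share value" is adjusted, so the choice of `cpu.max` is an assumption. The `GpuRateLimiter` class is likewise a hypothetical stand-in for the delay logic applied around intercepted CUDA calls; the interception itself is not shown.

```python
from pathlib import Path

CFS_PERIOD_US = 100_000  # commonly used CFS bandwidth period (100 ms)


def cpu_max_value(cpu_fraction: float, n_cpus: int = 1,
                  period_us: int = CFS_PERIOD_US) -> str:
    """Format a cgroup v2 cpu.max value granting `cpu_fraction` of `n_cpus` CPUs."""
    if not 0.0 < cpu_fraction <= 1.0:
        raise ValueError("cpu_fraction must be in (0, 1]")
    # Keep the quota at or above 1 ms so that very small fractions stay valid.
    quota_us = max(1000, round(cpu_fraction * n_cpus * period_us))
    return f"{quota_us} {period_us}"


def apply_cpu_limit(cgroup_dir: Path, cpu_fraction: float, n_cpus: int = 1) -> str:
    """Write the limit into <cgroup_dir>/cpu.max (the directory is an assumption;
    on a real host it would be the container's cgroup and needs privileges)."""
    value = cpu_max_value(cpu_fraction, n_cpus)
    (cgroup_dir / "cpu.max").write_text(value + "\n")
    return value


class GpuRateLimiter:
    """Hypothetical delay-based limiter: given the GPU utilization observed over
    the last window, compute how long the next intercepted call should wait."""

    def __init__(self, max_utilization: float, window_s: float = 0.1):
        if not 0.0 < max_utilization <= 1.0:
            raise ValueError("max_utilization must be in (0, 1]")
        self.max_utilization = max_utilization
        self.window_s = window_s

    def delay_for(self, observed_utilization: float) -> float:
        if observed_utilization <= self.max_utilization:
            return 0.0
        # Stretch the window so that busy_time / (window + delay) equals the cap.
        busy_s = observed_utilization * self.window_s
        return busy_s / self.max_utilization - self.window_s
```

Calling `apply_cpu_limit` periodically with the fraction produced by the resource-allocation step would approximate the periodic adjustment described above.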

我们设计了卫星COTS对齐器 (SCA),以根据离线建模的真实卫星COTS芯片(§4.3)和热模型(§4.2)给出的当前温度,在仿真的卫星条件下动态对齐仿真COTS硬件的性能。它实时调整资源分配,以确保仿真环境能准确反映卫星硬件的行为。

资源分配

在被调用时,SCA使用预构建模型的输出来将当前芯片温度映射到应用程序的执行延迟。如图2所示, SCA首先检索由热模型计算出的当前芯片温度(例如,15℃)。然后,它在satPerfTable(卫星性能表)中查找由计算模型确定的相应执行延迟(例如,3.0秒)。利用这个预期延迟,SCA查询emuPerfTable(仿真性能表)以确定容器中所需的资源分配(例如,90%),从而确保精确的性能复现。

动态资源调优

根据确定的资源分配方案,SCA调整容器的CPU/GPU份额以匹配卫星的预期性能。这种调整基于一个假设:卫星COTS硬件的计算能力弱于地面服务器(例如,卫星的CPU弱于服务器的CPU,其NPU/GPU也弱于服务器的GPU)。因此,我们可以通过限制地面服务器的CPU和GPU资源来降低其运行速度,使其性能接近卫星COTS硬件的水平。

  • 对于CPU资源调优:

    • SCA通过将进程添加到一个特定的cgroup中,并周期性地调整cgroup配置的CPU份额值来实现,从而人为地精确控制延迟
  • 对于GPU调整:

    • SCA通过拦截CUDA驱动程序API调用和监控活动来动态管理和限制GPU性能
    • Rhone初始化一个速率限制管理器,该管理器从配置文件中读取GPU性能阈值
    • 在每次CUDA调用期间,一个速率限制器会检查GPU使用情况是否超过这些阈值,并在超过限制时施加延迟

5.3 Satellite Network Aligner

Rhone utilizes Docker networks to emulate complex networking scenarios, including variable latency, fluctuating bandwidth, and dynamic network topologies. The Satellite Network Aligner (SNA) builds on StarryNet [43] to manage the Docker network, aligning its configuration, and thus its performance, with the orbit and network model (§4.4). Moreover, we introduce two features beyond StarryNet, enabling a more accurate and realistic emulation of SCNs.

Granular satellite node representation. In StarryNet, each satellite node is represented by a single container. This oversimplifies the satellite’s structure, because a satellite typically consists of multiple COTS devices communicating with each other. In Rhone, we represent each satellite node with multiple containers (§5.1), each representing a COTS computing device, thus enabling granular network emulation involving internal communications between the multiple SoCs within a satellite. Rhone also allows users to designate a certain container to represent the on-board computer (OBC), which sends/receives data to/from outside the satellite, and communicates with multiple COTS chips.

Model-based node failure. StarryNet uses a random approach to emulate node and link failures. In Rhone, the SNA takes a more controlled approach, emulating node failures, shutdowns, and other behaviors based on the detailed power and thermal models (§4.2). Rather than randomly disrupting inter-satellite communications, Rhone emulates the consequences of power shortages, overheating (e.g., shutting down the COTS devices at a low battery level or a high temperature), and other operational challenges, resulting in a more realistic and reliable network emulation.
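A minimal sketch of such model-driven failures is shown below. It assumes the power and thermal models expose a battery level and a chip temperature per node. The threshold values and all names are placeholders, not values from the paper.

```python
from dataclasses import dataclass

MIN_BATTERY_FRACTION = 0.2  # hypothetical shutdown threshold
MAX_CHIP_TEMP_C = 85.0      # hypothetical shutdown threshold


@dataclass
class NodeState:
    battery_fraction: float  # 0.0 .. 1.0, as reported by the power model
    chip_temp_c: float       # as reported by the thermal model


def should_shut_down(state):
    """A node goes down when the battery is too low or the chip is too hot."""
    return (state.battery_fraction < MIN_BATTERY_FRACTION
            or state.chip_temp_c > MAX_CHIP_TEMP_C)


def active_links(links, states):
    """Keep only the links whose two endpoints are both still powered on."""
    down = {name for name, s in states.items() if should_shut_down(s)}
    return [(a, b) for a, b in links if a not in down and b not in down]
```

In this sketch, link loss follows from node state instead of from a random draw, which is the contrast with StarryNet that the paragraph describes.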

5.4 User Interfaces

Rhone provides two types of runtime APIs for user interaction: Monitor APIs and Command APIs.

Monitor APIs allow users to track the real-time status of satellite nodes and networks. This includes monitoring physical states such as temperature and energy levels, the load on COTS computing hardware, and the status of satellite network links (e.g., routing tables and connection states).

Command APIs offer users the capability to dynamically modify the state of the satellite nodes and the constellation. Through these, users can change operational modes, modify the connectivity of inter-satellite links or satellite-to-ground links, and initiate specific tasks on a satellite, such as running an application on its onboard COTS hardware.
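The split between the two API families can be illustrated with a toy in-memory session. This is not Rhone's actual interface: the paper does not list method names or signatures, so every name below is hypothetical.

```python
class EmulatorSession:
    """In-memory stand-in for a running emulation (hypothetical, for illustration)."""

    def __init__(self):
        self._nodes = {}  # node name -> state dict
        self._links = {}  # frozenset({a, b}) -> True if the link is up

    def add_node(self, name, temperature_c, energy_wh):
        self._nodes[name] = {
            "temperature_c": temperature_c,
            "energy_wh": energy_wh,
            "mode": "nominal",
            "tasks": [],
        }

    # Monitor-style calls: read-only views of node and link state.
    def node_status(self, name):
        state = self._nodes[name]
        return {**state, "tasks": list(state["tasks"])}

    def link_up(self, a, b):
        return self._links.get(frozenset((a, b)), False)

    # Command-style calls: change node or constellation state.
    def set_mode(self, name, mode):
        self._nodes[name]["mode"] = mode

    def set_link(self, a, b, up):
        self._links[frozenset((a, b))] = up

    def run_task(self, name, task):
        self._nodes[name]["tasks"].append(task)
```

Monitor-style calls return copies so that callers cannot change emulator state by accident. All state changes go through the command-style calls.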

Discussion

Model extensibility. While some models in Rhone are built from telemetry and experimental data from two COTS devices, Rhone allows users to replace or extend models as needed. To extend the existing power or thermal models, users can apply the modeling methods provided by Rhone to telemetry data from other satellites, or determine the model parameters accordingly. Users can also apply the hardware-in-the-loop chip mirroring method from the computation model to incorporate their own COTS devices.

Other environmental context effects on SCN. Currently, Rhone constructs the energy and thermal context for SCNs. Additional system effects on COTS devices, such as radiation failures [34], collision avoidance [78], and physical-layer challenges and security concerns [79], can also impact SCN performance; these are left for future work.

Related Work

In this section, we briefly introduce SCN-related research.

The network community is currently focusing on system architecture [80], topology design [81], routing [82, 83], transport-layer congestion control [84], security issues [28], real-world measurements [85], traffic scheduling [72], and on-the-ground emulators [38, 39, 40, 41, 66, 45, 43] for emerging satellite networks.

The mobile computing community focuses on verifying the feasibility of satellite image processing applications [86, 87, 88, 89] and on task allocation and resource scheduling algorithms and system design [73, 45, 14, 74, 75, 76].

As satellite networking and computing converge, recent studies investigate SCN applications including real-time EO systems [15], cross-continent satellite video conferencing systems [21], and satellite in-network computing [22, 23]. For example, Li et al. [22] explore the feasibility of pushing mobile core network functions to LEO satellite mega-constellations to avoid signaling storms and improve network efficiency.

Xing et al. [2] conduct a comprehensive in-orbit measurement of satellite COTS computing systems, revealing that energy and thermal contexts in real satellite environments impact satellite behavior and application performance.

Rhone serves as a comprehensive emulator that accounts for environmental impacts on both the computing and the networking performance of onboard chips, and on which SCN-related research can be conducted.

Conclusion

The growth of SCNs introduces unique characteristics and challenges that should be carefully considered when designing, building, and evaluating SCN applications. This calls for accurate, scalable, and easy-to-use development and evaluation tools. In this paper, we introduce Rhone, a full-fledged emulator for SCNs with high satellite- and constellation-level fidelity, high scalability, and high usability. It builds models from real in-orbit satellite telemetry and coordinates the emulation in a Docker container-based environment. Crucially, Rhone provides users with satellite and constellation states (e.g., power, temperature, and constellation topology) in real time, and dynamically aligns emulated computing and networking performance with real satellite COTS devices. It enables researchers on Earth to reliably develop, evaluate, and optimize SCN applications such as satellite earth observation or satellite-backboned global networking without access to a real in-orbit satellite. We evaluate Rhone's fidelity using real in-orbit satellite telemetry data and demonstrate its effectiveness in emulating typical SCN applications as use cases. We believe Rhone will help advance future SCN research, fostering the development of more resilient and efficient space-based systems.
