Performance Evaluation

In this section, we implement an SEC testbed and evaluate PHOENIX to demonstrate its effectiveness. First, we compare the depth of discharge (DoD) of satellite batteries and the task completion time against other state-of-the-art strategies. Second, we examine performance across the four seasons, since the earth's revolution around the sun affects the sunlit ratio defined in §III-A. Third, we apply the strategies to different constellations to explore the impact of constellation parameters. Finally, we set up various workloads, varying the processing capability and the task type, to illustrate the robustness of PHOENIX.

A. Environment Setup

Prototype implementation. We build a data-driven hardware-in-the-loop SEC testbed based on StarryNet [28], a recent container-based satellite network emulator. The SEC environment is deployed on a Dell Precision 7920 Tower workstation connected to a Jetson AGX Orin Developer Kit, as shown in Fig. 6. The Developer Kit acts as an SEC node, running machine learning models to measure task completion time and energy consumption. Based on the +Grid [26] structure and the trajectory of the satellite simulated by the Developer Kit, the node dynamically establishes virtual links with its adjacent satellites and ground facilities.

LEO constellation settings. We conduct extensive simulations driven by real-world information, covering two kinds of LEO constellations: an inclined-orbit constellation (Starlink [29]) and a polar constellation (OneWeb [30]). Following [12], we use the ground station locations collected by SatNOGS [31]. For computation, we set the power level of the Jetson AGX Orin Developer Kit to 30 W, 50 W, or 60 W, providing different computation capabilities. Based on existing hardware [32], [33], we set the power of the GSL and ISL to 16 W and 10 W, respectively. Following [7], we set the basic power to 4 W. We set the power generated by the solar panels to 120 W and the battery capacity to 60 Wh, which provides sufficient energy. For transmission, we set the capacity of the GSL and ISL to 100 Mbps and 1 Gbps, respectively.
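
To make the power settings above concrete, the following minimal sketch tallies the load of one satellite and the DoD it would reach if that load ran on battery alone. It is an illustration under stated assumptions, not the testbed's energy model: the class and method names are hypothetical, DoD is taken as drawn energy divided by battery capacity, the battery is assumed full when the satellite enters shadow, and the 30-minute shadow duration in the usage note is an arbitrary example value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PowerBudget:
    """Power settings from the setup above; all names are illustrative."""
    solar_w: float = 120.0    # power generated by the solar panels
    battery_wh: float = 60.0  # battery capacity
    base_w: float = 4.0       # basic power
    gsl_w: float = 16.0       # ground-satellite link power
    isl_w: float = 10.0       # inter-satellite link power

    def load_w(self, compute_w: float, use_gsl: bool = False,
               use_isl: bool = False) -> float:
        """Total power drawn for a given compute level and active links."""
        load = self.base_w + compute_w
        if use_gsl:
            load += self.gsl_w
        if use_isl:
            load += self.isl_w
        return load

    def dod_after_shadow(self, load_w: float, shadow_minutes: float) -> float:
        """Assumed DoD: energy drawn from a full battery over the shadow
        period, divided by capacity, capped at 1.0."""
        drawn_wh = load_w * shadow_minutes / 60.0
        return min(drawn_wh / self.battery_wh, 1.0)


budget = PowerBudget()
peak_load = budget.load_w(60.0, use_gsl=True, use_isl=True)
```

With these values, `peak_load` is 90 W, below the 120 W solar supply, which is consistent with the statement that the panels offer sufficient energy in sunlight. Running the same load for a hypothetical 30-minute shadow period gives `budget.dod_after_shadow(peak_load, 30)` equal to 0.75.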

SEC tasks and datasets. We select ship detection [1], [19] and wildfire segmentation [34] as the SEC tasks. For ship detection, we apply YOLO [35] to dataset [36] and select the Atlantic Ocean as the RoI. Satellites capture one image per second at a resolution of 10K × 10K pixels [8]. Each image is expected to be processed within 5 minutes [37]. Based on our measurements, the processing time of an image for ship detection is 10 s, 5 s, and 3 s at 30 W, 50 W, and 60 W, respectively. For wildfire segmentation, we apply U-Net [38] to dataset [39] and select the Amazon Rainforest as the RoI. The imaging interval is set to 5 seconds, and the processing time of an image is 120 s, 67 s, and 51 s at 30 W, 50 W, and 60 W, respectively. The other parameters are the same as in the ship detection task.
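
The workload figures above imply that a single node cannot keep pace with the imaging rate on its own. The sketch below works through that arithmetic and, separately, estimates how long one raw image would occupy a 100 Mbps GSL. The function names are hypothetical, and the downlink estimate rests on two assumptions that are not stated in the setup: "10K" is read as 10,000 pixels, and each pixel is taken as 3 uncompressed bytes.

```python
import math


def min_parallel_workers(proc_time_s: float, imaging_interval_s: float) -> int:
    """Smallest number of nodes working in parallel that can process
    images as fast as they are captured."""
    return math.ceil(proc_time_s / imaging_interval_s)


def downlink_time_s(width_px: int, height_px: int,
                    bytes_per_px: int, link_mbps: float) -> float:
    """Seconds needed to send one raw image over a link of the given rate."""
    bits = width_px * height_px * bytes_per_px * 8
    return bits / (link_mbps * 1e6)


# Ship detection: one image per second, 10 s / 5 s / 3 s per image.
ship_workers = [min_parallel_workers(t, 1) for t in (10, 5, 3)]
# Wildfire segmentation: one image per 5 s, 120 s / 67 s / 51 s per image.
fire_workers = [min_parallel_workers(t, 5) for t in (120, 67, 51)]
# Assumed image size: 10,000 x 10,000 pixels, 3 bytes per pixel.
raw_downlink_s = downlink_time_s(10_000, 10_000, 3, 100)
```

Here `ship_workers` is `[10, 5, 3]` and `fire_workers` is `[24, 14, 11]`. Under the assumed image size, `raw_downlink_s` is 24 seconds per image against a capture rate of one image per second, which gives a sense of why offloading raw images to the ground can be limited by the downlink.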

Comparison objects and metrics. We implement three state-of-the-art offloading schemes for comparison: (i) OEC [7], which processes tasks with an intra-orbit pipeline; (ii) MHSPO [9], an energy-efficient satellite peer offloading scheme; and (iii) L2D2 [12], which offloads tasks to geo-distributed ground stations. We regard L2D2 as the baseline because it consumes no computation energy and thus has the lowest energy consumption. We use DoD, battery lifetime, and task completion time as metrics to assess the effectiveness of PHOENIX.

B. DoD and Task Completion Time Comparison

We first compare performance under the Starlink constellation configuration, using the 60 W power level for ship detection. As shown in Fig. 7a, PHOENIX is close to L2D2 and reduces the maximum DoD by 54.8% compared with OEC and MHSPO. Note that some satellites have 0% DoD because they remain illuminated by the sun and never draw on battery energy. As tasks can be processed at ground stations, on sunlit satellites, or on shadowed satellites, Fig. 7b plots the percentage of scheduling decisions (PSD), which indicates the share of each of the three node types selected to process tasks. PHOENIX outperforms the other on-board computation strategies in two respects: (i) it cooperatively exploits the communication capability of ground stations and the on-board computation capability of satellites; (ii) it exploits sunlit satellites to process tasks (99.1% of tasks are processed on sunlit satellites) while reducing the load on shadowed satellites to save battery energy (only 0.2% of tasks are processed on shadowed satellites). As shown in Fig. 7c, PHOENIX satisfies the deadline requirement, whereas OEC, MHSPO, and L2D2 may miss deadlines. Because L2D2 offloads tasks directly to ground stations without on-board computing, it consumes the least battery energy, but its task completion time is very high due to the downlink bottleneck. Overall, PHOENIX not only reduces the DoD of batteries, achieving near-optimal performance, but also satisfies the deadline requirement.
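
As an illustration of how a PSD such as the one in Fig. 7b can be tabulated, the sketch below counts scheduling decisions by node type and converts them to percentages. The function name, the label strings, and the decision log are hypothetical; the log is synthetic and merely chosen so that the sunlit and shadowed shares match the reported 99.1% and 0.2%, with the remainder assigned to ground stations.

```python
from collections import Counter

NODE_TYPES = ("ground", "sunlit", "shadowed")


def scheduling_percentages(decisions):
    """Return the percentage of decisions that selected each node type."""
    counts = Counter(decisions)
    unknown = set(counts) - set(NODE_TYPES)
    if unknown:
        raise ValueError(f"unknown node types: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no scheduling decisions given")
    return {t: 100.0 * counts[t] / total for t in NODE_TYPES}


# Synthetic log of 1000 decisions, not measured data.
log = ["sunlit"] * 991 + ["ground"] * 7 + ["shadowed"] * 2
psd = scheduling_percentages(log)
```

For this synthetic log, `psd` assigns about 99.1% to sunlit satellites, 0.7% to ground stations, and 0.2% to shadowed satellites.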

C. Impact of Seasons and Battery Lifetime Extension

As the earth revolves around the sun, the sunlit ratio of each satellite may change over the year. We therefore repeat the above experiments for each of the four seasons, comparing the average DoD in Fig. 8a and the estimated lifetime in Fig. 8b. As shown in Fig. 8a, PHOENIX is close to L2D2 and outperforms OEC/MHSPO in every season. We also find that the DoD of all strategies is lower in summer and winter than in spring and autumn. This is because some inclined orbits are nearly perpendicular to the sunlight in summer and winter, which yields a larger sunlit ratio. This phenomenon also indicates that sunlight is an important factor in energy optimization. Following the life model in [23], we estimate the battery lifetime of each satellite under the different strategies. As shown in Fig. 8b, PHOENIX is close to L2D2 and prolongs the satellite lifespan by a factor of up to 2.9×/5.3× compared with OEC/MHSPO. This is because PHOENIX is sunlight-aware and exploits sunlit satellites to process tasks, which significantly reduces battery energy consumption.

D. Impact of Constellation Parameters

The sunlit ratio also depends on the parameters of the constellation (e.g., inclination and altitude), as mentioned in §III-A. We simulate the OneWeb constellation in our testbed and compare the performance of the four strategies. As shown in Fig. 9a, PHOENIX is close to L2D2 and outperforms OEC/MHSPO by up to 15.7%/37.4% in average DoD. Unlike in the inclined-orbit case, the DoD in summer and winter is larger than that in spring and autumn. This is because polar orbits have a higher sunlit ratio in spring and autumn, the opposite of inclined orbits. Fig. 9b plots the lifetime of satellites and shows that PHOENIX achieves a lifetime 2.3×/4.4× that of OEC/MHSPO. Comparing PHOENIX across constellation configurations, OneWeb performs better than Starlink, with lower DoD (by 12.8% on average) and longer lifetime (1.9× on average), for two key reasons: (i) polar orbits experience longer sunlight duration than inclined orbits; (ii) the altitude of OneWeb is higher than that of Starlink, which yields a higher sunlit ratio.
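
The altitude effect in reason (ii) can be checked with a textbook geometric approximation. The sketch below is not the model of §III-A. It assumes a circular orbit and a cylindrical earth shadow, and it takes the beta angle (the angle between the orbital plane and the direction to the sun) as an input. The function name is hypothetical, and the altitudes of 550 km and 1200 km are nominal public values for Starlink and OneWeb assumed here for illustration, since the setup above does not state them.

```python
import math

EARTH_RADIUS_KM = 6371.0


def sunlit_ratio(altitude_km: float, beta_deg: float) -> float:
    """Fraction of a circular orbit spent in sunlight, cylindrical-shadow model."""
    r = EARTH_RADIUS_KM + altitude_km
    beta = math.radians(beta_deg)
    # If the orbit is tilted far enough from the sun line, it never
    # crosses the shadow cylinder and is sunlit all the time.
    if abs(math.sin(beta)) >= EARTH_RADIUS_KM / r:
        return 1.0
    # Half-angle of the eclipsed arc, expressed as a fraction of the orbit.
    x = math.sqrt(r * r - EARTH_RADIUS_KM ** 2) / (r * math.cos(beta))
    eclipse_fraction = math.acos(x) / math.pi
    return 1.0 - eclipse_fraction


low = sunlit_ratio(550.0, 0.0)    # assumed Starlink-like altitude
high = sunlit_ratio(1200.0, 0.0)  # assumed OneWeb-like altitude
```

At a beta angle of zero, the worst case for eclipse, the model gives a sunlit ratio of roughly 0.63 at 550 km and roughly 0.68 at 1200 km, so the higher orbit spends more of each revolution in sunlight. Increasing the beta angle raises the ratio further, and beyond a threshold the orbit is continuously sunlit, which matches the earlier observation that some satellites show 0% DoD.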

E. Impact of Various Capabilities and Workloads

Finally, we vary the power of the Developer Kit from 30 W to 60 W and add the wildfire segmentation task to explore performance under various computation capabilities and workloads. TABLE I shows the average DoD of each strategy under the different power levels and workloads. The results show that PHOENIX is close to L2D2 and outperforms the other strategies under all computation capabilities and workloads, which indicates the robustness of PHOENIX. The DoD of L2D2 remains unchanged because it performs no on-board processing. When the power increases, the DoD of PHOENIX becomes smaller, which is somewhat counterintuitive. This is because the processing time of tasks shortens as the computation capability increases. Energy is the product of power and time, so the energy consumed by a ship detection (wildfire segmentation) task is 300 J/250 J/180 J (3600 J/3350 J/3060 J) at 30 W/50 W/60 W, respectively. This suggests that even when the computation capability is raised, battery energy consumption can be reduced through proper task scheduling.
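
The per-task energy figures follow directly from multiplying each power level by the measured processing time reported in the environment setup. The short sketch below reproduces that arithmetic. The dictionary and function names are hypothetical, and the calculation covers computation power only, leaving out the basic power and link power.

```python
# Measured processing time per image in seconds, keyed by power level in watts.
PROCESSING_TIME_S = {
    "ship_detection": {30: 10, 50: 5, 60: 3},
    "wildfire_segmentation": {30: 120, 50: 67, 60: 51},
}


def task_energy_j(task: str, power_w: int) -> int:
    """Computation energy of one task in joules: power times processing time."""
    return power_w * PROCESSING_TIME_S[task][power_w]


ship_energy = [task_energy_j("ship_detection", p) for p in (30, 50, 60)]
fire_energy = [task_energy_j("wildfire_segmentation", p) for p in (30, 50, 60)]
```

This yields `[300, 250, 180]` for ship detection and `[3600, 3350, 3060]` for wildfire segmentation. In both tasks the processing time falls faster than the power rises, so the energy per task decreases as the power level increases.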
