Known Knowns and Unknowns: Near-realtime Earth Observation Via Query Bifurcation in Serval

Earth observation satellites in low Earth orbit are approaching near-continuous imaging of the Earth. Today, these satellites capture an image of every part of Earth every few hours. However, networking capabilities have not caught up and can introduce delays of a few hours to days in getting these images to Earth. While this delay is acceptable for delay-tolerant applications such as land cover mapping and crop type identification, it is unacceptable for latency-sensitive applications such as forest fire detection or disaster monitoring. We design Serval to enable near-realtime insights from Earth imagery for latency-sensitive applications despite the networking bottlenecks, by leveraging the emerging computational capabilities on satellites and ground stations. The key challenge for our work stems from the limited computational capabilities and power resources available on a satellite. We address this challenge by leveraging the predictability of satellite orbits to bifurcate computation across satellites and ground stations. We evaluate Serval using trace-driven simulations and hardware emulations on a dataset of ten million images captured by the Planet Dove constellation, which comprises nearly 200 satellites. Serval reduces the median end-to-end latency for high-priority queries from 71.71 hours (incurred by the state of the art) to 2 minutes, and the 90th-percentile latency from 149 hours to 47 minutes.

Introduction

Low Earth Orbit (LEO) satellites promise to deliver continuous, high-resolution imagery of the Earth through large constellations of low-cost CubeSats. These constellations, e.g., Planet Dove [50], deploy imaging sensors on CubeSats in low orbits nearly 500 kilometers above the Earth’s surface. Due to their low orbits and large constellation size, they can capture an image of every location on Earth multiple times per day. The imagery from these satellites is useful for many applications such as disaster monitoring [11, 18], precision agriculture [9, 40], disease modeling [14], climate monitoring [64], and financial analytics [62].

However, Earth observation constellations today cannot support many latency-sensitive applications because they suffer from a large latency, from a few hours to days, between an image capture and its availability to the end user [59]. For example, a fire department needs images of a wildfire within a few minutes to limit risks to human lives, forest ecosystems, and property. Such delays largely arise from a satellite’s data transfer process (see Fig. 1). Satellite imagery must be transported to ground stations on Earth, and from there to the cloud, where processing, storage, and insight generation can occur. This pipeline can incur a delay of hours and sometimes days, due to two factors: (a) orbital dynamics, as satellites have only intermittent access to ground stations on Earth (about ten minutes per contact, with 4-5 good-quality contacts per day); and (b) limited bandwidth from satellite to ground station, a consequence of the large distance between the satellite and Earth, which makes it nearly impossible to downlink all of a satellite's images during every contact.

This paper’s goal is to enable near-real-time insights from satellite imagery by reducing the time-to-insight to O(minutes). We do so by leveraging emerging compute capabilities on satellites and ground stations. This is feasible today due to the ongoing push to equip satellites with small amounts of compute, such as a Raspberry Pi or an NVIDIA Jetson. For example, in 2020, the European Space Agency (ESA) deployed a neural-network-based cloud detector on its Φ-Sat-1 mission [28]. Similarly, many recent proposals from industry [7, 46] and academia [60, 61] argue for colocating ground stations and data centers to reduce terrestrial networking delays and enable compute on ground stations. Our core idea is to leverage the emerging general-purpose computational capabilities available on satellites and ground stations to prioritize latency-sensitive images, such as those containing forest fires, while deprioritizing images that are less latency-sensitive.

To achieve this goal, we build Serval, a novel edge computing framework designed to derive near-real-time insights from LEO satellites. Serval allows multiple long-running queries to execute simultaneously on incoming satellite imagery. Given a query set, Serval intelligently distributes compute across the satellite, the ground station, and the datacenter. Similar to commodity products for satellite imagery analysis such as Planet Analytics [1], Serval represents each query as a logical intersection of a sequence of filters, e.g., “forest fires in California” is denoted as three sequential filters {California, forest, fire}, each of which involves geographical or statistical computation (e.g., a neural network). Different queries may have different latency sensitivity and compute requirements. Unlike past work [23], Serval does not discard any images, because new applications can emerge post-collection (e.g., historical data analysis or disaster management). For example, Planet [51], a leading Earth observation company, recently used its satellite imagery to retroactively track the origin and flight of a balloon that entered United States airspace [5], which would have been impossible had the images been discarded. Instead, Serval focuses on dynamically reordering image delivery to reduce end-to-end delays for latency-sensitive content.

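Why does dynamic reordering reduce end-to-end delay? Each ground-station contact can downlink only a fraction of the images a satellite has captured, so the order in which images are sent determines which ones arrive during the current contact and which ones wait hours for a later contact.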
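
A toy simulation can make the effect of reordering concrete. The sketch below is not Serval's scheduler: the `Image` class, the function name, the contact times, and the per-contact capacity are all hypothetical, chosen only to show how a capacity-limited downlink penalizes an urgent image that sits at the back of a first-in-first-out queue.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Image:
    capture_hour: float   # when the satellite captured the image
    high_priority: bool   # e.g. an on-board filter flagged it as urgent


def high_priority_latencies(images, contact_hours, images_per_contact, reorder):
    """Hours from capture to downlink for each high-priority image that is sent."""
    remaining = sorted(images, key=lambda im: im.capture_hour)
    pending = []      # on-board queue, oldest first
    latencies = []
    for t in sorted(contact_hours):
        # Everything captured before this contact joins the on-board queue.
        while remaining and remaining[0].capture_hour <= t:
            pending.append(remaining.pop(0))
        if reorder:
            # Stable sort: high-priority images move to the front,
            # capture order is preserved within each class.
            pending.sort(key=lambda im: not im.high_priority)
        # Only a limited number of images fit into one contact.
        sent, pending = pending[:images_per_contact], pending[images_per_contact:]
        latencies += [t - im.capture_hour for im in sent if im.high_priority]
    return latencies


# Nine routine images and one urgent image, all captured at hour 0.
images = [Image(0.0, False)] * 9 + [Image(0.0, True)]
contacts = [1, 6, 11, 16]   # hypothetical contact times, in hours

fifo = high_priority_latencies(images, contacts, 3, reorder=False)
prioritized = high_priority_latencies(images, contacts, 3, reorder=True)
print(fifo, prioritized)
```

In this toy setup the urgent image waits 16 hours under first-in-first-out delivery and 1 hour when it is moved to the front of the queue. No image is dropped in either case; the routine images are simply delivered later.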

The key challenges in Serval stem from the scale of satellite imagery and the limited compute capabilities available inside a LEO satellite. First, each satellite generates nearly a terabyte of data per day. Second, a satellite needs to perform the query compute on this data using its limited compute capacity. LEO satellites generally have small solar panels that generate limited power. For instance, the model used in [23] has a 7W solar panel, a large fraction of which is used for critical satellite functions, whereas a Jetson TX2 alone consumes 11.3W. Moreover, the solar panel generates no power when the satellite is on the dark side of the Earth, which further reduces the power supply. As a result, the onboard computer cannot be on all the time. Together, these constraints make it infeasible to process on board all the images that the satellite collects.
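
The power argument can be turned into a rough back-of-the-envelope bound. The sketch below uses the 7W panel and the 11.3W Jetson TX2 figures from the text; the share of power left over for compute and the sunlit fraction of the orbit are hypothetical placeholders, and the model ignores battery losses and peak-power limits.

```python
def max_compute_duty_cycle(panel_watts, compute_watts, share_for_compute, sunlit_fraction):
    """Upper bound on the fraction of time the on-board computer can run.

    Averages harvested power over an orbit and ignores battery losses.
    """
    avg_available_watts = panel_watts * share_for_compute * sunlit_fraction
    return min(1.0, avg_available_watts / compute_watts)


# The 7 W panel and the 11.3 W Jetson TX2 come from the text; the other
# two values are hypothetical placeholders.
duty = max_compute_duty_cycle(
    panel_watts=7.0,
    compute_watts=11.3,
    share_for_compute=0.5,   # assumption: half the power is left for compute
    sunlit_fraction=0.6,     # assumption: 60% of the orbit is in sunlight
)
print(f"computer can be on at most {duty:.0%} of the time")
```

Under these assumed values the computer could run less than a fifth of the time, and even with the entire panel output and permanent sunlight the bound stays below 1 because 7W is less than 11.3W.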

Serval’s key insight is based on our observation that a query is typically composed of two kinds of filters, distinguished by the rate of change of the data the filter pertains to. The data beneath some filters may change quite quickly; we call these dynamic filters. Examples include (the outline of) fires and (the position of) boats. The data beneath the second class of filters changes much more slowly; we call these glacial filters. Examples include forest identification and ocean and land boundaries, which rarely change within a day (or even weeks).

Serval bifurcates a query: it assigns the temporally static (glacial) parts of the query to spatially static entities (ground stations, the cloud), and the temporally dynamic parts of the query to spatially dynamic entities (satellites). Consider a query such as “forest fires in California”. Serval can decompose this query into identifying “California”, identifying “forests”, and identifying “fire”. In this set, “California” and “forests” are both glacial filters, while “fire” is a dynamic filter. Our key insight is that glacial filters can be pre-computed on the ground stations using stale imagery (e.g., a day-old image). Such glacial filter computation can be done even before an image is captured at the satellite, and the results can be conveyed to the satellite. Serval’s bifurcated approach has two advantages. First, because the glacial filter results are pre-computed on the ground, the LEO satellite only needs to compute the dynamic filters of a query. Second, the same glacial filter inferences can be reused by multiple satellites (single compute, multiple use). The glacial filter offload to ground stations is enabled by the predictability of satellite orbits and, as a result, the predictability of the geographical location and time of each image.
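
The bifurcation can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Serval's implementation: the `Filter` class, the tile dictionaries, and the three function names are hypothetical, and each filter is reduced to a plain predicate standing in for a geographic test or a neural network.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Filter:
    name: str
    glacial: bool                      # True if the underlying data changes slowly
    predicate: Callable[[dict], bool]  # stand-in for a geographic test or a model


def bifurcate(query: List[Filter]) -> Tuple[List[Filter], List[Filter]]:
    """Split a query into its glacial part and its dynamic part."""
    glacial = [f for f in query if f.glacial]
    dynamic = [f for f in query if not f.glacial]
    return glacial, dynamic


def precompute_on_ground(glacial: List[Filter], stale_tiles: Dict[str, dict]) -> Dict[str, bool]:
    """Evaluate glacial filters on stale imagery, keyed by predicted tile location."""
    return {tid: all(f.predicate(tile) for f in glacial) for tid, tile in stale_tiles.items()}


def run_on_satellite(dynamic: List[Filter], glacial_mask: Dict[str, bool],
                     fresh_tiles: Dict[str, dict]) -> List[str]:
    """Run only the dynamic filters, and only on tiles the ground marked relevant."""
    matches = []
    for tid, tile in fresh_tiles.items():
        # Simplification: tiles without a precomputed result are skipped.
        if not glacial_mask.get(tid, False):
            continue
        if all(f.predicate(tile) for f in dynamic):
            matches.append(tid)
    return matches


# "Forest fires in California" as three sequential filters.
query = [
    Filter("California", True, lambda t: t["region"] == "California"),
    Filter("forest", True, lambda t: t["land_cover"] == "forest"),
    Filter("fire", False, lambda t: t["fire"]),
]
stale = {  # day-old imagery available on the ground (hypothetical tiles)
    "A": {"region": "California", "land_cover": "forest"},
    "B": {"region": "California", "land_cover": "forest"},
    "C": {"region": "Nevada", "land_cover": "forest"},
}
fresh = {"A": {"fire": True}, "B": {"fire": False}, "C": {"fire": True}}

glacial, dynamic = bifurcate(query)
mask = precompute_on_ground(glacial, stale)    # done once, reusable by many satellites
matches = run_on_satellite(dynamic, mask, fresh)
print(matches)
```

Here the satellite never evaluates the "California" or "forest" filters, and it skips tile C entirely because the ground-side mask already ruled it out. The same `mask` could be uploaded to any satellite whose predicted ground track covers those tiles.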

For some filters that have to run in real-time, such as cloud detection, we can infer high-quality priors by additionally incorporating auxiliary information available at the ground station, e.g., weather forecast information. Cloud detection is an important component of RGB image analysis because clouds occlude useful information and must be rejected prior to processing. For instance, if the forecasted cloud cover is either very low or very high, Serval can skip the cloud detection step altogether aboard the satellite.
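
One way to picture this decision is as a simple threshold rule on the forecast. The sketch below is illustrative only: the `CloudPlan` names and the 0.1 and 0.9 thresholds are assumptions, not values taken from Serval.

```python
from enum import Enum


class CloudPlan(Enum):
    ASSUME_CLEAR = "skip detector, treat image as clear"
    ASSUME_CLOUDY = "skip detector, treat image as cloudy"
    RUN_DETECTOR = "run on-board cloud detector"


def plan_cloud_detection(forecast_cover: float, low: float = 0.1, high: float = 0.9) -> CloudPlan:
    """Choose the on-board action from forecast cloud cover in [0, 1].

    The thresholds are hypothetical; a real system would tune them
    against forecast accuracy.
    """
    if not 0.0 <= forecast_cover <= 1.0:
        raise ValueError("forecast_cover must be between 0 and 1")
    if forecast_cover <= low:
        return CloudPlan.ASSUME_CLEAR
    if forecast_cover >= high:
        return CloudPlan.ASSUME_CLOUDY
    return CloudPlan.RUN_DETECTOR


for cover in (0.02, 0.5, 0.97):
    print(cover, plan_cloud_detection(cover).name)
```

Only the uncertain middle range spends satellite compute on cloud detection; the confident extremes rely on the prior computed at the ground station.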

We evaluate Serval using a combination of trace-driven simulations and hardware emulation on two applications: ‘forest fires in California’ and ‘vessel counting at ports’. These applications represent opposite ends of a spectrum: the former outputs a set of images, while the latter outputs counts. We designed a LEO satellite simulator and evaluated Serval using real image traces collected from PlanetScope, a constellation of over 200 CubeSats launched by Planet Labs. Our paper is, to the best of our knowledge, the largest evaluation performed using data collected by a real operational satellite constellation. Specifically, our traces contain ten million images collected using 151 satellites across 20 days. We evaluate Serval using two different satellite configurations and two different ground station configurations.

Contributions: We summarize our contributions below:

• We present the first system that distributes compute across satellites and ground stations to deliver near-real-time insights from Earth observation satellites.

• We propose a new bifurcated query execution approach that offloads glacial (slowly-changing) filter computation to the ground to reduce both the computational load on satellites and end-to-end latency.

• To the best of our knowledge, we are the first to evaluate our system on a real-world raw and continuous trace collected by the world’s largest LEO satellite constellation for Earth observation.

• Our evaluation shows that the Serval scheduler reduces the median end-to-end latency for high-priority images from over 70 hours to 2 minutes (and the 90th-percentile latency from 149 hours to 47 minutes), while also improving detection accuracy and reducing satellite compute load by over 80%.
