Runtime Architecture

Overview

[Figure: overview of the runtime architecture]

Apiserver

The apiserver, as its name suggests, stores data as API objects (e.g., digi models) and exposes them via standard REST APIs. We reuse the k8s apiserver [10] because it is a natural fit for hosting attribute-value pairs and making them accessible over REST APIs. The apiserver stores the attribute-value pairs in a persistent key-value store [24]. In addition to the standard CRUD operations, it exposes an asynchronous Watch API that lets a client subscribe to changes in a model. Following Kubernetes’s convention, we refer to these APIs as verbs.
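To make the verb model concrete, the following is a minimal in-memory sketch of a store that offers CRUD-style verbs plus a watch subscription. It is not the Kubernetes apiserver or its client API: `ModelStore`, its method names, and the `(event type, name, value)` tuple are hypothetical, and callbacks run synchronously here, whereas the real Watch API delivers events asynchronously.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List, Tuple

Event = Tuple[str, str, Any]  # (event type, model name, model value); hypothetical format


class ModelStore:
    """Toy stand-in for an apiserver: a key-value store with CRUD verbs and watch."""

    def __init__(self) -> None:
        self._models: Dict[str, dict] = {}
        self._watchers: Dict[str, List[Callable[[Event], None]]] = defaultdict(list)

    def watch(self, name: str, callback: Callable[[Event], None]) -> None:
        # Subscribe to future changes of one model instead of polling it.
        self._watchers[name].append(callback)

    def _notify(self, kind: str, name: str, value: dict) -> None:
        for callback in list(self._watchers[name]):
            callback((kind, name, dict(value)))

    def create(self, name: str, model: dict) -> None:
        if name in self._models:
            raise KeyError(f"{name} already exists")
        self._models[name] = dict(model)
        self._notify("ADDED", name, model)

    def get(self, name: str) -> dict:
        return dict(self._models[name])

    def update(self, name: str, model: dict) -> None:
        if name not in self._models:
            raise KeyError(f"{name} does not exist")
        self._models[name] = dict(model)
        self._notify("MODIFIED", name, model)

    def delete(self, name: str) -> None:
        old = self._models.pop(name)
        self._notify("DELETED", name, old)
```

A subscriber registered with `store.watch("lamp", handler)` is invoked once per `create`, `update`, or `delete` on that model, which is the property the dSpace controllers rely on.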

Before serving or executing a verb request, the apiserver runs a series of checks on whether the request is valid: that its syntax and semantics are correct, that the caller has sufficient access rights, that the proposed changes do not violate composition rules, and so on. The last check is performed by dSpace’s admission webhook (part of the mounter dSpace controller), which is described below.
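The chain of checks can be pictured as a list of functions that each either pass a request or return a reason for rejecting it. This is only a sketch of the idea: the request fields (`verb`, `caller_rights`), the check functions, and `admit` are hypothetical and do not correspond to Kubernetes's actual admission code.

```python
from typing import Callable, List, Optional, Tuple

Request = dict
# A check returns an error message, or None when the request passes.
Check = Callable[[Request], Optional[str]]

KNOWN_VERBS = {"create", "get", "update", "delete", "watch"}


def check_verb(req: Request) -> Optional[str]:
    # Stand-in for the syntax/semantics checks.
    return None if req.get("verb") in KNOWN_VERBS else "unknown verb"


def check_access(req: Request) -> Optional[str]:
    # Stand-in for the access-rights check.
    allowed = req.get("caller_rights", ())
    return None if req.get("verb") in allowed else "insufficient access rights"


def admit(req: Request, checks: List[Check]) -> Tuple[bool, str]:
    """Run the checks in order and stop at the first failure."""
    for check in checks:
        error = check(req)
        if error is not None:
            return False, error
    return True, "admitted"
```

A composition-rule check would be appended to the same list, which mirrors how the webhook adds one more step to the admission process.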

dSpace Controllers

Kubernetes provides controllers such as the scheduler, deployment controller, and autoscaler to support container orchestration. While these controllers are sufficient for deploying digis, no existing controller or mechanism provides the support for composition and policy that dSpace aims to offer. Hence, we implemented a collection of dSpace controllers to fill this gap.

Mounter

When digi A is mounted to B, the mount controller (or mounter for short) synchronizes the states between their models at runtime, i.e., between the model of A (𝑀𝐴) and the model of B (𝑀𝐵). Briefly, this works as follows:

  1. When the mounter first sees a mount reference to A appear on 𝑀𝐵, it copies 𝑀𝐴 under the mount attribute of 𝑀𝐵. We refer to the copied-over 𝑀𝐴 as a model replica; it is stored as part of 𝑀𝐵 under A’s mount reference.
  2. When 𝑀𝐴 changes, the mounter syncs the updates to the model replica in 𝑀𝐵.
  3. When the model replica changes, the mounter syncs the updates to 𝑀𝐴.

Note that the mounter will not sync any updates on .status fields in the control attributes from the model replica to 𝑀𝐴, since status information should never flow southbound. It will, however, sync .intent updates from 𝑀𝐴 to the model replica to allow intent reconciliation.

Further, in dSpace, each model contains a version number that is incremented whenever the model is updated. The version number is also copied over to the model replica. dSpace ensures that the states in 𝑀𝐴 are not overwritten by outdated states from its parent, 𝑀𝐵. To do so, the mounter compares the version number of 𝑀𝐴 with that of 𝑀𝐴’s model replica (in 𝑀𝐵) and syncs the states of the replica only when the replica’s version number is no less than the version number of 𝑀𝐴. Finally, the mounter implements the rest of the mount semantics, yield and the mount modes (hide/expose), by syncing the states accordingly.
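The three sync steps and the version check can be sketched in a few lines. This is an illustration under assumptions, not dSpace's implementation: models are plain dicts with hypothetical `version`, `control`, and `mount` keys, and the choice to bump the child's version after a southbound sync is a guess about how "incremented whenever the model is updated" plays out.

```python
import copy


def mount(parent: dict, name: str, child: dict) -> None:
    """Step 1: copy the child's model under the parent's mount attribute."""
    parent.setdefault("mount", {})[name] = copy.deepcopy(child)


def sync_up(parent: dict, name: str, child: dict) -> None:
    """Step 2: the child changed, so refresh the replica (version included)."""
    parent["mount"][name] = copy.deepcopy(child)


def sync_down(parent: dict, name: str, child: dict) -> bool:
    """Step 3: the replica changed, so push its updates to the child.

    Returns True only if the child model was modified.
    """
    replica = parent["mount"][name]
    if replica["version"] < child["version"]:
        return False  # replica is stale; never overwrite newer child state
    changed = False
    for attr, fields in replica["control"].items():
        target = child["control"].setdefault(attr, {})
        for field, value in fields.items():
            if field == "status":
                continue  # status never flows southbound
            if target.get(field) != value:
                target[field] = value
                changed = True
    if changed:
        child["version"] += 1  # assumption: every applied update bumps the version
    return changed
```

After a successful `sync_down`, the child's version is ahead of the replica's, so further replica edits are rejected as stale until `sync_up` refreshes the replica.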

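Valid propagation directions for .status and .intent

Controlling the direction of state flow

Two key concepts are involved here:

  • The status field: describes the current, actual state
  • The intent field: describes the desired target state

Why the sync is one-way

Why status is not synced downward:

  • Status reflects the real state of device A and can only be produced and updated by A itself
  • Allowing status to be synced from the model replica to 𝑀𝐴 could leave the state information inconsistent or wrong

Why intent is synced upward:

  • Intent represents a control command and has to propagate from 𝑀𝐴 to the replica so that intents can be reconciled
  • This ensures that control commands are delivered and carried out correctly

An analogy

The design resembles a primary/secondary device control pattern:

  • The secondary device (A) reports its own status, which the primary device (B) cannot overwrite
  • The primary device (B) can issue control commands (intent) to change the secondary device's behavior

This one-way flow keeps the system state consistent and controllable.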

Personal note: this part feels rather abstract and is not easy to follow; it is worth rereading several times.

Syncer

The sync controller (syncer) tracks two given models, a source and a target, and syncs states between them. The sync information is recorded in a Sync API object stored on the apiserver. The syncer implements data-flow composition with pipe: whenever a user calls dq pipe A.output.x B.input.x, dq creates a Sync object with A.output.x as the source and B.input.x as the target.
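As a rough illustration of what a Sync object carries and how a syncer might act on it, the sketch below copies a value along a dotted path from source to target. The `Sync` dataclass, the `pipe` helper, and the nested-dict model layout are hypothetical; the real Sync API object and the dq implementation may look quite different.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Sync:
    source: str  # dotted path, e.g. "A.output.x"
    target: str  # dotted path, e.g. "B.input.x"


def pipe(source: str, target: str) -> Sync:
    """Hypothetical counterpart of `dq pipe SOURCE TARGET`."""
    return Sync(source=source, target=target)


def read(models: dict, path: str) -> Any:
    node = models
    for key in path.split("."):
        node = node[key]
    return node


def write(models: dict, path: str, value: Any) -> None:
    *parents, leaf = path.split(".")
    node = models
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value


def reconcile(models: dict, sync: Sync) -> bool:
    """Copy the source value to the target; return True if the target changed."""
    value = read(models, sync.source)
    try:
        if read(models, sync.target) == value:
            return False  # already in sync, nothing to do
    except KeyError:
        pass  # target attribute does not exist yet
    write(models, sync.target, value)
    return True
```

In an event-driven setup, `reconcile` would be called from the watch handler for the source model rather than on a timer.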

Policer

Akin to the syncer, the policy controller (policer) watches all Policy objects (e.g., mount and yield policies), each of which consists of a policy statement and the digis involved in the policy. The policer watches for changes on these digis and enforces the policy when any of its conditions is triggered. Note that the mounter, syncer, and policer all subscribe to model changes via Kubernetes’s Watch API [39] rather than constantly polling for updates.
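The event-driven shape of the policer can be sketched as follows. The `Policy` fields (`digis`, `condition`, `action`) and the `on_change` entry point are hypothetical names for illustration; the text does not specify how dSpace encodes policy statements.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Models = Dict[str, dict]


@dataclass
class Policy:
    digis: List[str]                    # digis the policy involves
    condition: Callable[[Models], bool]  # when the policy should fire
    action: Callable[[Models], None]     # how the policy is enforced


class Policer:
    def __init__(self, models: Models) -> None:
        self.models = models
        self.policies: List[Policy] = []

    def add(self, policy: Policy) -> None:
        self.policies.append(policy)

    def on_change(self, digi: str) -> int:
        """Handle a watch event for `digi`; return how many policies were enforced."""
        fired = 0
        for policy in self.policies:
            # Only policies that involve the changed digi are re-evaluated.
            if digi in policy.digis and policy.condition(self.models):
                policy.action(self.models)
                fired += 1
        return fired
```

Because `on_change` is driven by watch events, an idle system does no policy evaluation at all, which is the benefit of subscribing instead of polling.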

Topology Webhook

An admission webhook [20] is a Kubernetes mechanism for extending the apiserver’s request admission process. When the apiserver receives a request, it forwards the request to a registered webhook, which decides whether to accept or reject it. dSpace leverages this mechanism and implements a topology webhook to enforce the multi-hierarchy and single-writer constraints (§3). The topology webhook tracks the latest state of the digi-graph and rejects any change (e.g., an invalid mount or pipe request) that would lead to an invalid digi-graph.
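The sketch below shows one way such a check could track the digi-graph and reject bad mount requests. It rests on two loudly stated assumptions, since §3 is not part of these notes: that a valid digi-graph must stay acyclic, and that single-writer means a digi has at most one parent holding write access while other parents must mount it as yielded. `TopologyChecker` and `admit_mount` are hypothetical names.

```python
from typing import Dict, Set, Tuple


class TopologyChecker:
    """Tracks mount edges and decides whether a new mount keeps the graph valid."""

    def __init__(self) -> None:
        self.children: Dict[str, Set[str]] = {}  # parent -> mounted children
        self.writer: Dict[str, str] = {}          # child -> parent with write access

    def _reaches(self, start: str, goal: str) -> bool:
        # Depth-first search along parent -> child edges.
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == goal:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.children.get(node, ()))
        return False

    def admit_mount(self, child: str, parent: str, yielded: bool = False) -> Tuple[bool, str]:
        """Return (allowed, reason); record the edge only when allowed."""
        # Assumption: if the child already reaches the parent, the new edge closes a cycle.
        if self._reaches(child, parent):
            return False, "mount would create a cycle"
        # Assumption: single-writer means one non-yielded parent per child.
        if not yielded and child in self.writer:
            return False, "child already has an active writer"
        self.children.setdefault(parent, set()).add(child)
        if not yielded:
            self.writer[child] = parent
        return True, "ok"
```

A rejected request leaves the tracked graph untouched, so the checker's view always corresponds to a valid digi-graph.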

Implementation

Our implementation comprises ≈10.4K lines of code (LoC): 57% in Go for the runtime, 24% in Go for the dq command line (including code generators and digi image support), and 19% in Python for the driver library (built on top of kopf, an open-source k8s operator framework [33]). All dSpace controllers and webhooks use standard APIs to interact with the Kubernetes control plane. All digis, dSpace controllers, and policies can be created and/or composed declaratively via standard Kubernetes configuration (YAML), using either its command line kubectl or dq, which provides complementary commands and shortcuts such as run, mount, yield, pipe, build, alias, push, and pull to simplify run-time operations and avoid configuration file sprawl [30]. Currently we provide only a Python front-end for driver programming. We expect to add higher-level UI/UX support and support for other driver languages in the future.

For system security, we reuse k8s’s access control mechanisms (Service Accounts and its RBAC module [47]). An implication of our k8s-based implementation is that we inherit many of its desirable features with respect to fault handling (e.g., automatic pod restart), availability, persistence, application delivery, CI/CD (digi image), and configuration management (kustomize, kubectl). Finally, between leaf digis and the physical devices, we preserve vendor-specific security mechanisms such as device-level authentication and encryption. dSpace is open source, and additional detail about our implementation can be found in our code repository.
