
Mininet: A Network in a Laptop - Rapid Prototyping for Software-Defined Networks


A detailed look at Mininet, a network emulation platform built on lightweight OS-level virtualization, designed to rapidly create, interact with, and share large-scale software-defined network (SDN) prototypes on a single laptop.

(1) Core Goals and Pain Points

The paper observes that existing network prototyping environments are polarized:

  • Hardware testbeds: expensive and hard to access
  • Simulators (e.g. ns-2): lack realism (simulator code cannot be ported directly to hardware) and lack interactivity
  • Virtual machines (VMs): too resource-hungry, limiting network scale (a single laptop can run only a handful of nodes)

Mininet aims to provide an environment that combines:

  • Flexible: new topologies and new functionality defined in software
  • Deployable: prototype code runs unmodified on real hardware
  • Interactive: the network can be managed and run in real time
  • Scalable: hundreds or even thousands of nodes on one laptop
  • Realistic: runs real applications and protocol stacks
  • Share-able: easy to package and distribute to collaborators

(2) Architecture

Mininet achieves scale through the lightweight, OS-level virtualization features built into Linux:

  • Hosts:
    • Linux network namespaces
    • Each host is simply a shell process running in its own namespace, with its own interfaces, ports, and routing tables
  • Links:
    • Virtual Ethernet pairs (veth pairs)
    • They act like wires connecting two virtual interfaces, delivering packets from one end to the other
  • Switches:
    • Software OpenFlow switches (e.g. Open vSwitch)
    • Available in user-space or kernel-space variants
  • Controllers:
    • Can run anywhere (inside a VM, on the host machine, or in the cloud), as long as they have IP connectivity to the switches

(3) Performance and Scalability

  • Startup time: even a 1024-host topology can be launched (the Tree(32,2) topology takes about 817.8 s; smaller topologies start in seconds)
  • Resource overhead: about 1 MB of memory per host, versus roughly 70 MB for a conventional VM
  • Bandwidth: 2-3 Gbps through a single switch; over 10 Gbps aggregate through a chain of switches
  • Limitations: performance fidelity under heavy load (CPU scheduling offers no strict timing guarantees) and, for now, single-machine operation only

(4) Workflow

Mininet covers the complete lifecycle from prototype to deployment:

  1. Creating: define topologies quickly with the command-line tool (mn) or the Python API (e.g. tree structures)
  2. Interacting: real-time control via the CLI, e.g. h2 ping h3, with runtime fault injection and link changes
  3. Customizing: script custom topologies and test logic through the Python API
  4. Sharing: package the whole system as a VM image for distribution, enabling "runnable papers"
  5. Running on Hardware: virtual components map directly onto physical ones; control-plane code is unchanged

Introduction

Inspiration hits late one night and you arrive at a world-changing idea: a new network architecture, address scheme, mobility protocol, or a feature to add to a router. With a paper deadline approaching, you have a laptop and three months. What prototyping environment should you use to evaluate your idea? With this question in mind, we set out to create a prototyping workflow with the following attributes:

Flexible: new topologies and new functionality should be defined in software, using familiar languages and operating systems.

Deployable: deploying a functionally correct prototype on hardware-based networks and testbeds should require no changes to code or configuration.

Interactive: managing and running the network should occur in real time, as if interacting with a real network.

Scalable: the prototyping environment should scale to networks with hundreds or thousands of switches on only a laptop.

Realistic: prototype behavior should represent real behavior with a high degree of confidence; for example, applications and protocol stacks should be usable without modification.

Shareable: self-contained prototypes should be easily shared with collaborators, who can then run and modify our experiments.


The currently available prototyping environments have their pros and cons. Special-purpose testbeds are expensive and beyond the reach of most researchers. Simulators, such as ns-2 [14] or Opnet [19], are appealing because they can run on a laptop, but they lack realism: the code created in the simulator is not the same code that would be deployed in the real network, and they are not interactive. At first glance, a network of virtual machines (VMs) is appealing. With a VM per switch/router, and a VM per host, realistic topologies can easily be stitched together using virtual interfaces [13, 17, 15]. Our experience is that VMs are too heavyweight: the memory overhead for each VM limits the scale to just a handful of switches and hosts. We want something more scalable.

There are efforts underway to build programmable testbeds (e.g. Emulab [9], VINI [1], GENI [6], FIRE [5]) supporting realistic user traffic, at scale, and with interactive behavior. Our approach is complementary to these systems. We seek a local environment that allows us to quickly implement a functionally correct, well-understood prototype, then directly move it onto shared global infrastructure.


Mininet – the new prototyping environment described in this paper – supports this workflow by using lightweight virtualization. Users can implement a new network feature or entirely new architecture, test it on large topologies with application traffic, and then deploy the exact same code and test scripts into a real production network. Mininet runs surprisingly well on a single laptop by leveraging Linux features (processes and virtual Ethernet pairs in network namespaces) to launch networks with gigabits of bandwidth and hundreds of nodes (switches, hosts, and controllers). The entire network can be packaged as a VM, so that others can download, run, examine and modify it.

Mininet is far from perfect – performance fidelity and multi-machine support could be improved – but these are limitations of the implementation, not the approach. Other tools also use lightweight virtualization [9, 23] (see Section 7 for a comparison), but Mininet differs in its support for rapidly prototyping Software-Defined Networks, a use case we focus on throughout this paper.


Software-defined Networks

In an SDN, the control plane (or "network OS") is separated from the forwarding plane. Typically, the network OS (e.g. NOX [8], ONIX [10], or Beacon [2]) observes and controls the entire network state from a central vantage point, hosting features such as routing protocols, access control, network virtualization, energy management, and new prototype features. The network OS controls the forwarding plane via a narrow, vendor-agnostic interface, such as OpenFlow [18], which defines the low-level forwarding behavior of each forwarding element (switch, router, access point, or base station). For example, OpenFlow defines a rule for each flow; if a packet matches a rule, the corresponding actions are performed (e.g. drop, forward, modify, or enqueue).
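
The match-then-act behavior above can be modeled as a toy flow table: an ordered list of match/action rules where the first matching rule wins and a miss is sent to the controller. This is a simplified sketch, not OpenFlow's actual wire format; the field names and actions are invented for illustration.

```python
# Toy model of OpenFlow-style rule matching (field names and actions are
# simplified assumptions, not the real OpenFlow protocol structures).
def lookup(flow_table, packet):
    for match, action in flow_table:
        # A rule matches if every specified field agrees;
        # absent fields act as wildcards.
        if all(packet.get(k) == v for k, v in match.items()):
            return action
    return 'send-to-controller'  # table miss goes up to the network OS

table = [
    ({'ip_dst': '10.0.0.2', 'tcp_dst': 80}, 'forward:port2'),
    ({'ip_dst': '10.0.0.3'}, 'drop'),
]

print(lookup(table, {'ip_dst': '10.0.0.2', 'tcp_dst': 80}))  # forward:port2
print(lookup(table, {'ip_dst': '10.0.0.3', 'tcp_dst': 22}))  # drop
print(lookup(table, {'ip_dst': '10.0.0.9'}))                 # send-to-controller
```

The second rule omits tcp_dst, so it behaves as a wildcard match on any packet to 10.0.0.3.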

The main consequence of SDN is that the functionality of the network is defined after it has been deployed, under the control of the network owner and operator. New features can be added in software, without modifying the switches, allowing the behavior to evolve at software speeds, rather than at standards-body speed. SDN enables new approaches to state management (anywhere on the spectrum from centralized to distributed) and new uses of packet headers (fields with layer-specific processing become a layer-less sea of bits). Examples of software-defined networks include 4D [7], Ethane [4], PortLand [12], and FlowVisor [22].

These examples hint at the potential of SDN, but we feel that a rapid prototyping workflow is a key to unlocking the full potential of software-defined networking. The variety of systems prototyped on Mininet supports this assertion, and we describe several such case studies in Section 6.


Mininet Workflow

By combining lightweight virtualization with an extensible CLI and API, Mininet provides a rapid prototyping workflow to create, interact with, customize and share a software-defined network, as well as a smooth path to running on real hardware.


3.1 Creating a Network

The first step is to launch a network using the mn command-line tool. For example, the command

```bash
mn --switch ovsk --controller nox --topo \
    tree,depth=2,fanout=8 --test pingAll
```

starts a network of OpenFlow switches. In this example, Open vSwitch [20] kernel switches are connected in a tree topology of depth 2 and fanout 8 (i.e. 9 switches and 64 hosts), under the control of NOX, followed by the pingAll test to check connectivity between every pair of nodes. To create this network, Mininet emulates links, hosts, switches, and controllers. Mininet uses the lightweight virtualization mechanisms built into the Linux OS: processes running in network namespaces, and virtual Ethernet pairs.
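
The node counts quoted above follow directly from the tree parameters, and can be sanity-checked with a few lines of plain Python (independent of Mininet itself):

```python
# Node counts for a tree topology: hosts are the leaves, switches are
# the interior nodes of the tree.
def tree_counts(depth, fanout):
    hosts = fanout ** depth
    switches = sum(fanout ** i for i in range(depth))
    return switches, hosts

switches, hosts = tree_counts(depth=2, fanout=8)
print(switches, hosts)      # 9 64

# pingAll issues one ping per ordered host pair:
print(hosts * (hosts - 1))  # 4032
```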

Links: A virtual Ethernet pair, or veth pair, acts like a wire connecting two virtual interfaces; packets sent through one interface are delivered to the other, and each interface appears as a fully functional Ethernet port to all system and application software. Veth pairs may be attached to virtual switches such as the Linux bridge or a software OpenFlow switch.

Hosts: Network namespaces [11] are containers for network state. They provide processes (and groups of processes) with exclusive ownership of interfaces, ports, and routing tables (such as ARP and IP). For example, two web servers in two network namespaces can coexist on one system, both listening to private eth0 interfaces on port 80.

A host in Mininet is simply a shell process (e.g. bash) moved into its own network namespace with the unshare(CLONE_NEWNET) system call. Each host has its own virtual Ethernet interface(s) (created and installed with ip link add/set) and a pipe to a parent Mininet process, mn, which sends commands and monitors output.
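
The shell-with-a-pipe mechanism can be sketched with ordinary subprocess plumbing. This reproduces only the pipe between mn and the host shell; the namespace step (the unshare(CLONE_NEWNET) call), which requires root, is deliberately omitted.

```python
import subprocess

# A Mininet host is essentially a shell process with a pipe back to the
# parent mn process. The real implementation additionally detaches the
# shell into its own network namespace via unshare(CLONE_NEWNET), which
# needs root privileges and is omitted in this sketch.
shell = subprocess.Popen(['bash'], stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE, text=True)

# The parent sends commands down the pipe and collects the output.
out, _ = shell.communicate('echo hello from h1\n')
print(out.strip())  # hello from h1
```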

Switches: Software OpenFlow switches provide the same packet delivery semantics that would be provided by a hardware switch. Both user-space and kernel-space switches are available.

Controllers: Controllers can be anywhere on the real or simulated network, as long as the machine on which the switches are running has IP-level connectivity to the controller. For Mininet running in a VM, the controller could run inside the VM, natively on the host machine, or in the cloud.

Figure 1 illustrates the components and connections in a two-host network created with Mininet.

[Figure 1: components and connections in a two-host network created with Mininet]

3.2 Interacting with a Network

After launching the network, we want to interact with it: to run commands on hosts, verify switch operation, and maybe induce failures or adjust link connectivity. Mininet includes a network-aware command line interface (CLI) to allow developers to control and manage an entire network from a single console. Since the CLI is aware of node names and network configuration, it can automatically substitute host IP addresses for host names. For example, the CLI command

```
mininet> h2 ping h3
```

tells host h2 to ping host h3's IP address. This command is piped to the bash process emulating host 2, causing an ICMP echo request to leave h2's private eth0 network interface and enter the kernel through a veth pair. The request is processed by a switch in the root namespace, then exits back out a different veth pair to the other host. If the packet needed to traverse multiple switches, it would stay in the kernel without additional copies; in the case of a user-space switch, the packet would incur user-space transitions on each hop. In addition to acting as a terminal multiplexer for hosts, the CLI provides a variety of built-in commands and can also evaluate Python expressions.
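
The name-to-address substitution is straightforward string rewriting over the known node names. A minimal sketch (the host table and addresses here are invented for illustration):

```python
# Minimal sketch of the CLI's name substitution: the first token selects
# the host to run on; any remaining token that names a host is replaced
# by that host's IP address. (Address assignments are illustrative.)
hosts = {'h2': '10.0.0.2', 'h3': '10.0.0.3'}

def translate(cli_line):
    target, *rest = cli_line.split()
    cmd = ' '.join(hosts.get(tok, tok) for tok in rest)
    return target, cmd  # run `cmd` inside `target`'s namespace

print(translate('h2 ping h3'))  # ('h2', 'ping 10.0.0.3')
```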


3.3 Customizing a Network

Mininet exports a Python API to create custom experiments, topologies, and node types: switch, controller, host, or other. A few lines of Python are sufficient to define a custom regression test that creates a network, executes commands on multiple nodes, and displays the results. An example script:

```python
from mininet.net import Mininet
from mininet.topolib import TreeTopo

tree4 = TreeTopo(depth=2, fanout=2)
net = Mininet(topo=tree4)
net.start()
h1, h4 = net.hosts[0], net.hosts[3]
print(h1.cmd('ping -c1 %s' % h4.IP()))
net.stop()
```

creates a small network (4 hosts, 3 switches) and pings one host from another, in about 4 seconds.

The current Mininet distribution includes several example applications, including text-based scripts and graphical applications, two of which are shown in figures 2 and 3. The hope is that the Mininet API will prove useful for system-level testing and experimentation, test network management, instructional materials, and applications that will surprise the authors.
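
In the same spirit as TreeTopo, a tree topology can be generated in a few lines of plain Python (no Mininet required; the node-naming scheme here is invented for illustration, not Mininet's actual one):

```python
# Generate a TreeTopo-like topology as lists of switches, hosts, and links.
def build_tree(depth, fanout):
    switches, hosts, links = [], [], []
    def add(parent, level):
        if level == depth:                 # leaves are hosts
            h = 'h%d' % (len(hosts) + 1)
            hosts.append(h)
            links.append((parent, h))
        else:                              # interior nodes are switches
            s = 's%d' % (len(switches) + 1)
            switches.append(s)
            if parent:
                links.append((parent, s))
            for _ in range(fanout):
                add(s, level + 1)
    add(None, 0)
    return switches, hosts, links

sw, ho, li = build_tree(2, 2)
print(len(sw), len(ho), len(li))  # 3 4 6
```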

[Figures 2 and 3: example text-based and graphical applications built on Mininet]

3.4 Sharing a Network

Mininet is distributed as a VM with all dependencies pre-installed, runnable on common virtual machine monitors such as VMware, Xen and VirtualBox. The virtual machine provides a convenient container for distribution; once a prototype has been developed, the VM image may be distributed to others to run, examine and modify. A complete, compressed Mininet VM is about 800 MB. Mininet can also be installed natively on Linux distributions that ship with CONFIG_NET_NS enabled, such as Ubuntu 10.04, without replacing the kernel.


3.5 Running on Hardware

To successfully port to hardware on the first try, every Mininet-emulated component must act in the same way as its corresponding physical one. The virtual topology should match the physical one; virtual Ethernet pairs must be replaced by link-level Ethernet connectivity. Hosts emulated as processes should be replaced by hosts with their own OS image. In addition, each emulated OpenFlow switch should be replaced by a physical one configured to point to the controller. However, the controller does not need to change. When Mininet is running, the controller "sees" a physical network of switches, made possible by an interface with well-defined state semantics. With proxy objects representing OpenFlow datapaths on physical switches and SSH servers on physical hosts, the CLI enables interaction with the network in the same way as before, with unmodified test scripts.


Scalability

Lightweight virtualization is the key to scaling to hundreds of nodes while preserving interactive performance. In this section, we measure overall topology creation times, available bandwidth, and micro-benchmarks for individual operations.

Table 2 shows the time required to create a variety of topologies with Mininet. Larger topologies which cannot fit in memory with system virtualization can start up on Mininet. In practice, waiting 10 seconds for a full fat tree to start is quite reasonable (and faster than the boot time for hardware switches).

Mininet scales to the large topologies shown (over 1000 hosts) because it virtualizes less and shares more. The file system, user ID space, process ID space, kernel, device drivers, shared libraries and other common code are shared between processes and managed by the operating system. The roughly 1 MB overhead for a host is the memory cost of a shell process and small network namespace state; this total is almost two orders of magnitude less than the 70 MB required per host for the memory image and translation state of a lean VM. In fact, of the topologies shown in Table 2, only the smallest one would fit in the memory of a typical laptop if system virtualization were used. Mininet also provides a usable amount of bandwidth, as shown in Table 1: 2-3 Gbps through one switch, or more than 10 Gbps aggregate internal bandwidth through a chain of 100 switches.
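
The scaling argument is back-of-the-envelope arithmetic using the per-host figures above (~1 MB per Mininet host vs. ~70 MB per lean VM); the 4 GB laptop capacity below is an assumption for illustration:

```python
# Memory cost of a 1024-host network under the two approaches, using the
# per-host figures from the text. The 4 GB laptop RAM is an assumed value.
hosts = 1024
mininet_mb = hosts * 1    # ~1 GB total with Mininet
vm_mb = hosts * 70        # ~70 GB total with one lean VM per host

print(mininet_mb, vm_mb)            # 1024 71680
print(vm_mb // mininet_mb)          # 70x -- almost two orders of magnitude
laptop_mb = 4 * 1024
print(mininet_mb <= laptop_mb,      # True: fits comfortably
      vm_mb <= laptop_mb)           # False: far exceeds laptop RAM
```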

Table 3 shows the time consumed by individual operations when building a topology. Surprisingly, link addition and deletion are expensive operations, taking roughly 250 ms and 400 ms, respectively. As we gain a better understanding of Mininet's resource usage and interaction with the Linux kernel, we hope to further improve its performance and contribute optimizations back to the kernel as well as Open vSwitch.


Limitations

The most significant limitation of Mininet today is a lack of performance fidelity, especially at high loads. CPU resources are multiplexed in time by the default Linux scheduler, which provides no guarantee that a host that is ready to send a packet will be scheduled promptly, or that all switches will forward at the same rate. In addition, software forwarding may not match hardware. O(n) linear lookup for software tables cannot approach the O(1) lookup of a hardware-accelerated TCAM in a vendor switch, causing the packet forwarding rate to drop for large wildcard table sizes.
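
The lookup gap is visible even in a toy benchmark, with a Python dict standing in for the O(1) hardware path; absolute timings are machine-dependent, so only the relative ordering matters:

```python
import timeit

# Compare an O(n) linear scan over flow rules with an O(1) hash lookup
# (a stand-in for a hardware-accelerated TCAM). Rule contents are
# invented for illustration.
rules = [('rule-%d' % i, 'forward') for i in range(10000)]
table = dict(rules)
key = 'rule-9000'  # a match deep in the table

linear = timeit.timeit(
    lambda: next(a for m, a in rules if m == key), number=200)
hashed = timeit.timeit(lambda: table[key], number=200)
print(linear > hashed)  # True: the linear scan is markedly slower
```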

To enforce bandwidth limits and quality of service on a link, the Linux traffic control program (tc) may be used. Linux CPU containers and scheduler priorities offer additional options for improving fairness. Mininet currently runs on a single machine and emulates only wired links; as with performance fidelity, these limitations do not seem fundamental, and we expect to address them later.

Mininet's partial virtualization approach also limits what it can do. It cannot handle different OS kernels simultaneously. All hosts share the same filesystem, although this can be changed by using chroot. Hosts cannot be migrated live like VMs. We feel that these losses are a reasonable tradeoff for the ability to try ideas at greater scale.
