
Peering Between Clouds

The intercloud layer is designed to run jobs on the cloud (or clouds) that best meet their needs. If the job involves large datasets, as many common cloud workloads do, this requires moving the data to the cloud where the computation will occur. Today, most clouds have pricing policies under which moving data into a cloud is much cheaper than moving it out. For instance, ingesting data into AWS is free, while transferring data out of AWS can cost $0.05-0.09/GB, roughly the cost of storing a GB of data for several months! This is significantly more than the cost of streaming data from some leading CDNs, which ranges from $0.009 to $0.02/GB for 150TB/month or more.

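To make the gap between cloud egress and CDN streaming prices concrete, the following is a minimal sketch that applies the per-GB ranges quoted above to a monthly volume. The function name `transfer_cost_range`, the flat (untiered) per-GB pricing, and the use of decimal units (1 TB = 1000 GB) are assumptions made for illustration; real price sheets are tiered and change over time.

```python
# Per-GB price ranges quoted in the text, in USD (low, high).
AWS_EGRESS_PER_GB = (0.05, 0.09)
CDN_STREAM_PER_GB = (0.009, 0.02)


def transfer_cost_range(gigabytes, price_range):
    """Return the (low, high) cost in USD of moving `gigabytes` of data.

    Assumes a flat per-GB price, which is a simplification: real
    providers use tiered pricing.
    """
    low, high = price_range
    return gigabytes * low, gigabytes * high


# 150 TB per month, assuming decimal units (1 TB = 1000 GB).
volume_gb = 150 * 1000

aws_low, aws_high = transfer_cost_range(volume_gb, AWS_EGRESS_PER_GB)
cdn_low, cdn_high = transfer_cost_range(volume_gb, CDN_STREAM_PER_GB)

print(f"AWS egress: ${aws_low:,.0f} to ${aws_high:,.0f} per month")
print(f"CDN stream: ${cdn_low:,.0f} to ${cdn_high:,.0f} per month")
```

Under these assumptions, even the cheapest egress rate costs more than twice the most expensive CDN rate for the same volume.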

CDN

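A CDN (Content Delivery Network) is a distributed network system that deploys servers in multiple geographic locations to deliver content (web pages, video, images, and so on) to users quickly. Its main purpose is to reduce content delivery latency, improve access speed and user experience, and relieve load on the origin server.

A CDN caches data between users and the origin server. When a user requests content, the CDN serves it from the server closest to that user, which speeds up delivery. A CDN can also optimize bandwidth usage and lower transfer costs, especially when many users access the content at the same time.

In the passage above, the CDN streaming cost is the fee for delivering data through this network of distributed servers. Compared with what a cloud provider such as AWS charges to move data out, CDN streaming fees are usually lower.

TL;DR: a CDN is a distributed network system in the traditional sense.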

We will call this form of pricing “data gravity” pricing, and it creates a strong incentive for users to process data in the same cloud in which it currently resides. Still, moving data from one cloud to another can be the most cost-effective option, especially for jobs where the computation resources are much more expensive than the data transfer costs. For example, consider ImageNet training, which involves a 150GB dataset. It costs about $13 to transfer it out of AWS, but, according to DAWNBench [2], it costs over $40 to train ResNet50 on ImageNet on AWS compared to about $20 to train the same model on Azure. Given these numbers, it would be cheaper to move the data out of AWS and perform training on Azure (about $33 in total) than to train in AWS where the data is. Thus, while data gravity pricing does inhibit moving jobs, in some cases moving jobs is still worthwhile.

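The ImageNet comparison above is a small placement decision: add the egress charge to the compute cost of every cloud other than the one holding the data, then pick the minimum. The sketch below reproduces that arithmetic. The function `cheapest_placement` is hypothetical, the $0.09/GB rate is an assumption taken from the top of the range quoted earlier (it gives $13.50 for 150GB, in line with the "about $13" figure), and ingest is assumed to be free.

```python
def cheapest_placement(compute_cost, home_cloud, dataset_gb, egress_per_gb):
    """Pick the cloud with the lowest total cost for one job.

    compute_cost:  dict mapping cloud name -> compute cost in USD
    home_cloud:    cloud where the dataset currently resides
    dataset_gb:    dataset size in GB
    egress_per_gb: USD per GB charged to move data out of home_cloud
                   (ingest into the destination is assumed to be free)
    Returns (best_cloud, total_cost_usd).
    """
    totals = {}
    for cloud, cost in compute_cost.items():
        # Data only has to be moved if the job runs away from home.
        transfer = 0.0 if cloud == home_cloud else dataset_gb * egress_per_gb
        totals[cloud] = cost + transfer
    best = min(totals, key=totals.get)
    return best, totals[best]


# Figures from the text: ~$40 on AWS, ~$20 on Azure, 150GB dataset on AWS.
best_cloud, total = cheapest_placement(
    compute_cost={"aws": 40.0, "azure": 20.0},
    home_cloud="aws",
    dataset_gb=150,
    egress_per_gb=0.09,  # assumed rate, top of the quoted range
)
print(best_cloud, round(total, 2))
```

With these inputs the job moves to Azure. If the Azure compute cost were $30 instead, the transfer charge would outweigh the savings and the job would stay on AWS, which is the inhibiting effect of data gravity pricing.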

In addition, the incentives against job movement are most acute if the data is dynamic (i.e., is updated as a result of the computation). If the data is static (e.g., training data), then the user could use an archival store on one cloud (such as AWS’s Glacier), which is significantly cheaper than blob stores, and then import the data they want to process to the blob store of any other cloud where they want the computation to run.


To our knowledge, current pricing policies for exporting data are independent of the cloud the data might be going to. One alternative that we have not seen explored to date is for clouds to enter into reciprocal data peering arrangements, where they agree to allow free exporting of data to each other, and to connect with high-speed links (presumably at PoPs where they both have a presence). This would make data transfers both fast and free, lowering the data gravity between two peering clouds and enabling greater freedom in job movement. As we argue below, this may solve some of the underlying incentive problems inherent in creating the compatibility and intercloud layers.


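???+ note "PoP"

    A PoP (Point of Presence) is a physical location in a network, usually a data center or exchange facility, where a network service provider deploys hardware and infrastructure in order to connect to other networks or users. PoPs are the meeting points between the internet and other network service providers, allowing data exchange and traffic to flow between different networks.

    Cloud providers can interconnect over high-speed links at PoPs and transfer data to each other, which speeds up transfers and lowers costs. This is the key infrastructure for data peering between clouds.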