Why Parallelism? Why Efficiency?

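Chapter 1 is purely an introduction to parallel computing, but it contains a lot of thinking that weighs both sides of a question, so let's do a quick recap: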

Trade-offs in System Design

(1) General steps for parallelizing a program:

Decomposing work into pieces that can safely be performed in parallel
Assigning work to processors
Managing communication/synchronization between the processors so that it does not limit speedup
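
The three steps above can be sketched on a toy problem, summing a list. This is only a minimal illustration under my own assumptions: the helper names (`chunk_ranges`, `partial_sum`, `parallel_sum`) are made up for this note, and in standard CPython a thread pool will generally not make CPU-bound code like this faster, so read it for the structure rather than the performance.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_ranges(n, num_workers):
    """Decompose: split indices [0, n) into near-equal contiguous chunks."""
    base, extra = divmod(n, num_workers)
    ranges = []
    start = 0
    for w in range(num_workers):
        size = base + (1 if w < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def partial_sum(data, lo, hi):
    """Work done independently by one worker; it touches no shared state."""
    return sum(data[lo:hi])


def parallel_sum(data, num_workers=4):
    ranges = chunk_ranges(len(data), num_workers)
    # Assign: hand one chunk to each worker in the pool.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        futures = [pool.submit(partial_sum, data, lo, hi) for lo, hi in ranges]
        # Communicate/synchronize: wait for every worker, then combine results.
        return sum(f.result() for f in futures)


total = parallel_sum(list(range(1000)))  # same answer as sum(range(1000))
```

The only synchronization point here is the final combine, which is the cheap case. The more often workers have to talk to each other, the more that third step eats into the speedup.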

(2) What metrics do we use to evaluate a design/system? Factors to consider:

(3) Fast != Efficient:

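For example, take the question: is a 2x speedup on a computer with 10 processors a good result?

At first glance it generally looks not good, and the reason is simple: I paid 10x the cost but only got a 2x performance gain. That's terrible!
In some cases, though, we consider it good enough:
1. For example, when 10x processors are cheap in money terms and the 2x speedup is very valuable (e.g. returning Google search results, where a 2x improvement is already a big deal)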
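
To put numbers on "fast != efficient", here is a small sketch. It assumes the common definition of parallel efficiency as speedup divided by processor count, which these notes do not state explicitly, and the timings (10 s serial, 5 s parallel) are made up purely for illustration.

```python
def speedup(time_serial, time_parallel):
    """How many times faster the parallel run is than the serial run."""
    return time_serial / time_parallel


def efficiency(speedup_value, num_processors):
    """Assumed definition: speedup obtained per processor used."""
    return speedup_value / num_processors


# Hypothetical timings: 10 s on one processor, 5 s on ten processors.
s = speedup(10.0, 5.0)   # 2x faster
e = efficiency(s, 10)    # only 20% of the ideal 10x is realized
```

Whether 20% efficiency is acceptable depends on what the extra processors cost versus what the 2x is worth, which is exactly the trade-off in the example above.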

(1) Soul-searching question: What is a computer program?

Honestly, when I heard this question in the lecture, I hit pause and thought about it for several minutes, and still had no clue.

Answer: A program is just a list of processor instructions!


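After a high-level language goes through a compiler and related tools, it becomes machine code that the machine can recognize, which is then executed...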

So a program is essentially just a set of "instructions the processor can recognize".

(2) Soul-searching question: What does a processor do?

Answer1: A processor executes instructions

The answer above is a joke; it doesn't dig any deeper.

Answer2: modifies the computer’s state! (by instructions)


Exactly right! In essence, instructions direct the processor to change the state of registers / memory (e.g. the values stored there).

(3) Soul-searching question: What do I mean when I talk about a computer's "state"?

Answer: values of program data, which are stored in a processor’s registers or in memory!

This is the same thing mentioned in (2), and it echoes what I was told in classes back home: "a computer is essentially a state machine".
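
The three answers fit together nicely in a toy interpreter. This is a sketch under my own assumptions, not a real ISA: the three instructions (`load`, `add`, `store`) and the `run` function are made up for illustration. The program is literally a list of instructions, and executing it does nothing except change the values in registers and memory.

```python
def run(program, registers=None, memory=None):
    """Execute a list of instructions; return the final (registers, memory) state."""
    regs = dict(registers or {})
    mem = dict(memory or {})
    for op, *args in program:
        if op == "load":        # load dst_reg, addr
            dst, addr = args
            regs[dst] = mem[addr]
        elif op == "add":       # add dst_reg, src_a, src_b
            dst, a, b = args
            regs[dst] = regs[a] + regs[b]
        elif op == "store":     # store src_reg, addr
            src, addr = args
            mem[addr] = regs[src]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return regs, mem


# A program is just a list of instructions (hypothetical toy instruction set).
program = [
    ("load", "r0", 0),
    ("load", "r1", 1),
    ("add", "r2", "r0", "r1"),
    ("store", "r2", 2),
]
regs, mem = run(program, memory={0: 3, 1: 4})
# State afterwards: r2 holds 7, and memory address 2 holds 7 as well.
```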

~~It's just that back home nobody ever explained "why" 😅~~

(1) An example: Superscalar Processor Execution


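ILP: Instruction-Level Parallelism
- Parallelism at the level of individual instructions
In this example, the whole "parallel" process is invisible to the high-level language and to the programmer
- Superscalar execution: the processor automatically finds independent instructions in an instruction sequence and executes them in parallel on multiple execution units!
Superscalar Processor: take a 2-wide one as an example
The dependencies between instructions essentially form an Instruction Dependency Graph: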
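
A rough sketch of how the dependency graph limits what a superscalar processor can do. Everything here is my own assumption for illustration: the five-instruction sequence (computing `a = x*x + y*y + z*z` through temporaries `t0`..`t3`), the unit latency per instruction, and the greedy in-order pick of ready instructions. It only tracks true read-after-write dependencies and is not how real hardware schedules.

```python
def build_dependencies(instructions):
    """Map each instruction index to the earlier instructions whose results it reads."""
    last_writer = {}   # value name -> index of the instruction that produced it
    deps = {}
    for i, (dest, sources) in enumerate(instructions):
        deps[i] = {last_writer[s] for s in sources if s in last_writer}
        last_writer[dest] = i
    return deps


def schedule(instructions, width):
    """Each cycle, issue up to `width` instructions whose inputs are already computed."""
    deps = build_dependencies(instructions)
    done = set()
    cycles = []
    while len(done) < len(instructions):
        ready = [i for i in range(len(instructions))
                 if i not in done and deps[i] <= done]
        issued = ready[:width]
        cycles.append(issued)
        done.update(issued)
    return cycles


# Hypothetical sequence as (destination, sources); x, y, z are program inputs.
instructions = [
    ("t0", ("x", "x")),    # 0: t0 = x * x
    ("t1", ("y", "y")),    # 1: t1 = y * y
    ("t2", ("z", "z")),    # 2: t2 = z * z
    ("t3", ("t0", "t1")),  # 3: t3 = t0 + t1
    ("a", ("t3", "t2")),   # 4: a  = t3 + t2
]

scalar = schedule(instructions, width=1)     # 5 cycles, one instruction each
two_wide = schedule(instructions, width=2)   # [[0, 1], [2, 3], [4]]
three_wide = schedule(instructions, width=3) # still 3 cycles
```

Going from width 2 to width 3 does not help in this example: the chain 0 -> 3 -> 4 in the dependency graph is three instructions long, so no amount of extra execution units gets below 3 cycles.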

(2) Single-core is dead: single-instruction stream performance is dead


Reasons:

What We Prefer Currently: