Why Parallelism? Why Efficiency?

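The first lecture is purely an introduction to parallel computing, but it contains a lot of thinking that looks at both sides of an issue. Here is a quick recap:

Trade-offs in system design

(1) General steps for parallelizing work: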

  1. Decomposing work into pieces that can safely be performed in parallel
  2. Assigning work to processors
  3. Managing communication/synchronization between the processors
    1. so that it does not limit speedup
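
To make the three steps concrete, here is a minimal sketch in Python (not from the lecture). The function name `parallel_sum` and the `num_workers` parameter are made up for illustration. Note that CPython threads will not actually speed up CPU-bound work like this because of the global interpreter lock, so the point is only the structure: decompose, assign, then combine.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_sum(values, num_workers=4):
    """Sum a list by splitting it into chunks (hypothetical helper)."""
    # Step 1: decompose the work into pieces that are safe to run in parallel.
    chunk = max(1, (len(values) + num_workers - 1) // num_workers)
    pieces = [values[i:i + chunk] for i in range(0, len(values), chunk)]

    # Step 2: assign each piece to a worker.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(sum, pieces))

    # Step 3: communicate/synchronize. The only shared step is combining
    # the partial sums, so it is kept as small as possible.
    return sum(partials)


print(parallel_sum(list(range(1000))))  # 499500
```

The design choice worth noticing is that the workers never touch shared data while running; all communication is pushed into one small combine step at the end, which is what keeps synchronization from limiting speedup.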

(2) How do we evaluate a design or system? Factors to consider:

  1. performance
  2. convenience
  3. cost

(3) Fast != Efficient:

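For example, take the question: is a 2x speedup on a computer with 10 processors a good result?

  1. Usually the gut reaction is "not good". The reason is simple: I paid 10x the cost and got only a 2x performance gain. That is terrible!
  2. In some situations, though, we consider it good enough:
    1. For example, when 10x the processors is cheap and the 2x speedup is very valuable (say, returning Google web search results, where a 2x improvement is a big deal)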
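
The "fast vs. efficient" distinction can be put into numbers. The sketch below uses one common definition, efficiency = speedup / number of processors, which is my assumption and not necessarily the exact definition used in the lecture. The function names are made up.

```python
def speedup(time_one_processor, time_p_processors):
    """How many times faster the parallel run is than the 1-processor run."""
    return time_one_processor / time_p_processors


def efficiency(speedup_value, num_processors):
    """Fraction of the ideal (linear) speedup actually achieved."""
    return speedup_value / num_processors


s = speedup(100.0, 50.0)   # e.g. 100 s on 1 processor, 50 s on 10 processors
print(s)                   # 2.0
print(efficiency(s, 10))   # 0.2
```

So the 10-processor run is fast (2x) but only 20% efficient. Whether that is acceptable depends on what the extra processors cost and what the speedup is worth.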

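Program - Instructions - Processor - State

(1) Soul-searching question: What is a computer program?

Honestly, when I heard this question in the lecture I hit pause and thought about it for several minutes, and still had no idea.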

Answer: A program is just a list of processor instructions!

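After a high-level language goes through the compiler and related tools, it becomes machine code that the machine can recognize, which is then executed...

So a program is, in essence, a sequence of instructions that the processor can recognize.

(2) Soul-searching question: What does a processor do?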

Answer1: A processor executes instructions

That answer is basically a joke; it doesn't dig any deeper.

Answer2: modifies the computer’s state! (by instructions)

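Exactly! In essence, instructions direct the processor to change the state of registers, memory, and so on (for example, the values stored in them).

(3) Soul-searching question: What do I mean when I talk about a computer’s “state”?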

Answer: values of program data, which are stored in a processor’s registers or in memory!

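This is the same thing mentioned in (2). It echoes what I was told in courses back in China: "a computer is essentially a state machine."

It's just that back in China nobody ever told us why 😅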
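
The three answers above fit together in a few lines of code. This is a toy model I made up, not a real instruction set: the `load` / `add` / `store` instructions and the `run` function are hypothetical. The program is literally a list of instructions, and running it does nothing except change the state (registers and memory).

```python
def run(program, state):
    """Execute a list of instructions, mutating the registers and memory."""
    regs, mem = state["regs"], state["mem"]
    for op, *args in program:
        if op == "load":            # load rd, addr  ->  regs[rd] = mem[addr]
            rd, addr = args
            regs[rd] = mem[addr]
        elif op == "add":           # add rd, ra, rb ->  regs[rd] = regs[ra] + regs[rb]
            rd, ra, rb = args
            regs[rd] = regs[ra] + regs[rb]
        elif op == "store":         # store rs, addr ->  mem[addr] = regs[rs]
            rs, addr = args
            mem[addr] = regs[rs]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return state


# A program is just a list of instructions.
program = [
    ("load", "r0", 0),
    ("load", "r1", 1),
    ("add", "r2", "r0", "r1"),
    ("store", "r2", 2),
]

# The state is the values held in registers and memory.
state = {"regs": {}, "mem": [3, 4, 0]}
run(program, state)
print(state["mem"])  # [3, 4, 7]
```

Seen this way, the "state machine" description is just this loop: each instruction takes the current state and produces the next one.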

Superscalar Processor

(1) An example: Superscalar Processor execution

  1. ILP: Instruction-Level Parallelism
    • parallelism at the level of individual instructions
  2. In this example, the whole parallel execution is invisible to the high-level language and to the programmer
    • Superscalar execution: the processor automatically finds independent instructions in an instruction sequence and executes them in parallel on multiple execution units!
  3. Superscalar Processor: the example here is a 2-way design
  4. The dependencies between instructions form an Instruction Dependency Graph
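
The dependency graph and its effect on a superscalar processor can be sketched as follows. Everything here is an assumption for illustration: the instruction sequence (computing x*x + y*y + z*z), the register names, and a very simplified model in which every instruction takes one cycle, only read-after-write dependencies matter, each register is written once, and the processor greedily issues up to `width` ready instructions per cycle in program order.

```python
# Each instruction: (opcode, destination, sources). Names are made up.
PROGRAM = [
    ("mul", "t0", ("x", "x")),
    ("mul", "t1", ("y", "y")),
    ("mul", "t2", ("z", "z")),
    ("add", "t3", ("t0", "t1")),
    ("add", "a", ("t3", "t2")),
]


def dependencies(program):
    """For each instruction, the indices of earlier instructions it reads from."""
    last_writer = {}  # register name -> index of the instruction that wrote it
    deps = []
    for i, (_, dest, srcs) in enumerate(program):
        deps.append({last_writer[s] for s in srcs if s in last_writer})
        last_writer[dest] = i
    return deps


def schedule(program, width):
    """Return a list of cycles; each cycle lists the instruction indices issued."""
    deps = dependencies(program)
    finished = set()
    pending = list(range(len(program)))
    cycles = []
    while pending:
        # An instruction is ready once everything it depends on finished
        # in an earlier cycle.
        ready = [i for i in pending if deps[i] <= finished]
        issued = ready[:width]
        cycles.append(issued)
        finished.update(issued)
        pending = [i for i in pending if i not in issued]
    return cycles


print(dependencies(PROGRAM))  # [set(), set(), set(), {0, 1}, {2, 3}]
print(schedule(PROGRAM, 1))   # [[0], [1], [2], [3], [4]]
print(schedule(PROGRAM, 2))   # [[0, 1], [2, 3], [4]]
print(schedule(PROGRAM, 3))   # [[0, 1, 2], [3], [4]]
```

In this toy model, going from 1 to 2 execution units cuts the run from 5 cycles to 3, but a third unit gains nothing, because the longest chain in the dependency graph (mul, add, add) is 3 instructions long. That is the basic reason extra execution units alone stop helping.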

(2) The single core is dead: single-instruction stream performance scaling has stalled

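Reasons:

  1. Power wall: power (and heat) limits how fast, and how many of, the transistors on a chip can switch, which caps clock frequency and therefore single-core performance
  2. ILP scaling has tapped out: adding more execution units to a single core gives diminishing returns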

What We Prefer Currently:

  1. faster processors <-- more execution units running in parallel
  2. units that are specialized for a specific task (graphics ...)