引用:
原帖由 Puff 於 2012-4-15 20:08 發表
Different HW characteristics 就已經注定左 GPU/CU 唔會同 CPU integrate.
Bulldozer 既 Modular Design 同 GPU 都無乜關係... 係關 area/power efficiency & throughput maximizing 既事。
Decoupled INT/FP 都唔 ...
問題係你果堆code對CPU / GPU loading唔同
CPU: check cache hieracy and conflicts > load to L1 cache > execute > write to memory
GPU: load to cache > execute > write to memory
decouple CPU同GPU queue係AMD一貫做法, 但唔係而家的unified scheduler