[Hardware] What comes after Piledriver?

Quote:
Originally posted by Puff on 2012-4-15 21:34

How would you define a GPU scheduler?


Command processor

Quote:
Originally posted by qcmadness on 2012-4-15 21:36
Command processor
... Do you really think that would work?
A Compute Unit is itself already a "core" with something like 4- to 40-way SMT.

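Quote:
Originally posted by Puff on 2012-4-15 21:39

... Do you really think that would work?
A Compute Unit is itself already a "core" with 4- to 40-way SMT.
It would work, but VLIW would actually be even easier to do.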

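Quote:
Originally posted by qcmadness on 2012-4-15 21:40

It would work, but VLIW would actually be even easier to do.
How?
Each Compute Unit has its own program counter and can run more than one kernel program. If you use this behemoth to replace the FPU, what is supposed to happen to CPU vector instructions after decode?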

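Quote:
Originally posted by Puff on 2012-4-15 21:42

How?
Each Compute Unit has its own program counter and can run more than one kernel program. If you use this behemoth to replace the FPU, what is supposed to happen to CPU vector instructions after decode?
One group of VLIW-4 shaders can already replace the FPUs of 4 cores. Which do you think is more efficient?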

HD6970 / 24 SIMD engines => ~113 GFLOPS per SIMD engine (880MHz)
SandyBridge 2600K => 8 FLOPs/cycle * 3.4 x 10^9 Hz * 4 cores = 108.8 GFLOPS (3400MHz)
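
A quick sanity check of the two peak figures above, as a rough sketch rather than a benchmark. It assumes the HD 6970 figure is per SIMD engine (16 VLIW-4 units x 4 lanes, one multiply-add = 2 FLOPs per lane per cycle) at 880 MHz, which is the clock at which ~113 GFLOPS falls out, and it takes the 8 FLOPs per cycle per core for the 2600K straight from the post. The function name `peak_gflops` is made up for illustration.

```python
def peak_gflops(lanes, flops_per_lane_per_cycle, clock_ghz):
    """Theoretical peak in GFLOPS: lanes * FLOPs per lane per cycle * clock (GHz)."""
    return lanes * flops_per_lane_per_cycle * clock_ghz


# Assumption: one HD 6970 SIMD engine = 16 VLIW-4 units * 4 lanes,
# each lane retiring one multiply-add (2 FLOPs) per cycle at 0.88 GHz.
simd_engine = peak_gflops(lanes=16 * 4, flops_per_lane_per_cycle=2, clock_ghz=0.88)

# From the post: 4 cores, 8 FLOPs per core per cycle, 3.4 GHz.
sandy_bridge = peak_gflops(lanes=4, flops_per_lane_per_cycle=8, clock_ghz=3.4)

print(f"HD 6970, one SIMD engine: {simd_engine:.1f} GFLOPS")
print(f"2600K, four cores:        {sandy_bridge:.1f} GFLOPS")
```

Under those assumptions the two come out at roughly 112.6 and 108.8 GFLOPS, i.e. one SIMD engine is in the same ballpark as the whole quad-core on paper.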

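Quote:
Originally posted by qcmadness on 2012-4-15 21:44

One group of VLIW-4 shaders can already replace the FPUs of 4 cores. Which do you think is more efficient?
...... I really am talking about two different things here. They are efficient in different places.

16 VLIW-4 shaders grouped into a SIMD Engine already form a core in their own right, one that fetches, decodes and executes.
A CPU's FPU... is a unit made up of an out-of-order scheduler + execution pipes + backend.

The former is DLP + TLP; the latter is ILP + latency-optimized...

[ Last edited by Puff on 2012-4-15 21:47 ]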

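Quote:
Originally posted by Puff on 2012-4-15 21:45

...... I really am talking about two different things here. They are efficient in different places.

16 VLIW-4 shaders grouped into a SIMD Engine already form a core in their own right, one that fetches, decodes and executes.
A CPU's FPU... is an out-of-order scheduler + execution ...
That's why it takes time to tune the two until they merge.

Otherwise Intel/AMD would have built it long ago.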

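Quote:
Originally posted by qcmadness on 2012-4-15 21:49


That's why it takes time to tune the two until they merge.

Otherwise Intel/AMD would have built it long ago.
Well, if that's your whole argument, I'd file it under idealistic.
Besides, how do you translate a CPU's FPU instructions into something a GPU can execute? And what if it's a mix of vector + GPR instructions?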

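Quote:
Originally posted by Puff on 2012-4-15 21:51

Well, if that's your whole argument, I'd file it under idealistic.
Besides, how do you translate a CPU's FPU instructions into something a GPU can execute? And what if it's a mix of vector + GPR instructions?
If it were idealistic, AMD wouldn't have bought ATi. End of story.

Same point as before: translating isn't hard, but getting things like the hierarchy right is harder, because x86 comes with far more restrictions and memory disambiguation.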

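Quote:
Originally posted by qcmadness on 2012-4-15 21:52

If it were idealistic, AMD wouldn't have bought ATi. End of story.
AMD buying ATi doesn't mean they have to build some kind of CPU fused together with GPU and they become SuperPU muhahahahah.
The question in the end is what's the point of doing this. The design aims of GPUs and CPUs are fundamentally different.

I think the whole thing only becomes a possibility once it makes sense at the hardware pipeline level.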

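Quote:
Originally posted by Puff on 2012-4-15 21:54

AMD buying ATi doesn't mean they have to build some kind of CPU fused together with GPU and they become SuperPU muhahahahah.
The question in the end is what's the point of doing this. The design aims of GPUs and CPUs are fundamentally different.

I think the whole thing  ...
That's why it takes time. Otherwise heterogeneous computing would have shipped back in 2011.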

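Quote:
Originally posted by qcmadness on 2012-4-15 21:54

That's why it takes time. Otherwise heterogeneous computing would have shipped back in 2011.
It's not a matter of time. The point is that your starting premise is that the CPU and GPU will ultimately be tightly fused together.
But in fact nobody has ever said that.

The core-level integration kind, I mean.

[ Last edited by Puff on 2012-4-15 21:56 ]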

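Quote:
Originally posted by Puff on 2012-4-15 21:55

It's not a matter of time. The point is that your starting premise is that the CPU and GPU will ultimately be tightly fused together.
But in fact nobody has ever said that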

Quote:
Originally posted by qcmadness on 2012-4-15 21:59


Core-level integration. That kind is just CMP. Today's APUs are exactly that. Torrenza is dead too.

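Quote:
Originally posted by Puff on 2012-4-15 21:55

It's not a matter of time. The point is that your starting premise is that the CPU and GPU will ultimately be tightly fused together.
But in fact nobody has ever said that.

The core-level integration kind, I mean.
http://www.xbitlabs.com/news/cpu ... ue_in_2015_AMD.html
Quote: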
Advanced Micro Devices plans to finally launch its hybrid chips – which feature x86 central processing along with graphics processing cores – code-named Fusion in early 2011, however, according to a vice president of AMD, the second iteration of Fusion processors will not only be heterogeneous in terms of different cores within one piece of silicon, but the cores themselves will process both graphics and general-purpose data.

“The first iteration of Fusion will include a CPU and GPU, but by 2015 the model could change. In the second iteration [in] 2015, you are not going to be able to tell the difference. It's all going away,” said Leslie Sobon, vice president of marketing at AMD, reports IDG News agency.
