jax.experimental.pallas.mosaic_gpu.GPUCompilerParams


class jax.experimental.pallas.mosaic_gpu.GPUCompilerParams(*, approx_math=False, dimension_semantics=None, max_concurrent_steps=1, delay_release=0, profile_space=0, profile_dir='')

Mosaic GPU compiler parameters.

Parameters:

approx_math (bool)
dimension_semantics (Sequence[DimensionSemantics] | None)
max_concurrent_steps (int)
delay_release (int)
profile_space (int)
profile_dir (str)

approx_math

If True, the compiler may use approximate implementations of some math operations, such as exp. Defaults to False.

Type: bool

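dimension_semantics

A list of dimension semantics, one per grid dimension of the kernel. Each entry is either "parallel", for a dimension that can execute in any order, or "sequential", for a dimension that must execute sequentially.

Type: Sequence[DimensionSemantics] | None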
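As a rough illustration of the shape this field expects, the sketch below checks a candidate value against a grid. The helper `check_dimension_semantics` and the example grid are hypothetical and not part of Pallas. The two string values are taken from the description above, and how the library itself validates the field is not shown here.

```python
from typing import Optional, Sequence, Tuple

# The two values named in the description above.
VALID_SEMANTICS = ("parallel", "sequential")


def check_dimension_semantics(
    grid: Tuple[int, ...],
    semantics: Optional[Sequence[str]],
) -> Optional[Tuple[str, ...]]:
    """Hypothetical helper: require one valid entry per grid dimension."""
    if semantics is None:
        # None is the documented default, so there is nothing to check.
        return None
    semantics = tuple(semantics)
    if len(semantics) != len(grid):
        raise ValueError(
            f"expected {len(grid)} entries, got {len(semantics)}"
        )
    for entry in semantics:
        if entry not in VALID_SEMANTICS:
            raise ValueError(f"unknown dimension semantics: {entry!r}")
    return semantics


# A 2D grid: the first dimension may run in any order,
# the second must run sequentially.
example = check_dimension_semantics((8, 4), ["parallel", "sequential"])
```

A value with fewer or more entries than the grid has dimensions is rejected by this sketch, which matches the "one entry per grid dimension" reading of the description.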

max_concurrent_steps

The maximum number of sequential stages that are active concurrently. Defaults to 1.

Type: int

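delay_release

The number of steps to wait before reusing the input/output references. Defaults to 0, and must be strictly smaller than max_concurrent_steps. Generally, you will want to set it to 1 if you do not await the WGMMA in the body.

Type: int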
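The constraint between delay_release and max_concurrent_steps can be sketched with a stand-in class. `SketchParams` and `is_valid` below are hypothetical: the class mirrors only two of the documented fields and their defaults so the rule can be checked without a GPU. It makes no claim about where, or whether, the library enforces the rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchParams:
    """Hypothetical stand-in for two of the documented fields."""

    max_concurrent_steps: int = 1  # documented default
    delay_release: int = 0  # documented default

    def __post_init__(self) -> None:
        # Documented rule: delay_release < max_concurrent_steps (strict).
        if self.delay_release >= self.max_concurrent_steps:
            raise ValueError(
                "delay_release must be strictly smaller than "
                "max_concurrent_steps"
            )


def is_valid(max_concurrent_steps: int, delay_release: int) -> bool:
    """Return True if the pair satisfies the documented constraint."""
    try:
        SketchParams(max_concurrent_steps, delay_release)
    except ValueError:
        return False
    return True


# Two concurrent steps leave room for delay_release=1.
ok = SketchParams(max_concurrent_steps=2, delay_release=1)
```

With the defaults (1 and 0) the constraint holds. Raising delay_release to 1 requires max_concurrent_steps to be at least 2.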

profile_space

The number of profiler events that can be collected in a single invocation. Behavior is undefined if a thread collects more events than this.

Type: int

profile_dir

The directory to which profiling traces will be written.

Type: str

__init__(*, approx_math=False, dimension_semantics=None, max_concurrent_steps=1, delay_release=0, profile_space=0, profile_dir='')

Parameters:

approx_math (bool)
dimension_semantics (Sequence[DimensionSemantics] | None)
max_concurrent_steps (int)
delay_release (int)
profile_space (int)
profile_dir (str)

Return type:

None

Methods

__init__(*[, approx_math, ...])

Attributes

`PLATFORM`
`approx_math`
`delay_release`
`dimension_semantics`
`max_concurrent_steps`
`profile_dir`
`profile_space`