Vec-to-Cube Pattern
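[Free download link] cannbot-skills: CANNBot is a series of agents for CANN development that aim to improve development efficiency; this repository provides reusable Skills modules for them. Project address: https://gitcode.com/cann/cannbot-skills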
Generic baseline only. For a2 (b3) kernels, prefer the a2-specific patterns under
agent/references/patterns/ (e.g., a2-cube-vec.md) and read agent/references/constraints/a2-device.md for device-side rules.
Read this file when vec work preprocesses data before cube consumes it in a later matmul stage.
Use this pattern when
- the formula needs elementwise or row-wise preprocessing first
- the cube stage should consume the transformed result
- the host-side contract should stay reshape-only instead of doing a heavy layout transform outside the kernel
Minimal flow
GM -> UB -> @vf -> UB -> L1 -> L0 -> L0C -> GM
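As a rough host-side reference for what this flow computes, the sketch below models the two stages in plain Python. It is not kernel DSL code: the function names are hypothetical, and the choice of sqrt(abs(x)) as the vec preprocessing step is an assumption borrowed from the example file name vec_cube_abs_sqrt_matmul.py. The comments map each step back to the flow above.

```python
import math


def vec_preprocess(rows):
    """Vec stage stand-in (assumed op): elementwise sqrt(abs(x))."""
    return [[math.sqrt(abs(x)) for x in row] for row in rows]


def cube_matmul(a, b):
    """Cube stage stand-in: plain (M x K) @ (K x N) matrix multiply."""
    k = len(b)
    n = len(b[0])
    return [
        [sum(row[i] * b[i][j] for i in range(k)) for j in range(n)]
        for row in a
    ]


def vec_to_cube_reference(a, b):
    """Reference model: preprocess on the vec side, then matmul on the cube side."""
    staged = vec_preprocess(a)     # GM -> UB -> @vf -> UB
    return cube_matmul(staged, b)  # UB -> L1 -> L0 -> L0C -> GM


a = [[-4.0, 9.0], [16.0, -1.0]]
b = [[1.0, 0.0], [0.0, 1.0]]  # identity, so the result equals the preprocessed input
result = vec_to_cube_reference(a, b)
```

A reference like this can be useful for checking kernel output numerically, since the host only needs to supply the raw input and compare the final matmul result.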
Ownership rule
The vec-to-cube publish is a cross-side ownership edge. Use explicit VcMutex. Do not expect auto_sync() to replace it.
Stable repository mapping:
VcMutex(..., src_end_pipe=Pipe.MTE3, dst_end_pipe=Pipe.FIX)
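The ownership edge can be pictured with an ordinary producer/consumer handoff. The sketch below is only an analogy written with Python threads: HandoffSlot, publish, and consume are hypothetical names, and none of this reflects the actual VcMutex API or the semantics of the Pipe arguments. It shows why the handoff needs an explicit signal in both directions: the consumer must not read before the producer has published, and the producer must not overwrite before the consumer has released the slot.

```python
import threading


class HandoffSlot:
    """Single-slot buffer with an explicit ownership transfer (analogy only)."""

    def __init__(self):
        self._free = threading.Semaphore(1)   # producer owns the slot initially
        self._ready = threading.Semaphore(0)  # nothing published yet
        self._value = None

    def publish(self, value):
        self._free.acquire()    # wait until the consumer released the slot
        self._value = value
        self._ready.release()   # hand ownership to the consumer

    def consume(self):
        self._ready.acquire()   # wait until the producer published
        value = self._value
        self._free.release()    # hand ownership back to the producer
        return value


def run_handoff(tiles):
    slot = HandoffSlot()
    consumed = []

    def vec_side():
        for tile in tiles:
            slot.publish([abs(x) for x in tile])  # stand-in for vec preprocessing

    def cube_side():
        for _ in tiles:
            consumed.append(sum(slot.consume()))  # stand-in for the cube stage

    threads = [threading.Thread(target=vec_side), threading.Thread(target=cube_side)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return consumed


handoff_result = run_handoff([[-1, 2], [3, -4], [-5, -6]])
```

Removing either semaphore from this model allows a stale read or an early overwrite, which is the kind of hazard the explicit mutex on the cross-side edge is there to prevent.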
What usually matters most
- whether the publish path is ND or NZ
- whether the host-side layout stays reshape-only
- how subblock rows are split between vec sides
- whether the preprocessed value must remain in half or float before cube consume
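For the half-versus-float question, a quick host-side check of how much a value changes when stored in each format can help. The sketch below uses only the Python standard library; the helper names are hypothetical and it says nothing about how the kernel itself casts data.

```python
import struct


def round_to_half(x):
    """Round a Python float to IEEE 754 binary16 and back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def round_to_float32(x):
    """Round a Python float to IEEE 754 binary32 and back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


value = 2048.5
as_half = round_to_half(value)      # half has 11 significand bits; spacing is 2 at this magnitude
as_float = round_to_float32(value)  # float32 has 24 significand bits; value is exact
half_error = abs(value - as_half)
```

If the preprocessed values can reach magnitudes where the half-precision error is significant for the later matmul, that is an argument for keeping the intermediate in float before the cube stage consumes it.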
Typical files to study
- agent/example/kernels/a5/vec_cube_abs_sqrt_matmul.py
- agent/example/kernels/a5/vec_cube_abs_sqrt_matmul_nz.py
- agent/example/kernels/a5/recompute_wu_cube_vec.py
Authoring statement: parts of this article were generated with AI assistance (AIGC) and are for reference only.