Given struct OffsetTable { float table[32]; }; we recently had to do the following optimization by hand to drastically reduce VGPR usage in one of our kernels. It distinctly feels like something scalar replacement should have taken care of, instead of loading all 32 floats into registers and then selecting one of them.
From looking at it, the code in scalar_replace.cpp seems to be able to operate on arrays as well, but it misses this case.
In this particular case the change saved ~32 VGPRs, because we could elide the table in its entirety.
The code uses HLSL and our own internal abstractions over buffer loads and stores to accommodate bindless resources.
loadUniform maps to:
ByteAddressBuffer buffer = ResourceDescriptorHeap[internalHandle];
return buffer.Load<T>(index * sizeof(T));
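For readers without access to the original snippet, here is a rough sketch of the kind of rewrite described above. It is not the code from the report. The loadUniform signature, the helper names offsetBefore and offsetAfter, and the handle, tableIndex and i parameters are all assumptions made for illustration. It also assumes HLSL 2021 (templates) and Shader Model 6.6 (ResourceDescriptorHeap).

```hlsl
// Hypothetical reconstruction for illustration only.
struct OffsetTable
{
    float table[32];
};

// Assumed shape of the wrapper, based on the mapping given above.
template <typename T>
T loadUniform(uint internalHandle, uint index)
{
    ByteAddressBuffer buffer = ResourceDescriptorHeap[internalHandle];
    return buffer.Load<T>(index * sizeof(T));
}

// Before: the whole struct is loaded, then one element is picked with a
// dynamic index. Per the report, all 32 floats end up in VGPRs.
float offsetBefore(uint handle, uint tableIndex, uint i)
{
    OffsetTable t = loadUniform<OffsetTable>(handle, tableIndex);
    return t.table[i];
}

// After: the byte offset of the single element is computed by hand, so
// only one float is loaded and the table never exists in registers.
float offsetAfter(uint handle, uint tableIndex, uint i)
{
    ByteAddressBuffer buffer = ResourceDescriptorHeap[handle];
    uint byteOffset = tableIndex * sizeof(OffsetTable) + i * sizeof(float);
    return buffer.Load<float>(byteOffset);
}
```

Both functions should return the same value for i in [0, 31]. The request is for the compiler to get from the first form to the second on its own.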