GPU-AV debugprintf fails when DEVICE_LOCAL | HOST_COHERENT | HOST_VISIBLE heap is full

**Environment:**
 - OS: Windows
 - GPU and driver version: RTX 5080 571.32
 - SDK or header version if building from repo: ff2042e742e676f1c12054695df6e5c0167124f8
 - Options enabled (synchronization, best practices, etc.): VK_VALIDATION_FEATURE_ENABLE_DEBUG_PRINTF_EXT

**Describe the Issue**

When I enable VK_VALIDATION_FEATURE_ENABLE_DEBUG_PRINTF_EXT in our engine, I get the following error from validation, and DebugPrintf then stops working:

```
VK ERROR : VALIDATION - Message Id Number: -1841738615 | Message Id Name: UNASSIGNED-DEBUG-PRINTF
    Validation Error: [ UNASSIGNED-DEBUG-PRINTF ] | MessageID = 0x92394c89 | vkAllocateCommandBuffers(): Internal VMA Error, DebugPrintf is being disabled. Details:
Unable to allocate device memory for internal buffer. VMA statistics = {
  "General": {
    "API": "Vulkan", 
    "apiVersion": "1.4.303", 
    "GPU": "NVIDIA Graphics Device", 
    "deviceType": 2, 
    "maxMemoryAllocationCount": 4294967295, 
    "bufferImageGranularity": 1024, 
    "nonCoherentAtomSize": 64, 
    "memoryHeapCount": 3, 
    "memoryTypeCount": 5
  }, 
  "Total": {
    "BlockCount": 210, 
    "BlockBytes": 208569936, 
    "AllocationCount": 910, 
    "AllocationBytes": 135208048, 
    "UnusedRangeCount": 92, 
    "AllocationSizeMin": 32768, 
    "AllocationSizeMax": 1048576, 
    "UnusedRangeSizeMin": 32768, 
    "UnusedRangeSizeMax": 16024432
  }, 
  "MemoryInfo": {
    "Heap 0": {
      "Flags": ["DEVICE_LOCAL"], 
      "Size": 16660824064, 
      "Budget": {
        "BudgetBytes": 13328659251, 
        "UsageBytes": 0
      }, 
      "Stats": {
        "BlockCount": 0, 
        "BlockBytes": 0, 
        "AllocationCount": 0, 
        "AllocationBytes": 0, 
        "UnusedRangeCount": 0
      }, 
      "MemoryPools": {
        "Type 1": {
          "Flags": ["DEVICE_LOCAL"], 
          "Stats": {
            "BlockCount": 0, 
            "BlockBytes": 0, 
            "AllocationCount": 0, 
            "AllocationBytes": 0, 
            "UnusedRangeCount": 0
          }
        }
      }
    }, 
    "Heap 1": {
      "Flags": [], 
      "Size": 17138475008, 
      "Budget": {
        "BudgetBytes": 13710780006, 
        "UsageBytes": 0
      }, 
      "Stats": {
        "BlockCount": 0, 
        "BlockBytes": 0, 
        "AllocationCount": 0, 
        "AllocationBytes": 0, 
        "UnusedRangeCount": 0
      }, 
      "MemoryPools": {
        "Type 0": {
          "Flags": [], 
          "Stats": {
            "BlockCount": 0,
```

I debugged this, and the problem is that VVL creates a custom VMA pool with preferredFlags set as follows:

```
 error_buffer_alloc_ci.preferredFlags = VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT;
```

Since VVL is using a custom VMA pool, this will attempt to create the pool in DEVICE_LOCAL | HOST_COHERENT | HOST_VISIBLE memory. On some GPUs (such as NV and AMD) this is a very limited resource. For example, on my NV GPU that heap (memoryHeaps[2] below) is only 214 MiB, and it is already fully used (budget is 0):

```
memoryHeaps: count = 3
	memoryHeaps[0]:
		size   = 16660824064 (0x3e1100000) (15.52 GiB)
		budget = 15855517696 (0x3b1100000) (14.77 GiB)
		usage  = 0 (0x00000000) (0.00 B)
		flags: count = 1
			MEMORY_HEAP_DEVICE_LOCAL_BIT
	memoryHeaps[1]:
		size   = 17138475008 (0x3fd886000) (15.96 GiB)
		budget = 16333170688 (0x3cd886800) (15.21 GiB)
		usage  = 0 (0x00000000) (0.00 B)
		flags:
			None
	memoryHeaps[2]:
		size   = 224395264 (0x0d600000) (214.00 MiB)
		budget = 0 (0x00000000) (0.00 B)
		usage  = 224395264 (0x0d600000) (214.00 MiB)
		flags: count = 1
			MEMORY_HEAP_DEVICE_LOCAL_BIT
```
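
To illustrate why the pool ends up in that small heap, here is a minimal sketch of how a required/preferred flag search picks a memory type. It is a simplified model, not VMA's actual code: it only counts missing preferred bits, whereas the real search also considers other inputs. The names `MemoryType`, `ExampleMemoryTypes` and `FindMemoryTypeIndex` are hypothetical, and memory types 2 to 4 in the table are my assumption, since the log above is truncated after types 0 and 1.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Bit values mirror VkMemoryPropertyFlagBits so the sketch builds without Vulkan headers.
constexpr uint32_t kDeviceLocal  = 0x1;  // VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT
constexpr uint32_t kHostVisible  = 0x2;  // VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT
constexpr uint32_t kHostCoherent = 0x4;  // VK_MEMORY_PROPERTY_HOST_COHERENT_BIT
constexpr uint32_t kHostCached   = 0x8;  // VK_MEMORY_PROPERTY_HOST_CACHED_BIT
constexpr uint32_t kNoMemoryType = UINT32_MAX;

struct MemoryType {
    uint32_t property_flags;
    uint32_t heap_index;
};

// Hypothetical layout: 5 memory types over 3 heaps. Types 0 and 1 match the
// log above; types 2-4 are assumed for illustration only.
std::vector<MemoryType> ExampleMemoryTypes() {
    return {
        {0, 1},
        {kDeviceLocal, 0},
        {kHostVisible | kHostCoherent, 1},
        {kHostVisible | kHostCoherent | kHostCached, 1},
        {kDeviceLocal | kHostVisible | kHostCoherent, 2},  // the small heap
    };
}

static int CountBits(uint32_t v) {
    int n = 0;
    for (; v != 0; v &= v - 1) ++n;  // clear the lowest set bit each iteration
    return n;
}

// Returns the allowed type that has every required bit and misses the fewest
// preferred bits. Ties go to the lowest index.
uint32_t FindMemoryTypeIndex(const std::vector<MemoryType>& types, uint32_t required,
                             uint32_t preferred, uint32_t allowed_type_bits) {
    assert(types.size() <= 32);
    uint32_t best_index = kNoMemoryType;
    int best_cost = 33;  // larger than any possible cost
    for (uint32_t i = 0; i < types.size(); ++i) {
        if ((allowed_type_bits & (1u << i)) == 0) continue;
        const uint32_t flags = types[i].property_flags;
        if ((flags & required) != required) continue;
        const int cost = CountBits(preferred & ~flags);
        if (cost < best_cost) {
            best_cost = cost;
            best_index = i;
        }
    }
    return best_index;
}
```

With `required = kHostVisible | kHostCoherent`, passing `preferred = kDeviceLocal` selects type 4 in this assumed layout, which lives in heap 2. Passing `preferred = 0` selects type 2, which lives in the large host heap.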

There are two possible solutions here. One is to just remove that flag so the pool does not land in that heap. The following change fixes it for me:

```
diff --git a/layers/gpuav/core/gpuav_setup.cpp b/layers/gpuav/core/gpuav_setup.cpp
index 9c82b3bef..b9e186140 100644
--- a/layers/gpuav/core/gpuav_setup.cpp
+++ b/layers/gpuav/core/gpuav_setup.cpp
@@ -507,7 +507,7 @@ void Validator::PostCreateDevice(const VkDeviceCreateInfo *pCreateInfo, const Lo
         error_buffer_ci.usage = VK_BUFFER_USAGE_STORAGE_BUFFER_BIT;
         VmaAllocationCreateInfo error_buffer_alloc_ci = {};
         error_buffer_alloc_ci.requiredFlags = VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT;
-        error_buffer_alloc_ci.preferredFlags = VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT;
+        error_buffer_alloc_ci.preferredFlags = 0;
         uint32_t mem_type_index;
         result = vmaFindMemoryTypeIndexForBufferInfo(vma_allocator_, &error_buffer_ci, &error_buffer_alloc_ci, &mem_type_index);
         if (result != VK_SUCCESS) {
```

The other solution would be to not use custom pools; then VMA will fall back on its own. Generally it is not recommended to use custom pools with VMA unless you have a strong reason to. I'm not sure what the reasoning is here.
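
For reference, this is roughly the fallback behavior I mean. It is a simplified sketch of my understanding of the default (non-custom-pool) path, not VMA's actual implementation: when the best-matching memory type cannot satisfy the allocation, that type is masked out and the search is retried. A custom pool is tied to one memory type index, so it has nothing to retry with. All names here (`FindBestType`, `AllocateWithFallback`, `heap_free_bytes`) are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr uint32_t kDeviceLocal  = 0x1;  // VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT
constexpr uint32_t kHostVisible  = 0x2;  // VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT
constexpr uint32_t kHostCoherent = 0x4;  // VK_MEMORY_PROPERTY_HOST_COHERENT_BIT
constexpr uint32_t kHostCached   = 0x8;  // VK_MEMORY_PROPERTY_HOST_CACHED_BIT
constexpr uint32_t kNoMemoryType = UINT32_MAX;

struct MemoryType {
    uint32_t property_flags;
    uint32_t heap_index;
};

static int CountBits(uint32_t v) {
    int n = 0;
    for (; v != 0; v &= v - 1) ++n;
    return n;
}

// Picks the allowed type with all required bits and the fewest missing preferred bits.
static uint32_t FindBestType(const std::vector<MemoryType>& types, uint32_t required,
                             uint32_t preferred, uint32_t allowed_type_bits) {
    uint32_t best_index = kNoMemoryType;
    int best_cost = 33;
    for (uint32_t i = 0; i < types.size(); ++i) {
        if ((allowed_type_bits & (1u << i)) == 0) continue;
        const uint32_t flags = types[i].property_flags;
        if ((flags & required) != required) continue;
        const int cost = CountBits(preferred & ~flags);
        if (cost < best_cost) {
            best_cost = cost;
            best_index = i;
        }
    }
    return best_index;
}

// Tries the best type first. If its heap lacks space, excludes that type and
// retries. Returns the type the allocation landed in, or kNoMemoryType.
uint32_t AllocateWithFallback(const std::vector<MemoryType>& types,
                              std::vector<uint64_t>& heap_free_bytes, uint64_t size,
                              uint32_t required, uint32_t preferred) {
    uint32_t allowed = ~0u;
    for (;;) {
        const uint32_t index = FindBestType(types, required, preferred, allowed);
        if (index == kNoMemoryType) return kNoMemoryType;
        assert(types[index].heap_index < heap_free_bytes.size());
        uint64_t& free_bytes = heap_free_bytes[types[index].heap_index];
        if (free_bytes >= size) {
            free_bytes -= size;  // pretend the allocation succeeded
            return index;
        }
        allowed &= ~(1u << index);  // this type is exhausted; try the next best
    }
}
```

In this model, a full DEVICE_LOCAL | HOST_VISIBLE | HOST_COHERENT heap just means the buffer lands in plain host-visible memory instead of the allocation failing outright.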

GPU-AV debugprintf fails when DEVICE_LOCAL | HOST_COHERENT | HOST_VISIBLE heap is full #9499