Format Version: 2.0
Document Revision: 4
Editor: Mark Callow (Edgewise Consulting)
Copyright 2020-2022 The Khronos Group Inc.
KTX™️ (Khronos TeXture) is an efficient lightweight container file format for reliably distributing GPU textures to diverse platforms and applications. It is distinguished by the simplicity of the loader required to instantiate texture objects from the file contents. The contents of a KTX file can range from a simple base-level 2D texture to a cubemap array texture with mipmaps. KTX files hold all the parameters needed for efficient texture loading into 3D APIs such as OpenGL® and Vulkan®.
Version 2 extends the functionality of version 1 with easier loading of Vulkan textures, easier use by non-OpenGL and non-Vulkan applications, the possibility of streaming by sending small mip levels first, universal textures using Basis Universal technology, and supercompression. Providing this new functionality requires a significantly different file structure from version 1.
KTX 2.0 ratified by the Khronos Board of Promoters Aug 14th, 2020.
Document Revision 1 approved by the 3D Formats WG Dec 7th, 2022.
Document Revision 2 approved by the 3D Formats WG Sep 6th, 2023.
Document Revision 3 approved by the 3D Formats WG Feb 14th, 2024.
Document Revision 4 approved by the 3D Formats WG Feb 19th, 2025.
This document describes the KTX™️ file format version 2.0, hereafter KTX, unless disambiguation is necessary. KTX files are used for storing textures for use with GPU APIs such as OpenGL®, OpenGL ES™️, Vulkan® and WebGL™️.
The canonical version of the specification is available in the Khronos Registry (https://registry.khronos.org/KTX). The source files used to generate the specification are stored in the KTX-Specification Repository (https://github.com/KhronosGroup/KTX-Specification). The source repository has a public issue tracker and allows the submission of pull requests that improve the specification.
KTX files can contain almost any of the wide variety of image formats supported by GPUs. Other specifications wishing to refer to KTX as a container may wish to restrict the range of image formats or other items that can be used. Such referrers must establish a way to identify that given KTX files are compliant with their subsets, such as by adding a metadata item.
The KTX specification is intended for use by both creators and consumers of KTX files, forming a contract between these parties. Specification text may address either party; typically the intended audience can be inferred from context.
Within this specification, the key words must, required, should, recommended, may and optional are to be interpreted as described in Key words for use in RFCs to Indicate Requirement Levels [RFC2119]. In text addressing creators, their use expresses requirements that apply to the files produced. In text addressing consumers, their use expresses requirements that must be followed when, e.g., uploading the textures via a 3D API.
Note
|
Notes are non-normative and give further background information such as rationales. |
Tip
|
Tips are non-normative and give helpful suggestions for implementers or users. |
Important
|
Importants are normative and give directions for implementers. |
Caution
|
Cautions are normative and give restrictions that must be followed. |
Byte[12] identifier
UInt32 vkFormat
UInt32 typeSize
UInt32 pixelWidth
UInt32 pixelHeight
UInt32 pixelDepth
UInt32 layerCount
UInt32 faceCount
UInt32 levelCount
UInt32 supercompressionScheme
// Index (1)
UInt32 dfdByteOffset
UInt32 dfdByteLength
UInt32 kvdByteOffset
UInt32 kvdByteLength
UInt64 sgdByteOffset
UInt64 sgdByteLength
// Level Index (2)
struct {
UInt64 byteOffset
UInt64 byteLength
UInt64 uncompressedByteLength
} levels[max(1, levelCount)]
// Data Format Descriptor (3)
UInt32 dfdTotalSize
continue
dfDescriptorBlock dfdBlock
︙
until dfdTotalSize read
// Key/Value Data (4)
continue
UInt32 keyAndValueByteLength
Byte keyAndValue[keyAndValueByteLength]
align(4) valuePadding (5)
︙
until kvdByteLength read
if (sgdByteLength > 0)
align(8) sgdPadding
// Supercompression Global Data (6)
Byte supercompressionGlobalData[sgdByteLength]
// Mip Level Array (7)
for each mip_level in levelCount (8)
Byte levelImages[bytesOfLevelImages] (9)
end
(1) Required. See Section 3.9, “Index”.
(2) Required. See Section 3.9.7, “Level Index”.
(3) Required. See Section 3.10, “Data Format Descriptor”.
(4) Not required. See Section 3.11, “Key/Value Data”.
(5) align(n) is a pseudo function that inserts the minimum number of 0-filled bytes of padding required to align the following item on an n-byte boundary, where n is the function parameter.
(6) Not required. See Section 3.12, “Supercompression Global Data”.
(7) Required. See Section 3.13, “Mip Level Array”.
(8) Replace with 1 if levelCount is 0.
(9) See the levelImages structure below.
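As an illustration of the layout above, the following is a minimal sketch of reading the fixed-size header, the Index and the Level Index. It assumes all integers are stored little-endian; the function name `read_header` and the dictionary keys are chosen here for illustration only.

```python
import struct

# Identifier, nine UInt32 header fields, then the Index
# (four UInt32 values followed by two UInt64 values).
# Assumption: every integer in the file is little-endian.
HEADER_FORMAT = "<12s9I4I2Q"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 12 + 36 + 16 + 16 bytes

# One Level Index entry: byteOffset, byteLength, uncompressedByteLength.
LEVEL_ENTRY = struct.Struct("<3Q")

FIELD_NAMES = (
    "identifier", "vkFormat", "typeSize", "pixelWidth", "pixelHeight",
    "pixelDepth", "layerCount", "faceCount", "levelCount",
    "supercompressionScheme", "dfdByteOffset", "dfdByteLength",
    "kvdByteOffset", "kvdByteLength", "sgdByteOffset", "sgdByteLength",
)


def read_header(data: bytes) -> dict:
    """Unpack the header and Index, then the Level Index that follows."""
    header = dict(zip(FIELD_NAMES, struct.unpack_from(HEADER_FORMAT, data, 0)))
    # The Level Index always has at least one entry, even if levelCount is 0.
    entry_count = max(1, header["levelCount"])
    header["levels"] = [
        LEVEL_ENTRY.unpack_from(data, HEADER_SIZE + i * LEVEL_ENTRY.size)
        for i in range(entry_count)
    ]
    return header
```

The sketch does no validation; a real loader would also check the identifier and that every offset and length lies inside the file.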
After inflation from supercompression or when supercompressionScheme == 0, levelImages looks like the following:
Note
|
Mip levels are supercompressed independently so do not contain mipPadding. Applications inflating levels may choose to restore the alignment caused by mipPadding.
|
align( lcm(texel_block_size, 4) ) mipPadding (1) (2)
for each layer in max(1, layerCount)
for each face in faceCount
for each z_slice_of_blocks in num_blocks_z (3)
for each row_of_blocks in num_blocks_y (3)
for each block in num_blocks_x (3)
Byte data[format_specific_number_of_bytes] (4)
end
end
end
end
end
- \(\operatorname{lcm}\) is least common multiple.
- See the definitions below.
- Rows of uncompressed texture images must be tightly packed, equivalent to a GL_UNPACK_ALIGNMENT of 1.
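The align pseudo function and the mipPadding rule can be sketched as follows. The helper names `align_padding` and `mip_padding` are hypothetical; `offset` is the byte position in the file at which the padded item would otherwise start.

```python
import math


def align_padding(offset: int, n: int) -> int:
    """Number of zero bytes align(n) inserts at byte position `offset`."""
    return (-offset) % n


def mip_padding(offset: int, texel_block_size: int) -> int:
    """Padding in front of a level: align(lcm(texel_block_size, 4))."""
    return align_padding(offset, math.lcm(texel_block_size, 4))
```

For a 6-byte texel block the alignment is lcm(6, 4) = 12, so a level that would start at byte 100 is preceded by 8 bytes of padding and starts at byte 108.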
where \(p\) is the level index (see Section 3.7, “levelCount”) and \(\textit{block_depth}\), \(\textit{block_height}\) and \(\textit{block_width}\) are \(1\) for uncompressed formats and the block size in that dimension for block compressed formats as given in the format’s section of the Khronos Data Format specification [KDF14].
A block is a single pixel for uncompressed formats and \(\textit{block_width} \times \textit{block_height} \times \textit{block_depth}\) pixels for block compressed formats.
For formats whose Vulkan names have _422_, \(\textit{block_depth}\) and \(\textit{block_height}\) are \(1\), and \(\textit{block_width}\) is \(2\).
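The block counts used in the loops of the levelImages structure can be sketched as below. This is an assumption-laden illustration rather than the normative definition: it assumes the conventional mip rule, in which the extent of level \(p\) is the base extent halved \(p\) times (rounding down, never below 1), and it rounds up to a whole number of blocks. The function name `num_blocks` is hypothetical.

```python
def num_blocks(pixel_size: int, block_size: int, p: int) -> int:
    """Blocks needed along one axis of level p (assumed conventional rule)."""
    extent = max(1, pixel_size >> p)  # extent of level p in pixels
    return -(-extent // block_size)   # ceiling division to whole blocks
```

With this rule a dimension stored as 0 (for example pixelHeight of a 1D texture) still yields one block along that axis.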
The file identifier is a unique set of bytes that will differentiate the file from other types of files. It consists of 12 bytes, as follows:
Byte[12] FileIdentifier = {
0xAB, 0x4B, 0x54, 0x58, 0x20, 0x32, 0x30, 0xBB, 0x0D, 0x0A, 0x1A, 0x0A
}
This can also be expressed using C-style character definitions as:
Byte[12] FileIdentifier = {
'«', 'K', 'T', 'X', ' ', '2', '0', '»', '\r', '\n', '\x1A', '\n'
}
The rationale behind the choice of values in the identifier is based on the rationale for the identifier in the PNG specification. This identifier both identifies the file as a KTX version 2 file and provides for immediate detection of common file-transfer problems.
- Byte [0] is chosen as a non-ASCII value to reduce the probability that a text file may be misrecognized as a KTX file.
- Byte [0] also catches bad file transfers that clear bit 7.
- Bytes [1..6] identify the format, and are the ASCII values for the string “KTX 20”.
- Byte [7] is for aesthetic balance with byte [0] (they are a matching pair of double-angle quotation marks).
- Bytes [8..9] form a CR-LF sequence which catches bad file transfers that alter newline sequences.
- Byte [10] is a control-Z character, which stops file display under MS-DOS, and further reduces the chance that a text file will be falsely recognized.
- Byte [11] is a final line feed, which checks for the inverse of the CR-LF translation problem.
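A loader can use the identifier both to recognize a file and to guess at common transfer damage. The following is a minimal sketch; the function name `diagnose_identifier` and the returned strings are illustrative only, and the two damage checks are simple heuristics based on the rationale listed above.

```python
KTX2_IDENTIFIER = bytes([
    0xAB, 0x4B, 0x54, 0x58, 0x20, 0x32, 0x30, 0xBB, 0x0D, 0x0A, 0x1A, 0x0A,
])


def diagnose_identifier(first_bytes: bytes) -> str:
    """Classify the first 12 bytes of a candidate file."""
    head = first_bytes[:12]
    if head == KTX2_IDENTIFIER:
        return "ok"
    # A transfer that cleared bit 7 turns 0xAB and 0xBB into 0x2B and 0x3B.
    if head[:8] == bytes(b & 0x7F for b in KTX2_IDENTIFIER[:8]):
        return "bit 7 cleared in transfer"
    # The format string survived but the CR-LF pair did not.
    if head[1:7] == b"KTX 20" and head[8:10] != b"\r\n":
        return "newline translation damaged the file"
    return "not a KTX 2.0 file"
```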
vkFormat specifies the image format using Vulkan VkFormat enum values. It can be any value defined in the Vulkan specification [VULKAN], future core versions or registered Vulkan extensions, except for values listed in Table 1, “Prohibited Formats” and any *SCALED* or *[2-9]PLANE* formats added in future. Values defined by the latest core Vulkan specification and registered extensions are given in Formats on [VULKAN-DOCS].
Use of the value VK_FORMAT_UNDEFINED (0) is only permissible when the format of the data is not a recognized Vulkan format, such as in the case of the universal texture formats. In this case information about the format must be provided by the Data Format Descriptor and, in cases where the format is known to another GPU API, the KTX writer must include one or more of the metadata items described in Section 5.3, “Format Mapping”. Some permissible uses are outlined within this specification and summarized in Section 4.2, “Use of VK_FORMAT_UNDEFINED”.
The table in Appendix B, Mapping of vkFormat values gives the mapping for all VkFormat enum values in Formats at the time of writing, to the equivalent OpenGL format (internal format, format and type values), DXGI_FORMAT and MTLPixelFormat. Applications must use these mappings. If Appendix B, Mapping of vkFormat values does not have an entry for the value of vkFormat and a mapping to one or more of the other APIs exists then, even if the value is not VK_FORMAT_UNDEFINED, the KTX writer must provide that mapping using one or more of the metadata items described in Section 5.3, “Format Mapping”.
Tip
|
Before loading any image, Vulkan loaders should confirm via
Vulkan applications using a core Vulkan format whose name has the
Vulkan applications handling textures whose formats are not known at
|
Note
|
Packed A8B8G8R8 Formats
The |
Format Name | Value |
---|---|
VK_FORMAT_R8_USCALED | 11 |
VK_FORMAT_R8_SSCALED | 12 |
VK_FORMAT_R8G8_USCALED | 18 |
VK_FORMAT_R8G8_SSCALED | 19 |
VK_FORMAT_R8G8B8_USCALED | 25 |
VK_FORMAT_R8G8B8_SSCALED | 26 |
VK_FORMAT_B8G8R8_USCALED | 32 |
VK_FORMAT_B8G8R8_SSCALED | 33 |
VK_FORMAT_R8G8B8A8_USCALED | 39 |
VK_FORMAT_R8G8B8A8_SSCALED | 40 |
VK_FORMAT_B8G8R8A8_USCALED | 46 |
VK_FORMAT_B8G8R8A8_SSCALED | 47 |
VK_FORMAT_A8B8G8R8_USCALED_PACK32 | 53 |
VK_FORMAT_A8B8G8R8_SSCALED_PACK32 | 54 |
VK_FORMAT_A2R10G10B10_USCALED_PACK32 | 60 |
VK_FORMAT_A2R10G10B10_SSCALED_PACK32 | 61 |
VK_FORMAT_A2B10G10R10_USCALED_PACK32 | 66 |
VK_FORMAT_A2B10G10R10_SSCALED_PACK32 | 67 |
VK_FORMAT_R16_USCALED | 72 |
VK_FORMAT_R16_SSCALED | 73 |
VK_FORMAT_R16G16_USCALED | 79 |
VK_FORMAT_R16G16_SSCALED | 80 |
VK_FORMAT_R16G16B16_USCALED | 86 |
VK_FORMAT_R16G16B16_SSCALED | 87 |
VK_FORMAT_R16G16B16A16_USCALED | 93 |
VK_FORMAT_R16G16B16A16_SSCALED | 94 |
VK_FORMAT_G8_B8_R8_3PLANE_420_UNORM | 1000156002 |
VK_FORMAT_G8_B8R8_2PLANE_420_UNORM | 1000156003 |
VK_FORMAT_G8_B8_R8_3PLANE_422_UNORM | 1000156004 |
VK_FORMAT_G8_B8R8_2PLANE_422_UNORM | 1000156005 |
VK_FORMAT_G8_B8_R8_3PLANE_444_UNORM | 1000156006 |
VK_FORMAT_G10X6_B10X6_R10X6_3PLANE_420_UNORM_3PACK16 | 1000156012 |
VK_FORMAT_G10X6_B10X6R10X6_2PLANE_420_UNORM_3PACK16 | 1000156013 |
VK_FORMAT_G10X6_B10X6_R10X6_3PLANE_422_UNORM_3PACK16 | 1000156014 |
VK_FORMAT_G10X6_B10X6R10X6_2PLANE_422_UNORM_3PACK16 | 1000156015 |
VK_FORMAT_G10X6_B10X6_R10X6_3PLANE_444_UNORM_3PACK16 | 1000156016 |
VK_FORMAT_G12X4_B12X4_R12X4_3PLANE_420_UNORM_3PACK16 | 1000156022 |
VK_FORMAT_G12X4_B12X4R12X4_2PLANE_420_UNORM_3PACK16 | 1000156023 |
VK_FORMAT_G12X4_B12X4_R12X4_3PLANE_422_UNORM_3PACK16 | 1000156024 |
VK_FORMAT_G12X4_B12X4R12X4_2PLANE_422_UNORM_3PACK16 | 1000156025 |
VK_FORMAT_G12X4_B12X4_R12X4_3PLANE_444_UNORM_3PACK16 | 1000156026 |
VK_FORMAT_G16_B16_R16_3PLANE_420_UNORM | 1000156029 |
VK_FORMAT_G16_B16R16_2PLANE_420_UNORM | 1000156030 |
VK_FORMAT_G16_B16_R16_3PLANE_422_UNORM | 1000156031 |
VK_FORMAT_G16_B16R16_2PLANE_422_UNORM | 1000156032 |
VK_FORMAT_G16_B16_R16_3PLANE_444_UNORM | 1000156033 |
VK_FORMAT_G8_B8R8_2PLANE_444_UNORM | 1000330000 |
VK_FORMAT_G10X6_B10X6R10X6_2PLANE_444_UNORM_3PACK16 | 1000330001 |
VK_FORMAT_G12X4_B12X4R12X4_2PLANE_444_UNORM_3PACK16 | 1000330002 |
VK_FORMAT_G16_B16R16_2PLANE_444_UNORM | 1000330003 |
Note
|
Rationale
The *SCALED* formats are prohibited because they are intended for
vertex data, very few, if any, implementations support using them
for texturing and a Data Format Descriptor cannot distinguish
these from The *[2-9]PLANE* formats are prohibited because multiplanar formats are not supported. |
Caution
|
Legacy Formats
The legacy OpenGL & OpenGL ES formats specified by the following extensions, do not have equivalent Vulkan formats and are not supported.
Only a few of these formats can be described without an extended
Data Format Descriptor so This is felt to be an acceptable trade-off for simplifying this specification as the formats are not in wide use and applications needing them can use KTX version 1. |
Despite Vulkan requiring separate uploads of depth and stencil components, combined depth/stencil pixel formats can be used with KTX.
Note
|
Rationale
Other GPU APIs support combined uploads and given KTX data alignment it’s trivial to upload components separately in Vulkan. |
VK_FORMAT_D16_UNORM_S8_UINT is defined as two 16-bit words per texel. The first word contains the D16 value. The second word contains the S8 value in the eight LSBs and zeros in the eight MSBs.
VK_FORMAT_D24_UNORM_S8_UINT is defined as one 32-bit word per texel with the S8 value in the eight LSBs of the word and the D24 value in the MSBs.
Tip
|
This layout matches OpenGL’s |
VK_FORMAT_X8_D24_UNORM_PACK32 is defined as one 32-bit word per texel with the D24 value in the LSBs of the word and zeros in the eight MSBs.
VK_FORMAT_D32_SFLOAT_S8_UINT is defined as two 32-bit words per texel. The first word contains the floating-point D32 value. The second word contains the S8 value in the eight LSBs and zeros in the MSBs.
VK_FORMAT_S8_UINT, VK_FORMAT_D16_UNORM and VK_FORMAT_D32_SFLOAT are defined as in Formats on [VULKAN-DOCS].
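The combined depth/stencil layouts described above can be illustrated by packing a single texel. This sketch writes each word little-endian, consistent with the little-endian image data described under typeSize; the function names are hypothetical.

```python
import struct


def pack_d16_s8(depth16: int, stencil8: int) -> bytes:
    """Two 16-bit words: D16 first, then S8 in the low byte of the second."""
    return struct.pack("<HH", depth16 & 0xFFFF, stencil8 & 0xFF)


def pack_d24_s8(depth24: int, stencil8: int) -> bytes:
    """One 32-bit word: S8 in the eight LSBs, D24 in the 24 MSBs."""
    return struct.pack("<I", ((depth24 & 0xFFFFFF) << 8) | (stencil8 & 0xFF))


def pack_x8_d24(depth24: int) -> bytes:
    """One 32-bit word: D24 in the LSBs, zeros in the eight MSBs."""
    return struct.pack("<I", depth24 & 0xFFFFFF)


def pack_d32f_s8(depth: float, stencil8: int) -> bytes:
    """Two 32-bit words: float D32, then S8 in the eight LSBs of the second."""
    return struct.pack("<fI", depth, stencil8 & 0xFF)
```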
typeSize specifies the size of the data type in bytes used to upload the data to a graphics API. When typeSize is greater than 1, software on big-endian systems must endian convert all image data since it is little-endian. When vkFormat is VK_FORMAT_UNDEFINED, typeSize must equal \(1\). For formats whose Vulkan names have the suffix _BLOCK it must equal \(1\). For formats with the suffix _PACKxx or _nPACKxx it must equal the value of \(\texttt{xx} / 8\). For unpacked formats, except combined depth/stencil formats, it must equal the number of bytes needed for a single component, which can be derived from the format name. For example, for VK_FORMAT_R16G16B16_UNORM it will be \(16 / 8 = 2\). This means it will equal \(1\) for any format with 8-bit components. For VK_FORMAT_D16_UNORM_S8_UINT, using the layout defined in this specification, the value will be \(2\) and for the other combined depth/stencil formats the value will be \(4\).
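The typeSize rules can be sketched as a function of the Vulkan format name. This is an illustration only: the function name `type_size` is hypothetical, and it assumes names written exactly as in the Vulkan headers without a vendor suffix such as _EXT.

```python
import re


def type_size(vk_format_name: str) -> int:
    """Derive typeSize from a VkFormat name using the rules above."""
    name = vk_format_name.removeprefix("VK_FORMAT_")
    if name == "UNDEFINED" or name.endswith("_BLOCK"):
        return 1
    # _PACKxx and _nPACKxx: the size of the packed word in bytes.
    packed = re.search(r"_\d*PACK(\d+)$", name)
    if packed:
        return int(packed.group(1)) // 8
    if name == "D16_UNORM_S8_UINT":
        return 2
    if re.fullmatch(r"D\d+_\w+_S8_UINT", name):
        return 4  # the other combined depth/stencil formats
    # Unpacked formats: bytes needed for a single component.
    first_component = re.match(r"[A-Z](\d+)", name)
    return int(first_component.group(1)) // 8
```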
Note
|
Rationale
Although |
The size of the texture image for level 0, in pixels.
These properties combined with faceCount and layerCount determine the type of the texture as understood by graphics APIs. See Section 4.1, “Texture Type” for more details.
pixelWidth must not be 0.
If faceCount is equal to 6, pixelHeight must be equal to pixelWidth, and pixelDepth must be 0.
pixelHeight must not be 0 for block-compressed formats, including BasisLZ/ETC1S and UASTC.
pixelDepth must not be 0 for block-compressed formats that have block depth greater than 1.
pixelDepth must be 0 for depth or stencil formats.
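The dimension rules above lend themselves to a simple validator. The following sketch assumes the caller already knows whether the format is block-compressed, its block depth, and whether it is a depth or stencil format; the function name, parameter names and messages are illustrative only.

```python
def check_dimensions(pixel_width, pixel_height, pixel_depth, face_count,
                     block_compressed=False, block_depth=1,
                     depth_or_stencil=False):
    """Return a list of violated rules; an empty list means the values pass."""
    problems = []
    if pixel_width == 0:
        problems.append("pixelWidth must not be 0")
    if face_count == 6:
        if pixel_height != pixel_width:
            problems.append("pixelHeight must equal pixelWidth for cubemaps")
        if pixel_depth != 0:
            problems.append("pixelDepth must be 0 for cubemaps")
    if block_compressed and pixel_height == 0:
        problems.append("pixelHeight must not be 0 for block-compressed formats")
    if block_compressed and block_depth > 1 and pixel_depth == 0:
        problems.append("pixelDepth must not be 0 when block depth exceeds 1")
    if depth_or_stencil and pixel_depth != 0:
        problems.append("pixelDepth must be 0 for depth or stencil formats")
    return problems
```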
Tip
|
While the KTX format does not impose any image size restrictions, beyond those above, producers of KTX files need to be aware that some APIs and formats have specific requirements including, but not limited to, the following:
|
layerCount specifies the number of array elements. If the texture is not an array texture, layerCount must equal 0.
Although current graphics APIs do not support 3D array textures, KTX files can be used to store them.
Refer to Section 4.1, “Texture Type” for more details about valid values.
faceCount specifies the number of cubemap faces. For cubemaps and cubemap arrays this must be 6. For non cubemaps this must be 1. Cubemap faces are stored in the order: +X, -X, +Y, -Y, +Z, -Z in a left-handed coordinate system with +Y up and, with the +Z face forward, +X on the right. All faces must have the same orientation, which must be rd (top-left origin); this orientation is assumed in the absence of Section 5.2, “KTXorientation” metadata. See Appendix A, Cubemap Orientation for details.
Applications wanting to store incomplete cubemaps should flatten faces into a 2D array and use the metadata described in Section 5.1, “KTXcubemapIncomplete” to signal which faces are present.
levelCount specifies the number of levels in the Mip Level Array and, by extension, the number of indices in the Level Index array. A KTX file does not need to contain a complete mipmap pyramid. Mip level data is ordered from the level with the smallest size images, \(\textit{level}_p\), to that with the largest size images, \(\textit{level}_\textit{base}\), where \(p = \texttt{levelCount} - 1\) and \(\textit{base} = 0\). \(\textit{level}_p\) must not be greater than the maximum possible, \(\textit{level}_{M}\), where
\(M = \lfloor \log_2(\max(\texttt{pixelWidth}, \texttt{pixelHeight}, \texttt{pixelDepth})) \rfloor\).
\(\texttt{levelCount} = 1\) means that a file contains only the base level and the texture isn’t meant to have other levels. E.g., this could be a LUT rather than a natural image.
\(\texttt{levelCount} = 0\) is allowed, except for block-compressed formats, and means that a file contains only the base level and consumers, particularly loaders, should generate other levels if needed.
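The levelCount rules can be sketched as below. The sketch assumes the conventional definition of a complete mipmap pyramid, in which the highest level index is the floor of the base-2 logarithm of the largest dimension; the function names are hypothetical.

```python
def max_level_count(pixel_width: int, pixel_height: int, pixel_depth: int) -> int:
    """Levels in a complete pyramid: floor(log2(largest dimension)) + 1."""
    largest = max(pixel_width, pixel_height, pixel_depth, 1)
    return largest.bit_length()


def stored_level_count(level_count: int) -> int:
    """Levels actually present in the file; 0 means base level only."""
    return max(1, level_count)


def level_count_is_valid(level_count, pixel_width, pixel_height, pixel_depth,
                         block_compressed=False) -> bool:
    """levelCount = 0 is not allowed for block-compressed formats."""
    if level_count == 0:
        return not block_compressed
    return level_count <= max_level_count(pixel_width, pixel_height, pixel_depth)
```

A consumer that sees levelCount = 0 would read one level and, if it needs more, generate up to `max_level_count(...)` levels itself.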
supercompressionScheme indicates if a supercompression scheme has been applied to the data in levelImages. It must be one of the unreserved values from Table 2, “Supercompression Schemes” or Table 3, “Vendor Supercompression Schemes”. A value of 0 indicates no supercompression.
Scheme Id | Scheme Name | Level Data Format | Global Data Format |
---|---|---|---|
0 | None | n/a | n/a |
1 | BasisLZ | | |
2 | Zstandard | | n/a |
3 | ZLIB | | n/a |
4・・・0xffff | Reserved (1) | | |
0x10000・・・0x1ffff | Reserved (2) | | |
0x20000・・・0xffffffff | Reserved (3) | | |
(1) Reserved for KTX use.
(2) Reserved for vendor compression schemes. See Table 3, “Vendor Supercompression Schemes”.
(3) Reserved. Do not use.
The supercompression scheme is applied independently to each mip level to permit streaming and random access to the levels. The format of the data in levelImages for a scheme is specified in the reference given in the Level Data Format column of Table 2, “Supercompression Schemes”.
Schemes that require data global to all levels can store it as described in Section 3.12.1, “supercompressionGlobalData”. Currently only BasisLZ uses global data. The format of the global data for a scheme is specified in the reference given in the Global Data Format column of Table 2, “Supercompression Schemes”.
When a supercompression scheme is used, the image data must be inflated from the scheme prior to GPU sampling.
Tip
|
LZ-style lossless supercompression, e.g., scheme 2, is generally ineffective on the block-compressed data of GPU texture formats. It is best reserved for use with uncompressed texture formats or with block-compressed data that has been specially conditioned for LZ compression such as by Rate-distortion Optimization [RDO]. BasisLZ internally uses a universal block-compressed texture format and Rate-distortion Optimization. Encoding to the RDO-conditioned internal format is combined with supercompression. Therefore it is applicable only to uncompressed images. |
Scheme Id | Scheme Name | Token | Author | Contact | Level Data Format | Global Data Format |
---|---|---|---|---|---|---|
0x10000 | Asobo | KTX_SS_PROPRIETARY_ASOBO | Asobo Studio | Julien Vernay | Proprietary | Required |
0x10001・・・0x1ffff | Reserved (2) | | | | | |
(1) For information on registering schemes see Section 4.3.4.2, “Supercompression Schemes”. Readers and writers may, but are not required to, support these schemes.
(2) Reserved for schemes yet to come.
- [ETC1S Slice Decoding] describes the bitstream for a single image (slice).
- ETC1S slice locations within a mip level are defined exclusively by the corresponding ImageDesc structures from the [basislz_global_data_structure]. The same slice data may be used by multiple ImageDesc structures within the mip level.
- An image bitstream refers to the endpoint and selector codebooks described in [basislz_gd].
- vkFormat must be VK_FORMAT_UNDEFINED (0x00). The Data Format Descriptor must retain the pre-deflation color space information and indicate which color and alpha components are present. See Section 3.10.3.1, “DFD for Supercompressed Data”.
- levels[p].uncompressedByteLength must be 0.
Note
|
Rationale
The BasisLZ encoder combines encoding to a universal format with deflation. The transcoder combines inflation back to the universal format with transcoding to one of the many GPU-specific block compressed formats. There is therefore no visible common pre- and post-supercompression format. The effective uncompressed byte length is dependent on which transcode target format is selected. |
- After inflation, the level data follows the uncompressed layout as specified in the levelImages structure.
- Only Zstandard frames are required. Inflators may skip Skippable frames.
- Checksums are optional. If a checksum is present, inflators should verify it.
- vkFormat must retain the pre-deflation value. The Data Format Descriptor must retain pre-deflation color space information and indicate which components are present. See Section 3.10.3.1, “DFD for Supercompressed Data”.
- After inflation, the level data follows the uncompressed layout as specified in the levelImages structure.
- With Deflate [RFC1951] compression scheme.
- vkFormat must retain the pre-deflation value. The Data Format Descriptor must retain pre-deflation color space information and indicate which components are present. See Section 3.10.3.1, “DFD for Supercompressed Data”.
- vkFormat must retain the pre-deflation value. The Data Format Descriptor must retain pre-deflation color space information and indicate which components are present. See Section 3.10.3.1, “DFD for Supercompressed Data”.
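For the ZLIB scheme, inflating one mip level can be sketched with Python's standard `zlib` module. The function name `inflate_zlib_level` is hypothetical; its offset and length arguments correspond to one entry of the Level Index, and the length check against uncompressedByteLength is a sanity check chosen for this sketch.

```python
import zlib


def inflate_zlib_level(file_bytes: bytes, byte_offset: int, byte_length: int,
                       uncompressed_byte_length: int) -> bytes:
    """Inflate one supercompressed level and check its inflated size."""
    payload = file_bytes[byte_offset:byte_offset + byte_length]
    data = zlib.decompress(payload)
    if len(data) != uncompressed_byte_length:
        raise ValueError("inflated size does not match uncompressedByteLength")
    return data
```

Because each level is supercompressed independently, a loader can call this for only the levels it needs, in any order.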
An index giving the byte offsets from the start of the file and byte sizes of the various sections of the KTX file.
The offset from the start of the file of the dfdTotalSize field of the Data Format Descriptor.
The total number of bytes in the Data Format Descriptor including the dfdTotalSize field. dfdByteLength must equal dfdTotalSize.
Note
|
This field is not necessary. Since no padding is needed for DFDs
the value is easily calculated from the offsets. However, if it is
removed, we would need 4 bytes of padding instead for proper alignment
of |
An arbitrary number of key/value pairs may follow the Index. These can be used to encode any arbitrary data. The kvdByteOffset field gives the offset of this data, i.e. that of the first key/value pair, from the start of the file. The value must be 0 when kvdByteLength = 0.
The total number of bytes of key/value data including all keyAndValueByteLength fields, all keyAndValue fields and all valuePadding fields.
The offset from the start of the file of supercompressionGlobalData. The value must be 0 when sgdByteLength = 0.
The number of bytes of supercompressionGlobalData. For supercompression schemes for which no reference is provided in the Global Data Format column of Table 2, “Supercompression Schemes”, the value must be 0.
An array, levels, giving the offset from the start of the file and compressed and uncompressed byte sizes of the image data for each mip level within the Mip Level Array. The array is ordered starting with \(\textit{level}_\textit{base}\) (the level with the largest size images) at index \(0\). The image for \(\textit{level}_p\) will be found at index \(p\).
The offset from the start of the file of the first byte of image data for mip level p. It is the offset of the first byte after any mipPadding.
The total size of the data for supercompressed mip level p.
levels[p].byteLength is the number of bytes of pixel data in LOD \(\textit{level}_p\). This includes all layers, all z slices, all faces, all rows (or rows of blocks) and all pixels (or blocks) in each row for the mip level.
The total size of the image data from \(\texttt{levels}[\textit{num_levels}-1]\texttt{.byteOffset}\) (i.e., after the first mipPadding, if any) until the end of the file is:
where
and
\(\textit{texel_block_size}\) is defined in Section 3.13.2, “mipPadding”.
levels[p].uncompressedByteLength is the number of bytes of pixel data in LOD \(\textit{level}_p\) after inflation from supercompression. This includes all layers, all z slices, all faces, all rows (or rows of blocks) and all pixels (or blocks) in each row for the mip level. When supercompressionScheme == 0, levels[p].byteLength must have the same value as this. When supercompressionScheme == 1, BasisLZ, the value must be 0.
The value of a level’s uncompressedByteLength must satisfy the following condition:
uncompressedByteLength % (faceCount * max(1, layerCount)) == 0
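The constraints on a Level Index entry can be collected into one check. This is a minimal sketch; the function name `check_level_entry` and its messages are illustrative only.

```python
def check_level_entry(scheme, byte_length, uncompressed_byte_length,
                      face_count, layer_count):
    """Return a list of violated rules for one Level Index entry."""
    problems = []
    image_count = face_count * max(1, layer_count)
    if uncompressed_byte_length % image_count != 0:
        problems.append("uncompressedByteLength is not a multiple of "
                        "faceCount * max(1, layerCount)")
    if scheme == 0 and byte_length != uncompressed_byte_length:
        problems.append("byteLength must equal uncompressedByteLength "
                        "when there is no supercompression")
    if scheme == 1 and uncompressed_byte_length != 0:
        problems.append("uncompressedByteLength must be 0 for BasisLZ")
    return problems
```

The divisibility condition holds trivially for BasisLZ, where the value is 0.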
Tip
|
Writers should be aware that block-compressed formats require the byte length of encoded levels be a multiple of the block size, i.e. the data is always a whole number of blocks regardless of the size in texels. The PVRTC1 format has extra restrictions. See Chapter 24 PVRTC Compressed Texture Image Formats in [KDF14]. In versions of OpenGL < 4.5 and in OpenGL ES, faces of non-array
cubemap textures (any texture where |
The Data Format Descriptor (dfDescriptor) describes the layout of the texel blocks in data. The full specification for this is found in Chapters 2 to 11 of the Khronos Data Format Specification [KDF14].
The dfDescriptor is partially expanded in this specification in order to provide sufficient information for a KTX file to be parsed without having to refer to [KDF14]. It consists of a total size field and one or more Descriptor Blocks (dfDescriptorBlock) described below.
Note
|
Rationale
A dfDescriptor is useful in the following cases:
|
The following restrictions must be obeyed when setting the fields of a dfDescriptorBlock.
- If vkFormat is not VK_FORMAT_UNDEFINED, the DFD’s texelBlockDimension*, bytesPlane* and sample information fields must match the format’s definition. The colorModel must be KHR_DF_MODEL_RGBSDA, KHR_DF_MODEL_YUVSDA or the matching block compressed color model listed in [KDF14] Section 5.6 or its successors, currently KHR_DF_MODEL_BC1A to KHR_DF_MODEL_UASTC. KHR_DF_MODEL_YUVSDA should be used for all non-prohibited *_422_* formats.
- If vkFormat is one of the *_SRGB{,_*} formats, transferFunction must be KHR_DF_TRANSFER_SRGB.
- If vkFormat is not one of the *_SRGB{,_*} formats and an sRGB variant of that format exists, transferFunction should not be KHR_DF_TRANSFER_SRGB.
- If formats for other transfer functions are added to GPU APIs in the future, similar restrictions to those just described apply. For example, if formats for the HLG transfer function which have the suffix _HLG are added then
- If vkFormat is one of the *_[SU]INT{,_*} formats or one of the depth, stencil, or combined depth/stencil formats, colorPrimaries must be KHR_DF_PRIMARIES_UNSPECIFIED and transferFunction must be KHR_DF_TRANSFER_UNSPECIFIED.
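The mandatory ("must") restrictions on transferFunction and colorPrimaries can be sketched as a name-based check. Enumerators are passed as their names rather than numeric values; the function name and the regular expressions used to classify format names are assumptions made for this illustration, and the "should not" rule for formats with sRGB variants is not covered.

```python
import re


def check_color_fields(vk_format_name: str, transfer_function: str,
                       color_primaries: str) -> list:
    """Return violations of the 'must' rules for transfer and primaries."""
    problems = []
    is_srgb = re.search(r"_SRGB(_|$)", vk_format_name) is not None
    is_integer = re.search(r"_[SU]INT(_|$)", vk_format_name) is not None
    # Depth and stencil names start with D<bits>_, S<bits>_ or X8_D<bits>_.
    is_depth_stencil = re.match(r"VK_FORMAT_(X8_)?[DS]\d+_", vk_format_name) is not None

    if is_srgb and transfer_function != "KHR_DF_TRANSFER_SRGB":
        problems.append("sRGB formats must use KHR_DF_TRANSFER_SRGB")
    if is_integer or is_depth_stencil:
        if color_primaries != "KHR_DF_PRIMARIES_UNSPECIFIED":
            problems.append("colorPrimaries must be KHR_DF_PRIMARIES_UNSPECIFIED")
        if transfer_function != "KHR_DF_TRANSFER_UNSPECIFIED":
            problems.append("transferFunction must be KHR_DF_TRANSFER_UNSPECIFIED")
    return problems
```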
Note
|
For example, On the other hand, |
Note
|
The |
Note
|
When |
Note
|
Except for the formats for which it is specified above, Still, |
Tip
|
The specification allows free choice of |
The majority of color spaces listed in Chapter 13 Transfer functions of [KDF14], with corresponding enumerators given in Section 5.8 transferFunction, define only an electro-optical transfer function (EOTF) or only an opto-electrical transfer function (OETF) as indicated by aliases for the enumerator value. When a KTX file uses an enumerator with an EOTF alias:
- applications should use the EOTF to decode prior to sampling and filtering;
- mipmap generation should use the path EOTF → scaling → \(\text{EOTF}^{-1}\).
The sRGB color space is one of these, defining only an EOTF.
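For sRGB, the path of decoding with the EOTF, scaling, then re-encoding with the inverse EOTF can be sketched for a single channel as follows. The constants are those of the standard sRGB transfer function; the function names and the plain box filter used for scaling are choices made for this illustration.

```python
def srgb_eotf(v: float) -> float:
    """sRGB EOTF: encoded value in [0, 1] to linear light."""
    return v / 12.92 if v <= 0.04045 else ((v + 0.055) / 1.055) ** 2.4


def srgb_eotf_inverse(linear: float) -> float:
    """Inverse sRGB EOTF: linear light in [0, 1] to encoded value."""
    if linear <= 0.0031308:
        return linear * 12.92
    return 1.055 * linear ** (1 / 2.4) - 0.055


def downsample_texels(encoded_texels) -> float:
    """Average texels of one channel in linear light, then re-encode."""
    linear = [srgb_eotf(v) for v in encoded_texels]
    return srgb_eotf_inverse(sum(linear) / len(linear))
```

Averaging two black and two white texels this way gives an encoded value of roughly 0.735 rather than 0.5, which is the visible difference between filtering in linear light and filtering the encoded values directly.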
When a KTX file uses an enumerator with an OETF alias:
- applications should use \(\text{OETF}^{-1}\) to decode prior to sampling and filtering;
- mipmap generation should use the path \(\text{OETF}^{-1}\) → scaling → OETF.
Some color spaces have a non-linear opto-optical transfer function (OOTF). That is, they define an OETF and an EOTF and the EOTF is not the inverse of the OETF. These have two separate enumerators, one with an OETF suffix, the other with an EOTF suffix. When a KTX file has the enumerator with an explicit _OETF suffix:
- the images are scene referred and intended to be decoded to scene-linear values;
- applications should use \(\text{OETF}^{-1}\) to decode prior to sampling and filtering;
- mipmap generation should use the path \(\text{OETF}^{-1}\) → scaling → OETF.
When a KTX file has the enumerator with an explicit _EOTF suffix:
- the images are display referred and intended to be decoded to display-linear values;
- applications should use the EOTF (often derived from the OOTF) to decode prior to sampling and filtering;
- mipmap generation should use the path EOTF → scaling → \(\text{EOTF}^{-1}\).
OETF and EOTF formulae are given in Chapter 13 Transfer functions in [KDF14].
There are several cases where the dfDescriptorBlock is used to
provide information beyond that given by vkFormat
.
- Premultiplied Alpha
KHR_DF_FLAG_ALPHA_PREMULTIPLIED (= 1) can be set in the flags field if the images' RGB components have been multiplied by their alpha components, otherwise it must be 0.
- Basis Universal UASTC Format
The Universal ASTC image format (UASTC) is indicated by colorModel KHR_DF_MODEL_UASTC (= 166) together with vkFormat VK_FORMAT_UNDEFINED (= 0). The DFD must be as described in Section 5.6.14 KHR_DF_MODEL_UASTC of [KDF14]. Images in this format must be transcoded to a GPU-supported block-compressed format or decoded to a GPU-supported uncompressed format before being uploaded to and sampled by a GPU. UASTC images can be supercompressed with Zstandard (supercompressionScheme = 2) with or without first conditioning the data with Rate-distortion Optimization. If supercompression is used, the DFD must follow the rules described in the next subsection.
This color model provides channel Ids, e.g. KHR_DF_CHANNEL_UASTC_RGB, that must be used to indicate the effective number of components in the data. Consumers use this information to help select a transcode target. The following ids are valid and must be used for the type of data indicated.
Id | Value | Type |
---|---|---|
KHR_DF_CHANNEL_UASTC_RGB | 0 | 3 component: opaque color. RGB components in the rgb channels. |
KHR_DF_CHANNEL_UASTC_RGBA | 3 | 4 component: color + alpha. RGB components in the rgb channels, alpha in the alpha channel. |
KHR_DF_CHANNEL_UASTC_RRR | 4 | 1 component: R component replicated in all 3 rgb channels for better compression results. |
KHR_DF_CHANNEL_UASTC_RRRG | 5 | 2 independent components: R component replicated in all 3 rgb channels and G moved to alpha for better compression results. |
KHR_DF_CHANNEL_UASTC_RG | 6 | 2 independent components. Blue & alpha should not be sampled. |
Tip
|
_UASTC_RRRG cannot be transcoded to the RG channels of an ASTC or BC7 texture. Applications using this channel id will have to use swizzles or have shaders that understand this channel layout. |
The bitstream of the UASTC data is described in Chapter 25 UASTC Compressed Texture Image Format of [KDF14].
- Basis Universal ETC1S Format
-
The ETC1S image format is indicated by
colorModel
KHR_DF_MODEL_ETC1S
(= 163) together withvkFormat
VK_FORMAT_UNDEFINED
(= 0). The DFD must be as described in Section 5.6.11 KHR_DF_MODEL_ETC1S of [KDF14]. Because ETC1S does not support an alpha component, Basis Universal uses 2 slices, (planes in DFD-speak) to represent RGBA images. This color model provides the following channel ids that must be used to indicate the use of a slice.Id Value Type KHR_DF_CHANNEL_ETC1S_RGB
0
3 components: opaque color. RGB components in the slice’s rgb components.
KHR_DF_CHANNEL_ETC1S_RRR
3
1 component: R component in the slice’s r component.
KHR_DF_CHANNEL_ETC1S_GGG
4
1 component: G component in the slice’s r component. Not used independently.
KHR_DF_CHANNEL_ETC1S_AAA
15
1 component: Alpha component in the slice’s r component. Not used independently.
For better compression results, non-RGB slices may have the same value replicated in all 3 slice components.
Whether there are 1 or 2 slices depends on the pre-deflation components as detailed in the following table of valid channel id combinations.
Combination | Description |
---|---|
KHR_DF_CHANNEL_ETC1S_RGB | One slice, opaque color. |
KHR_DF_CHANNEL_ETC1S_RGB + KHR_DF_CHANNEL_ETC1S_AAA | Two slices, color + alpha. |
KHR_DF_CHANNEL_ETC1S_RRR | One slice, 1 component encoded as greyscale. |
KHR_DF_CHANNEL_ETC1S_RRR + KHR_DF_CHANNEL_ETC1S_GGG | Two slices, 2 independent components each encoded as greyscale. |
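As an illustration only, a reader could check a DFD's per-sample channel ids against the table of valid combinations with a lookup such as the following minimal Python sketch. The function name `describe_etc1s_slices` and the description strings are hypothetical; only the numeric channel id values come from the channel id table above.

```python
# Channel id values copied from the ETC1S channel id table.
ETC1S_RGB, ETC1S_RRR, ETC1S_GGG, ETC1S_AAA = 0, 3, 4, 15

# Valid channel id sequences, one id per slice, and what each means.
VALID_ETC1S_COMBINATIONS = {
    (ETC1S_RGB,): "one slice, opaque color",
    (ETC1S_RGB, ETC1S_AAA): "two slices, color + alpha",
    (ETC1S_RRR,): "one slice, 1 component encoded as greyscale",
    (ETC1S_RRR, ETC1S_GGG): "two slices, 2 independent components",
}


def describe_etc1s_slices(channel_ids):
    """Return (slice_count, description); raise ValueError if the combination is invalid.

    Hypothetical helper: channel_ids is the list of channelType values of the
    DFD samples, in sample order.
    """
    key = tuple(channel_ids)
    if key not in VALID_ETC1S_COMBINATIONS:
        raise ValueError(f"invalid ETC1S channel id combination: {key}")
    return len(key), VALID_ETC1S_COMBINATIONS[key]
```

For example, `describe_etc1s_slices([0, 15])` reports two slices, while `[15]` alone is rejected because an alpha slice is not used independently.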
Tip
|
KTX writers may map components of their original input images into the RGB and A components of the supercompressed image in any way they choose. They may also offer an option to apply KTXswizzle metadata prior to supercompressing an uncompressed KTX file. |
Images in this format must be supercompressed. A scheme designed for ETC1S streams such as BasisLZ must be used. Images must be inflated and transcoded to a GPU-supported block-compressed format or decoded to a GPU-supported uncompressed format before being uploaded to and sampled by a GPU. Because ETC1S images are supercompressed, the DFD must follow the rules described in the next subsection.
Tip
|
Whether the image has 1 or 2 slices can be determined from the DFD’s sample count. |
Note
|
Since inflation and transcoding are typically combined in a single operation, this bitstream is not visible to applications. |
Important
|
The DFD for UASTC and ETC1S must reflect the components provided as input to the Basis encoders not those of the source image. Therefore, for example, if the software checks for and removes from source image(s) alpha channel(s) that are all opaque (1.0) before submitting the data to a Basis encoder then the DFD must not have a sample with a channelType that indicates it is alpha. |
When supercompressionScheme is not 0 the dfDescriptorBlock must preserve the colorModel, transferFunction, colorPrimaries, flags, texelBlockDimension[0-3] and bytesPlane[0-7] of the pre-deflation images along with each sample’s channelType, qualifiers, bitLength, bitOffset, sampleLower, and sampleUpper.
BasisLZ supercompression works only on ETC1S blocks so, when it is used, texelBlockDimension0 and texelBlockDimension1 must be 3, bytesPlane0 must be 8 and, if there is an alpha plane, bytesPlane1 must also be 8.
Note
|
In the event that a block-compressed format is supercompressed the DFD will reflect the color model of the block-compressed format most of which have only one or two components. |
Note
|
Previous document revisions required |
Table 4, “Example Unsigned R + G descriptor for BasisLZ/ETC1S” shows a DFD for images that were VK_FORMAT_R8G8_UNORM before encoding and deflation, i.e. they have two unsigned 8-bit components.
uint32_t word | Fields, listed from bit 31 down to bit 0 |
---|---|
0 | totalSize: 60 |
1 | descriptorType: 0; vendorId: 0 |
2 | descriptorBlockSize: \(24 + (16 \times 2) = 56\); versionNumber: 2 |
3 | flags: ALPHA_STRAIGHT; transferFunction: LINEAR; colorPrimaries: BT709; colorModel: ETC1S |
4 | texelBlockDimension3: 0; texelBlockDimension2: 0; texelBlockDimension1: 3 (= ``4''); texelBlockDimension0: 3 (= ``4'') |
5 | bytesPlane3: 0; bytesPlane2: 0; bytesPlane1: 8; bytesPlane0: 8 |
6 | bytesPlane7: 0; bytesPlane6: 0; bytesPlane5: 0; bytesPlane4: 0 |
7 | Red sample information. F: 0; S: 0; E: 0; L: 0; channelType: RRR; bitLength: 63 (= ``64''); bitOffset: 0 |
8 | samplePosition3: 0; samplePosition2: 0; samplePosition1: 0; samplePosition0: 0 |
9 | sampleLower: 0 |
10 | sampleUpper: UINT32_MAX |
11 | Green sample information. F: 0; S: 0; E: 0; L: 0; channelType: GGG; bitLength: 63 (= ``64''); bitOffset: 64 |
12 | samplePosition3: 0; samplePosition2: 0; samplePosition1: 0; samplePosition0: 0 |
13 | sampleLower: 0 |
14 | sampleUpper: UINT32_MAX |
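The descriptor in Table 4 can be serialized with a few `struct.pack` calls. The following Python sketch is illustrative rather than normative: the helper names `pack_sample` and `build_etc1s_rg_dfd` are hypothetical, and the numeric enum values (BT709 = 1, LINEAR = 1, ALPHA_STRAIGHT = 0) are assumed from [KDF14] rather than stated in this section. The channel ids RRR = 3 and GGG = 4 and the color model value 163 come from the ETC1S text above.

```python
import struct

# Enum values assumed from the Khronos Data Format Specification [KDF14].
KHR_DF_MODEL_ETC1S = 163
KHR_DF_PRIMARIES_BT709 = 1
KHR_DF_TRANSFER_LINEAR = 1
KHR_DF_FLAG_ALPHA_STRAIGHT = 0
ETC1S_RRR, ETC1S_GGG = 3, 4


def pack_sample(bit_offset, bit_length_minus_1, channel_type, qualifiers=0):
    """Pack one 16-byte sample (hypothetical helper).

    Word 0 holds bitOffset (bits 0-15), bitLength (bits 16-23),
    channelType (bits 24-27) and the L/E/S/F qualifiers (bits 28-31).
    The remaining words are samplePosition[0-3], sampleLower, sampleUpper.
    """
    word0 = (bit_offset
             | (bit_length_minus_1 << 16)
             | (channel_type << 24)
             | (qualifiers << 28))
    return struct.pack("<4I", word0, 0, 0, 0xFFFFFFFF)


def build_etc1s_rg_dfd():
    """Build the 60-byte DFD of Table 4, little endian throughout."""
    samples = pack_sample(0, 63, ETC1S_RRR) + pack_sample(64, 63, ETC1S_GGG)
    block_size = 24 + len(samples)  # 24 + (16 * 2) = 56
    block = struct.pack(
        "<2I8B8B",
        (0 << 17) | 0,              # descriptorType (high 15 bits) | vendorId
        (block_size << 16) | 2,     # descriptorBlockSize | versionNumber
        KHR_DF_MODEL_ETC1S, KHR_DF_PRIMARIES_BT709,
        KHR_DF_TRANSFER_LINEAR, KHR_DF_FLAG_ALPHA_STRAIGHT,
        3, 3, 0, 0,                 # texelBlockDimension0..3 (stored as size - 1)
        8, 8, 0, 0, 0, 0, 0, 0,     # bytesPlane0..7
    ) + samples
    total_size = 4 + len(block)     # totalSize counts its own 4 bytes
    return struct.pack("<I", total_size) + block
```

Because multi-byte fields are written least significant byte first, the byte order within each word is the reverse of the bit-31-first order used in the table.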
Table 5, “Example Signed RGB descriptor for Zstandard/ZLIB” shows a DFD for images that were VK_FORMAT_R8G8B8_SNORM before deflation, i.e. have 3 signed 8-bit components.
uint32_t word | Fields, listed from bit 31 down to bit 0 |
---|---|
0 | totalSize: 76 |
1 | descriptorType: 0; vendorId: 0 |
2 | descriptorBlockSize: \(24 + (16 \times 3) = 72\); versionNumber: 2 |
3 | flags: ALPHA_STRAIGHT; transferFunction: LINEAR; colorPrimaries: BT709; colorModel: RGBSDA |
4 | texelBlockDimension3: 0; texelBlockDimension2: 0; texelBlockDimension1: 0; texelBlockDimension0: 0 |
5 | bytesPlane3: 0; bytesPlane2: 0; bytesPlane1: 0; bytesPlane0: 3 |
6 | bytesPlane7: 0; bytesPlane6: 0; bytesPlane5: 0; bytesPlane4: 0 |
7 | Red sample information. F: 0; S: 1; E: 0; L: 0; channelType: RED; bitLength: 7; bitOffset: 0 |
8 | samplePosition3: 0; samplePosition2: 0; samplePosition1: 0; samplePosition0: 0 |
9 | sampleLower: -127 |
10 | sampleUpper: 127 |
11 | Green sample information. F: 0; S: 1; E: 0; L: 0; channelType: GREEN; bitLength: 7; bitOffset: 8 |
12 | samplePosition3: 0; samplePosition2: 0; samplePosition1: 0; samplePosition0: 0 |
13 | sampleLower: -127 |
14 | sampleUpper: 127 |
15 | Blue sample information. F: 0; S: 1; E: 0; L: 0; channelType: BLUE; bitLength: 7; bitOffset: 16 |
16 | samplePosition3: 0; samplePosition2: 0; samplePosition1: 0; samplePosition0: 0 |
17 | sampleLower: -127 |
18 | sampleUpper: 127 |
Called total_size in [KDF14], dfdTotalSize indicates the total number of bytes in the dfDescriptor including dfdTotalSize and all dfdBlock fields. dfdByteLength must equal dfdTotalSize.
If
the file is invalid.
Note
|
|
A Descriptor Block as defined in Section 4.1 of [KDF14]. The high-order 15 bits of its first UInt32 are the descriptor_type and the high-order 16 bits of the second UInt32 are the descriptor_block_size. descriptor_block_size is mandated to be a multiple of 4, which guarantees that the following keyAndValueByteLength will be aligned in a 32-bit word.
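The bit positions described above can be extracted with shifts and masks. This is a small Python sketch with a hypothetical function name; the placement of vendor_id and version_number in the remaining low-order bits is taken from the field order of the DFD tables and is an assumption as far as this paragraph is concerned.

```python
import struct


def parse_descriptor_block_header(block):
    """Decode the first two little-endian UInt32s of a descriptor block."""
    first, second = struct.unpack_from("<2I", block, 0)
    header = {
        "vendor_id": first & 0x1FFFF,         # low-order 17 bits
        "descriptor_type": first >> 17,       # high-order 15 bits
        "version_number": second & 0xFFFF,    # low-order 16 bits
        "descriptor_block_size": second >> 16,  # high-order 16 bits
    }
    if header["descriptor_block_size"] % 4 != 0:
        raise ValueError("descriptor_block_size must be a multiple of 4")
    return header
```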
Key/Value data consists of a set of key/value pairs. The number of pairs is such that
Any file that does not meet the above condition is invalid.
KTX tools must update any key/value data affected by their operations. For example, a tool that supports xflip or yflip operations must update existing KTXorientation data to reflect the result of performing one of these. Tools must preserve any key/value data that they do not understand, as well as any that is neither affected by their operations nor modified by the user.
Key/value data must be written to the file sorted by the Unicode code points of the keys starting from a key’s first character.
Keys must not appear more than once.
The number of bytes of combined key and value data in one key/value pair. This includes the size of the key, the required NUL byte terminating the key and all the bytes of data in the value. If the value is a UTF-8 string it should be NUL terminated and keyAndValueByteLength must include the NUL character (but code that reads KTX files must not assume that value fields are NUL terminated). keyAndValueByteLength does not include the bytes in valuePadding.
keyAndValueByteLength must be at least 2, that is a 1 byte key plus its NUL terminator.
keyAndValue contains 2 separate sections. First it contains a key encoded in UTF-8 without a byte order mark (BOM). The key must be terminated by a NUL character (a single 0x00 byte). Keys that begin with the 3 ASCII characters KTX or ktx are reserved and must not be used except as described by this specification (this version of the KTX spec. defines eight keys). Immediately following the NUL character that terminates the key is the Value data.
The Value data may consist of any arbitrary data bytes. Any byte value is allowed. It is encouraged that the value be a NUL terminated UTF-8 string without a BOM, but this is not required.
If the Value data is binary, it is a sequence of bytes rather than of words. It is up to the vendor defining the key to specify how those bytes are to be interpreted. If any bytes encode multi-byte numbers they must be in little-endian order and, if such a number appears at the start of the Value data, the key length including its terminating NUL must be a multiple of the number of bytes in the number so that the number will be properly aligned.
If the Value data is a string then the NUL termination, if present, must be included in keyAndValueByteLength (but programs that read KTX files must not rely on NUL termination).
Contains between 0 and 3 bytes of value 0x00 to ensure that the byte following the last byte in valuePadding is at a file offset that is a multiple of 4. This ensures that every keyAndValueByteLength field is 4-byte aligned. This padding is included in the kvdByteLength field but not the individual keyAndValueByteLength fields.
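The interplay of keyAndValueByteLength, the key's NUL terminator and valuePadding can be seen in a short reader. The Python sketch below is a minimal illustration, not a validator: the function name is hypothetical, and it assumes the buffer passed in is exactly the kvdByteLength bytes starting at kvdByteOffset, and that this offset is itself a multiple of 4 so that buffer-relative alignment equals file-offset alignment.

```python
import struct


def parse_key_value_data(kvd):
    """Return a list of (key, value_bytes) pairs from a key/value data buffer."""
    pairs = []
    offset = 0
    while offset < len(kvd):
        (length,) = struct.unpack_from("<I", kvd, offset)
        offset += 4
        entry = kvd[offset:offset + length]
        if length < 2 or len(entry) != length:
            raise ValueError("invalid keyAndValueByteLength")
        nul = entry.find(b"\x00")
        if nul <= 0:
            raise ValueError("key must be non-empty and NUL terminated")
        key = entry[:nul].decode("utf-8")
        value = entry[nul + 1:]      # may or may not end with a NUL; do not rely on it
        pairs.append((key, value))
        offset += length
        offset += -offset % 4        # skip 0 to 3 bytes of valuePadding
    return pairs
```

Values are returned as raw bytes because the specification allows arbitrary binary data; a caller that expects a string would strip an optional trailing NUL before decoding.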
An array of data used by certain supercompression schemes that must be available before any mip level can be inflated. Must start on the next 8-byte boundary following the key/value data.
The specification of this data block for the BasisLZ scheme is given in [basislz_gd].
Mip levels in the array are ordered from the level with the smallest size images, \(\textit{level}_p\) to that with the largest size images, \(\textit{level}_{base}\).
Note
|
Rationale
When streaming a KTX file, sending smaller mip levels first can be
used together with, e.g., the |
levelImages is an array of Bytes holding all the image data for a level. The offset of a level’s levelImages is provided by the Level Index. Images are concatenated in the order layer, face, slice. When supercompressionScheme != 0 these bytes are formatted as specified in the scheme documentation.
mipPadding is between \(0\) and \(\operatorname{lcm}(\textit{texel\_block\_size}, 4) - 1\) bytes of value 0x00. This is only required when supercompressionScheme == 0.
Texel block size is as given for the vkFormat value in Format Compatibility Classes of [VULKAN-DOCS] for all vkFormat values except the following three:
VkFormat | Texel Block Size |
---|---|
|
Derived from DFD |
|
4 |
|
8 |
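The padding bound above follows from aligning each level to a multiple of \(\operatorname{lcm}(\textit{texel\_block\_size}, 4)\). The following Python sketch shows the arithmetic under that assumption; both function names are hypothetical, and the calculation applies only when supercompressionScheme == 0.

```python
from math import gcd


def level_alignment(texel_block_size):
    """Return lcm(texel_block_size, 4), the assumed alignment of each level."""
    return texel_block_size * 4 // gcd(texel_block_size, 4)


def mip_padding(offset, texel_block_size):
    """Return the number of 0x00 bytes needed to align `offset` for the next level."""
    return -offset % level_alignment(texel_block_size)
```

A 3-byte texel block (for example an 8-bit RGB format) gives an alignment of 12, so a level ending at offset 30 would be followed by 6 bytes of padding, while 8-byte and 16-byte blocks never need more than their natural alignment.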
Note
|
Padding Rationale
Since levels after the first will be naturally aligned to their
texel block size, in block-compressed formats because an integral
number of blocks is required regardless of the image size, the
majority of formats will have 0 bytes of padding between levels.
The exception is formats whose texel block size is not a multiple
of 4. Depending on the image size, these may require some |
The type of texture can be determined from the following table. Any other combination of these parameters makes the KTX file invalid.
Type | pixelWidth | pixelHeight | pixelDepth | layerCount | faceCount |
---|---|---|---|---|---|
1D | > 0 | 0 | 0 | 0 | 1 |
2D | > 0 | > 0 | 0 | 0 | 1 |
3D | > 0 | > 0 | > 0 | 0 | 1 |
Cubemap | > 0 | > 0 | 0 | 0 | 6 |
1D Array | > 0 | 0 | 0 | > 0 | 1 |
2D Array | > 0 | > 0 | 0 | > 0 | 1 |
3D Array | > 0 | > 0 | > 0 | > 0 | 1 |
Cubemap Array | > 0 | > 0 | 0 | > 0 | 6 |
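The table translates directly into a lookup. The Python sketch below is one possible encoding, with a hypothetical function name; it returns the type name from the table and raises ValueError for any combination the table does not list.

```python
# Keys are (pixelHeight > 0, pixelDepth > 0, faceCount); pixelWidth must be > 0.
_BASE_TYPES = {
    (False, False, 1): "1D",
    (True, False, 1): "2D",
    (True, True, 1): "3D",
    (True, False, 6): "Cubemap",
}


def texture_type(pixel_width, pixel_height, pixel_depth, layer_count, face_count):
    """Classify a texture from its header fields, following the table above."""
    key = (pixel_height > 0, pixel_depth > 0, face_count)
    if pixel_width == 0 or key not in _BASE_TYPES:
        raise ValueError("invalid combination of texture parameters")
    base = _BASE_TYPES[key]
    return f"{base} Array" if layer_count > 0 else base
```

For instance, a header with pixelWidth = 8, pixelHeight = 8, pixelDepth = 0, layerCount = 0 and faceCount = 1 is classified as 2D.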
VK_FORMAT_UNDEFINED can be used
- For custom formats that do not have any equivalent in GPU APIs.
- For BasisLZ supercompressed data.
- For any formats from any GPU APIs that do not have Vulkan equivalents.
- For compressed color models in Section 5.6 of [KDF14] or successors that do not have corresponding Vulkan formats. One such format exists now, the transcodable format with colorModel KHR_DF_MODEL_UASTC (= 166).
If VK_FORMAT_UNDEFINED is used and the format is known to OpenGL, Direct3D or Metal APIs, the corresponding format metadata item should be present.
The following sections describe ways to extend what can be contained in a KTX file. They cover three categories: formats, supercompression schemes and metadata. This specification can be periodically updated to incorporate officially recognized additions, with the Document Revision incremented each time. Since the KTX format itself would not change, the KTX version and file identifier would not change either. This document serves as the registry for both official and vendor extensions.
Tip
|
The document revision can be used as a parameter for validators to guide validation. |
Consumers of KTX files must fail gracefully when encountering formats or supercompression schemes they are not prepared to handle. They must ignore or report metadata items they are not prepared to handle.
In the following, vendor encompasses independent software and hardware vendors and open source developers.
Formats are identified by one or more of the following:
New transcodable formats can be added by:
- Creating a new color model and format specification in the Khronos Data Format Specification [KDF14] (as was done with UASTC).
- Creating a new color model and format specification as above and providing a specification for a new supercompression scheme that incorporates this transcodable format (à la BasisLZ/ETC1S).
New Vulkan formats are created via Vulkan extensions.
New DXGI or Metal formats can be carried by using VK_FORMAT_UNDEFINED together with a Data Format Descriptor, which may or may not need a new color model, and format mapping metadata giving the DXGI or Metal format value.
Supercompression schemes are identified by supercompressionScheme in the KTX header. New official schemes can be documented in updates to this specification.
Vendors can create their own supercompression schemes. To avoid conflicts in the Scheme Id name space, those doing so must register them with Khronos as described in Section 4.3.4, “Registering Extensions”.
New official metadata items (i.e., KTX-prefixed) can be documented in updates to this specification.
Vendors can register their own metadata items (key/value pairs) as described in Section 4.3.4, “Registering Extensions” and are strongly encouraged to do so to avoid potential collisions in the key name space (prefix).
Supercompression schemes and metadata items are registered by proposing a pull request (PR) against the default branch (currently main) of the KTX-Specification repository on GitHub. See the sections below for the specific information required.
The vendor will need to create a GitHub account, if it doesn’t have one. Register the vendor’s FQDN to that account. A GitHub account handle is the preferred way of providing the required registration contact information.
Choose a short tag name to identify the vendor. Use the same tag the vendor uses for Vulkan, glTF, OpenGL etcetera extensions, if there is one. The sections below explain how the tag will be used. As a matter of courtesy and respect, please do not try to use tags which clearly belong to an existing company or project which may wish to develop extensions in the future. Khronos may decline to register extensions that are not requested in good faith.
Registration is not complete until the repository maintainer has validated and merged the PR.
Submit requests for scheme ids by proposing a pull request (PR) against ktxspec.adoc. The PR must add a row to Table 3, “Vendor Supercompression Schemes1”, that uses the next available id, and a note to Section 3.8.2, “Vendor Scheme Notes (Normative)”. Follow the instructions in the comments at those locations. Required information for the first includes the scheme name, author, contact information and a token name that must incorporate the chosen tag name. The token can be used by readers and writers to identify the scheme and vendor in enumerations, etc. Required information for the second includes whether to retain, after compression, the vkFormat value and the Data Format Descriptor’s color space information.
Vendors are strongly encouraged to provide the bitstream and, if applicable, global data specifications but they are not required. When provided, they must be put in appendices to this document and contain anchors linked from the added row. Create an AsciiDoc file for each in the appendices directory named using the template
KTX_<TAG>_<name>_{bitstream,gdata}.adoc
Replace <TAG> with the vendor’s identifying tag and <name> with the scheme name. Use AsciiDoc’s include:: directive to include these appendices after the last similar include currently in this document. Add the new files to the PR as well as edits to Makefile that add the new files to the ktx_sources variable.
The registration process can be split into several steps to accommodate scheme id assignment prior to scheme publication:
- Acquire a scheme id. This is done by proposing a PR against ktxspec.adoc. The id will be reserved only once this request is accepted into the default branch.
- Develop and test the scheme using the registered id.
- Publish the bitstream specifications to Khronos with a PR that updates the row in the table for the previously registered id and adds the scheme documentation.
Register items by proposing a pull request (PR) against appendices/vendor_metadata.adoc, the source file for [vendorMetadata]. Add the metadata item(s) following the instructions in the comment there.
Use the tag described in Section 4.3.4, “Registering Extensions” as the key prefix.
The images of any array texture can be indicated to be the frames of a short animation sequence by including KTXanimData metadata. Valid animation files must have the combination of parameters outlined in Section 4.1, “Texture Type” for Array textures in addition to KTXanimData metadata. layerCount is the number of frames in the animation, i.e. layers become the temporal axis.
Tip
|
Use of uncompressed images for an animation sequence will not be memory efficient. Animation sequences should be limited to block-compressed or, preferably, BasisLZ compressed textures. |
KTX files are little endian. All header fields and the data for all uncompressed texture formats are stored in little endian order. Readers on big-endian machines must endian convert all header UInt32s and UInt64s and, when typeSize is greater than 1, all data to big endian. The data of block compressed formats, those ending in *_BLOCK, does not need endian converting. If an application on a big-endian machine intends to use the sample information in the Data Format Descriptor, the DFD must be rewritten for the endian-converted data as the samples describe the data as laid out in memory.
Writers must endian convert these items to little endian on writing the file.
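One way to make a reader independent of host byte order is to decode every field with an explicit little-endian format. The Python sketch below is illustrative: the names `HEADER_AND_INDEX`, `FIELD_NAMES` and `parse_header` are hypothetical, and the field order and widths are assumed from the annotated sample file in this document (a 12-byte identifier, nine UInt32 header fields, four UInt32 index fields and two UInt64 index fields).

```python
import struct

KTX2_IDENTIFIER = bytes([0xAB, 0x4B, 0x54, 0x58, 0x20, 0x32, 0x30, 0xBB,
                         0x0D, 0x0A, 0x1A, 0x0A])

# "<" forces little-endian decoding with no padding, whatever the host is.
HEADER_AND_INDEX = struct.Struct("<12s9I4I2Q")

FIELD_NAMES = (
    "vkFormat", "typeSize", "pixelWidth", "pixelHeight", "pixelDepth",
    "layerCount", "faceCount", "levelCount", "supercompressionScheme",
    "dfdByteOffset", "dfdByteLength", "kvdByteOffset", "kvdByteLength",
    "sgdByteOffset", "sgdByteLength",
)


def parse_header(data):
    """Decode the identifier, header and index from the first 80 bytes of a file."""
    identifier, *values = HEADER_AND_INDEX.unpack_from(data, 0)
    if identifier != KTX2_IDENTIFIER:
        raise ValueError("not a KTX 2.0 file")
    return dict(zip(FIELD_NAMES, values))
```

The same format string with `pack` gives a writer that always emits little-endian fields, which covers the conversion required of writers on big-endian machines.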
Rows of uncompressed pixel data are tightly packed. Each row in memory immediately follows the end of the preceding row. I.e., the data must be packed according to the rules described in section 8.4.4.1 Unpacking of the OpenGL 4.6 specification [OPENGL46] with GL_UNPACK_ROW_LENGTH = 0 and GL_UNPACK_ALIGNMENT = 1.
A KTX file can be used to store an incomplete cubemap or an array of incomplete cubemaps. In such a case, faceCount must be 1 and layerCount must be equal to the number of faces present (in case of a single cubemap) or to the number of faces present times the number of cubemaps (in case of a cubemap array). The faces that are present must be indicated using the metadata key
- KTXcubemapIncomplete
The value is a one-byte bitfield defined as:
00xxxxx1 - +X is present
00xxxx1x - -X is present
00xxx1xx - +Y is present
00xx1xxx - -Y is present
00x1xxxx - +Z is present
001xxxxx - -Z is present
Any value not matching the mask above is invalid.
At least one face must be present, i.e., the value must not be 0.
Within the levelImages structure, faces must be written in the same order as with complete cubemaps: +X, -X, +Y, -Y, +Z, -Z.
When a texture is a cubemap array, missing/present faces must be the same for each element.
As with complete cubemaps, pixelHeight must be equal to pixelWidth, and pixelDepth must be 0.
This metadata entry must not be used together with KTXanimData.
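Decoding the bitfield is a matter of testing the six low-order bits in face order. A minimal Python sketch, with hypothetical function names, might look like this:

```python
FACE_NAMES = ("+X", "-X", "+Y", "-Y", "+Z", "-Z")  # bit 0 through bit 5


def present_faces(value):
    """Return the faces present, in storage order, for a KTXcubemapIncomplete byte."""
    if not 0 < value < 0x40:
        # Zero means no face at all; bits 6 and 7 must be clear.
        raise ValueError("invalid KTXcubemapIncomplete value")
    return [name for bit, name in enumerate(FACE_NAMES) if value & (1 << bit)]


def cubemap_count(layer_count, value):
    """Return the number of cubemaps implied by layerCount and the face mask."""
    faces = len(present_faces(value))
    if layer_count % faces != 0:
        raise ValueError("layerCount is not a multiple of the number of faces present")
    return layer_count // faces
```

For example, the value 0b00000101 indicates that only +X and +Y are present, so a file with layerCount = 6 would hold three incomplete cubemaps.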
Texture data in a KTX file are arranged so that the first pixel in the data stream for each face and/or array element is closest to the origin of the texture coordinate system. In OpenGL that origin is conventionally described as being at the lower left, but this convention is not shared by all image file formats and content creation tools, so there is abundant room for confusion.
The desired texture axis orientation is often predetermined by, e.g. a content creation tool’s or existing application’s use of the image. Therefore it is strongly recommended that tools for generating and manipulating KTX files clearly describe their behaviour, and provide an option to specify the texture axis origin and orientation relative to the logical orientation of the source image. At minimum they should provide a choice between top-left and bottom-left as origin for 2D source images, with the positive S axis pointing right. Where possible, the preferred default is to use the logical upper-left corner of the image as the texture origin. Note that this is contrary to the standard interpretation of GL texture coordinates. However, most other APIs and the majority of texture compression tools use this convention.
When writing the logical orientation to the KTX file’s metadata, image manipulation tools and viewers must use the key
- KTXorientation
Note that this metadata affects only the logical interpretation of the data and has no effect on the mapping from pixels in the file byte stream to texture coordinates.
The value is a NUL-terminated string formatted depending on the texture type.
Type | Format ([REGEXP]) |
---|---|
1D |
|
2D or Cubemap |
|
3D |
|
where
- r indicates S values increasing to the right
- l indicates S values increasing to the left
- d indicates T values increasing downwards
- u indicates T values increasing upwards
- o indicates R values increasing out from the screen (moving towards viewer)
- i indicates R values increasing in towards the screen (moving away from viewer)
When a texture is an array, all its elements have the same orientation and when it is a cubemap, all faces have the same orientation.
Values not matching the table above are invalid.
It is recommended that viewing and editing tools support at least the following values:
- rd
- ru
- rdi
- ruo
Although other orientations can be represented, it is recommended that tools that create KTX files use only the values listed above as other values may not be widely supported by other tools.
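A validity check for KTXorientation values can be sketched as follows. The per-dimension patterns used here are an assumption inferred from the letter definitions and the recommended values above (one letter per texture axis, in S, T, R order); they are not copied from the format table. The function name is hypothetical.

```python
import re

# Assumed patterns: one letter per axis, in S, T, R order.
ORIENTATION_PATTERNS = {1: r"[rl]", 2: r"[rl][du]", 3: r"[rl][du][oi]"}


def is_valid_orientation(value, dimensions):
    """Check a raw KTXorientation value for a 1D, 2D/cubemap or 3D texture.

    `value` is the value bytes from the key/value data; a single trailing NUL
    is tolerated but not required.
    """
    text = value[:-1] if value.endswith(b"\x00") else value
    try:
        decoded = text.decode("ascii")
    except UnicodeDecodeError:
        return False
    return re.fullmatch(ORIENTATION_PATTERNS[dimensions], decoded) is not None
```

Under these assumptions `rd` is accepted for a 2D texture but rejected for a 3D one, which needs a third letter such as in `rdi`.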
The vkFormat field is the primary way of describing the format of the texture data stored in a KTX file. However, when there is no matching Vulkan format, KTX writers may use the following key-value pairs to provide alternative API-specific enum values.
These metadata entries must not be used when the vkFormat is not VK_FORMAT_UNDEFINED.
For OpenGL {,ES} the mapping is specified with the key
- KTXglFormat
The value is 12 bytes representing 3 UInt32 values:
UInt32 glInternalformat
UInt32 glFormat
UInt32 glType
For compressed formats, glFormat and glType must be set to zero and glInternalformat must be used to provide the mapping.
For Direct3D the mapping is specified with the key
- KTXdxgiFormat__
The value is a UInt32 (4 bytes) giving the format enum value.
Desired component mapping for a texture can be indicated with the key
- KTXswizzle
The value is a four-byte NUL-terminated string formatted as ([REGEXP]):
- /^[rgba01]{4}$/
where each symbol represents the source component (or fixed value) that is used for the red, green, blue and alpha values respectively; rgba is thus the default swizzling state.
For example, rg01 means:
- the red and green channels are sampled from the red and green texture components respectively;
- the blue channel is set to zero, ignoring texture data;
- the alpha channel is set to one (fully saturated), ignoring texture data.
When a channel is not present in the texture, a value of 0 must be used for colors (red, green and blue) and a value of 1 (fully saturated) must be used for alpha.
This metadata has no effect on depth or stencil texture formats.
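The swizzle rules, including the defaults for channels missing from the texture, can be illustrated on a single texel. This Python sketch uses a hypothetical function name and represents a texel as a dictionary of the components actually present, with values normalized to the range 0.0 to 1.0.

```python
def apply_swizzle(swizzle, texel):
    """Return the (red, green, blue, alpha) values seen after applying `swizzle`.

    `swizzle` is the four-character string without its NUL terminator;
    `texel` maps the components present in the texture to their values,
    e.g. {"r": 0.5, "g": 0.25} for a two-component format.
    """
    if len(swizzle) != 4 or any(c not in "rgba01" for c in swizzle):
        raise ValueError("swizzle must match /^[rgba01]{4}$/")
    # Missing colors read as 0 and missing alpha reads as 1.
    source = {"r": 0.0, "g": 0.0, "b": 0.0, "a": 1.0}
    source.update(texel)
    source.update({"0": 0.0, "1": 1.0})
    return tuple(source[c] for c in swizzle)
```

With this helper, `rg01` keeps red and green and forces blue to 0 and alpha to 1, and `rrrg` applied to a two-component texel replicates the first component into the color channels and moves the second into alpha.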
Use the following formats and swizzles to map alpha-only, luminance and luminance-alpha formats.
- Alpha8: vkFormat: VK_FORMAT_R8_UNORM (9), KTXswizzle: 000r
- Luminance8: vkFormat: VK_FORMAT_R8_UNORM (9), KTXswizzle: rrr1
- Luminance8Alpha8: vkFormat: VK_FORMAT_R8G8_UNORM (16), KTXswizzle: rrrg
Loaders may opt to detect these cases and use API-provided enums when available, e.g. for the first case GL_ALPHA8 (when using compatibility profile), MTLPixelFormatA8Unorm or DXGI_FORMAT_A8_UNORM.
KTX file writers may, and are strongly encouraged to, identify themselves by including a value with the key
- KTXwriter
The value is a NUL-terminated UTF-8 string that will uniquely identify the tool writing the file, for example:
- AcmeCo TexTool v1.0
Only the most recent writer should be identified. Editing tools must overwrite this value when rewriting a file originally written by a different tool.
KTX file writers may, and are strongly encouraged to, identify any non-default Basis Universal, ASTC & other block-compression encoding and supercompression options specified when the file is created by including a value with the key
- KTXwriterScParams
The value is a NUL-terminated UTF-8 string that shows the command-line or other options used when writing the file, for example:
- --uastc --uastc_rdo_l 2 --zcmp 5
If KTXwriterScParams is present, KTXwriter must also be present.
In general only the most recent writer and most recently used options should be identified unless the writer is building on operations done previously. For example if a writer is adding Zstd supercompression to a file it previously encoded in UASTC, it should append the additional options to those previously used.
By default, ASTC decoders produce pixel values with half-float precision for HDR and linear LDR blocks. KTX file writers may indicate that the data is compatible with more compact decoding modes (as defined in VK_EXT_astc_decode_mode) by using the key
- KTXastcDecodeMode
The value is a NUL-terminated string.
rgb9e5 means that pixel values can be decoded with RGB9E5 mode.
unorm8 (valid only for LDR formats) means that pixel values can be decoded with UNORM8 mode.
Other values are not allowed.
This metadata entry has no effect on and should not be present in KTX files that use the sRGB transfer function.
This metadata entry has no effect on and should not be present in KTX files that use non-ASTC formats.
The images of an array texture can be indicated to be the frames of a short animation by using the key
- KTXanimData
The value is 12 bytes representing 3 UInt32 values:
UInt32 duration
UInt32 timescale
UInt32 loopCount
duration is the number of time units per frame. timescale is the number of time units per 1 second. Thus the duration of a frame in seconds is \(\texttt{duration} / \texttt{timescale}\).
loopCount indicates how many times to loop the animation. Values are:
- 0 - loops infinitely
- 1 - plays once
- n - plays n times
This metadata entry must not be used together with KTXcubemapIncomplete.
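As a worked example of the frame-duration formula, the Python sketch below decodes a KTXanimData value and computes seconds per frame as an exact fraction. The function name and the returned dictionary keys are hypothetical, and rejecting a timescale of 0 is an assumption made here to avoid division by zero, not a rule stated above.

```python
import struct
from fractions import Fraction


def parse_anim_data(value):
    """Decode the 12-byte KTXanimData value (three little-endian UInt32s)."""
    if len(value) != 12:
        raise ValueError("KTXanimData value must be exactly 12 bytes")
    duration, timescale, loop_count = struct.unpack("<3I", value)
    if timescale == 0:
        raise ValueError("timescale of 0 gives no usable frame duration")  # assumption
    return {
        "frame_seconds": Fraction(duration, timescale),  # duration / timescale
        "loop_count": loop_count,
        "loops_forever": loop_count == 0,
    }
```

With duration = 1 and timescale = 30 each frame lasts 1/30 of a second, so an array texture with layerCount = 60 plays for two seconds per loop.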
// Header
0xAB, 0x4B, 0x54, 0x58, // first four bytes of Byte[12] identifier
0x20, 0x32, 0x30, 0xBB, // next four bytes of Byte[12] identifier
0x0D, 0x0A, 0x1A, 0x0A, // final four bytes of Byte[12] identifier
0x00, 0x00, 0x00, 0x00, // UInt32 vkFormat = VK_FORMAT_UNDEFINED (0)
0x01, 0x00, 0x00, 0x00, // UInt32 typeSize = 1
0x08, 0x00, 0x00, 0x00, // UInt32 pixelWidth = 8
0x08, 0x00, 0x00, 0x00, // UInt32 pixelHeight = 8
0x00, 0x00, 0x00, 0x00, // UInt32 pixelDepth = 0
0x00, 0x00, 0x00, 0x00, // UInt32 layerCount = 0
0x01, 0x00, 0x00, 0x00, // UInt32 faceCount = 1
0x01, 0x00, 0x00, 0x00, // UInt32 levelCount = 1
0x01, 0x00, 0x00, 0x00, // UInt32 supercompressionScheme = 1 (BASISLZ)
// Index
0x68, 0x00, 0x00, 0x00, // UInt32 dfdByteOffset = 0x00000068
0x3C, 0x00, 0x00, 0x00, // UInt32 dfdByteLength = 0x0000003C
0xC4, 0x00, 0x00, 0x00, // UInt32 kvdByteOffset = 0x000000C4
0x58, 0x00, 0x00, 0x00, // UInt32 kvdByteLength = 0x00000058
0x20, 0x01, 0x00, 0x00, // UInt64 sgdByteOffset = 0x0000000000000120
0x00, 0x00, 0x00, 0x00,
0x90, 0x00, 0x00, 0x00, // UInt64 sgdByteLength = 0x0000000000000090
0x00, 0x00, 0x00, 0x00,
// Level Index
0xB0, 0x01, 0x00, 0x00, // UInt64 level[0].byteOffset = 0x00000000000001B0
0x00, 0x00, 0x00, 0x00,
0x03, 0x00, 0x00, 0x00, // UInt64 level[0].byteLength = 0x0000000000000003
0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, // UInt64 level[0].uncompressedByteLength = 0
0x00, 0x00, 0x00, 0x00,
// DFD
0x3C, 0x00, 0x00, 0x00, // UInt32 dfdTotalSize = 0x3C (60)
0x00, 0x00, 0x00, 0x00, // vendorId = 0 (17 bits), descriptorType = 0
0x02, 0x00, 0x38, 0x00, // versionNumber = 2, descriptorBlockSize = 0x38 (56)
0xA3, 0x01, 0x02, 0x00, // colorModel = ETC1S (163), primaries = BT709 (1)
// transferFunction = SRGB (2), flags = 0
0x03, 0x03, 0x00, 0x00, // texelBlockDimension[0-3] = 3, 3, 0, 0
0x00, 0x00, 0x00, 0x00, // bytesPlane[0-3] = 0
0x00, 0x00, 0x00, 0x00, // bytesPlane[4-7] = 0
// DFD sample information, sample 0
0x00, 0x00, 0x3F, 0x00, // bitOffset = 0 bitLength = 0x3F (63),
// channelType = RGB (0), qualifiers = 0
0x00, 0x00, 0x00, 0x00, // samplePosition[0-3] = 0
0x00, 0x00, 0x00, 0x00, // sampleLower = 0
0xFF, 0xFF, 0xFF, 0xFF, // sampleUpper = 0xFFFFFFFF (UINT_MAX)
// Sample 1
0x40, 0x00, 0x3F, 0x0F, // bitOffset = 0x40 (64) bitLength = 0x3F (63),
// channelType = AAA (0x0F), qualifiers = 0
0x00, 0x00, 0x00, 0x00, // samplePosition[0-3] = 0
0x00, 0x00, 0x00, 0x00, // sampleLower = 0
0xFF, 0xFF, 0xFF, 0xFF, // sampleUpper = 0xFFFFFFFF (UINT_MAX)
// Key/Value Data
0x12, 0x00, 0x00, 0x00, // keyAndValueByteLength = 18 (0x12)
0x4B, 0x54, 0x58, 0x6F, // KTXo
0x72, 0x69, 0x65, 0x6E, // rien
0x74, 0x61, 0x74, 0x69, // tati
0x6F, 0x6E, 0x00, 0x72, // on NUL r
0x64, 0x00, 0x00, 0x00, // d NUL <2 bytes of valuePadding>
0x3B, 0x00, 0x00, 0x00, // keyAndValueByteLength = 59 (0x3B)
0x4B, 0x54, 0x58, 0x77, // KTXw
0x72, 0x69, 0x74, 0x65, // rite
0x72, 0x00, 0x74, 0x6F, // r NUL to
0x6B, 0x74, 0x78, 0x20, // ktx SPACE
0x76, 0x34, 0x2E, 0x30, // v4.0
0x2E, 0x5F, 0x5F, 0x64, // .__d
0x65, 0x66, 0x61, 0x75, // efau
0x6C, 0x74, 0x5F, 0x5F, // lt__
0x20, 0x2F, 0x20, 0x6C, // SPACE / SPACE l
0x69, 0x62, 0x6B, 0x74, // ibkt
0x78, 0x20, 0x76, 0x34, // x v4
0x2E, 0x30, 0x2E, 0x5F, // .0._
0x5F, 0x64, 0x65, 0x66, // _def
0x61, 0x75, 0x6C, 0x74, // ault
0x5F, 0x5F, 0x00, 0x00, // __ NUL <1 byte of valuePadding>
0x00, 0x00, 0x00, 0x00, // 4 bytes of padding.
// Supercompression Global Data
0x02, 0x00, 0x02, 0x00, // UInt16 endpointCount = 2, UInt16 selectorCount = 2
0x2D, 0x00, 0x00, 0x00, // UInt32 endpointsByteLength = 0x2D
0x09, 0x00, 0x00, 0x00, // UInt32 selectorsByteLength = 0x09
0x2E, 0x00, 0x00, 0x00, // UInt32 tablesByteLength = 0x2E
0x00, 0x00, 0x00, 0x00, // UInt32 extendedByteLength = 0
// imageDesc[0]
0x00, 0x00, 0x00, 0x00, // UInt32 flags = 0
0x00, 0x00, 0x00, 0x00, // UInt32 rgbSliceByteOffset = 0
0x02, 0x00, 0x00, 0x00, // UInt32 rgbSliceByteLength = 2
0x02, 0x00, 0x00, 0x00, // UInt32 alphaSliceByteOffset = 0x02
0x01, 0x00, 0x00, 0x00, // UInt32 alphaSliceByteLength = 1
// endpointsData
0x01, 0xC0, 0x04, 0x00,
0x00, 0x00, 0x00, 0x00,
0x00, 0x02, 0x04, 0x98,
0x1B, 0x20, 0x00, 0x00,
0x00, 0x08, 0xC3, 0x36,
0x91, 0x3E, 0x91, 0x00,
0x60, 0x02, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00,
0x81, 0x00, 0x4C, 0x01,
0x10, 0x00, 0x00, 0x00,
0x00, 0x20, 0x59, 0xC0,
0x3D,
// selectorsData
0x54, 0x55, 0x55,
0x55, 0xAD, 0xAA, 0xAA,
0xAA, 0x02,
// tablesData
0x14, 0xC0,
0x44, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x12,
0x41, 0x00, 0x98, 0x00,
0x00, 0x00, 0x00, 0x00,
0x00, 0x40, 0x18, 0x02,
0xA2, 0x04, 0x0C, 0x00,
0x00, 0x00, 0x83, 0x76,
0x7B, 0x49, 0x04, 0xA2,
0x20, 0x00, 0x4C, 0x00,
0x08, 0x00, 0x00, 0x00,
0x00, 0x20, 0x02, 0x01,
// Level 0 image data
0x4E, 0x0E, 0x04
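To illustrate how the Key/Value Data portion of the dump above is laid out, the following Python sketch walks its two entries. It assumes what the byte counts in the dump show: each entry is a little-endian UInt32 keyAndValueByteLength, followed by a NUL-terminated key, the value, and zero padding up to the next multiple of 4 bytes. The names KVD and parse_key_value_data are hypothetical and not part of any KTX library.

```python
import struct

# Key/Value Data bytes copied from the example dump (two entries, 88 bytes).
KVD = bytes.fromhex(
    "12000000 4B54586F 7269656E 74617469 6F6E0072 64000000"
    "3B000000 4B545877 72697465 7200746F 6B747820 76342E30"
    "2E5F5F64 65666175 6C745F5F 202F206C 69626B74 78207634"
    "2E302E5F 5F646566 61756C74 5F5F0000"
)


def parse_key_value_data(kvd: bytes) -> dict[str, bytes]:
    """Split a key/value data block into {key: raw value bytes}."""
    entries = {}
    offset = 0
    while offset + 4 <= len(kvd):
        # keyAndValueByteLength counts the key, its NUL, and the value,
        # but not the padding that follows.
        (length,) = struct.unpack_from("<I", kvd, offset)
        offset += 4
        key_and_value = kvd[offset:offset + length]
        # The key ends at the first NUL; everything after it is the value.
        key, _, value = key_and_value.partition(b"\x00")
        entries[key.decode("utf-8")] = value
        # Skip the entry plus its padding to the next 4-byte boundary.
        offset += (length + 3) & ~3
    return entries


entries = parse_key_value_data(KVD)
```

Both values in this example are NUL-terminated strings, so a caller would typically strip the trailing NUL before decoding, for example `entries["KTXorientation"].rstrip(b"\x00").decode("utf-8")`.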
Permission is expressly granted to IANA to copy this section as necessary for managing the Media Types registry.
-
How to refer to the DF descriptor block?
Discussion: There is no such data type as dfDescriptorBlock but using primitive types would effectively mean repeating the definition of a descriptor block here, which we do not want to do.
Resolved: Show that dfDescriptorBlock is used as a shorthand for [KDF14]'s Descriptor block.
-
How to handle endianness of the DF descriptor block?
Discussion: The DF spec says data structures are assumed to be little-endian for purposes of data transfer. This is incompatible with the net, which is big-endian, and incompatible with the endianness field. What should we do?
Resolved: All fields and data in KTX files will be little endian as that is the endianness of the vast majority of machines.
-
Can we guarantee the DF descriptor blocks are always a multiple of 4 bytes?
Discussion: The Khronos Basic Data Format Descriptor Block is a multiple of 4 bytes (24 + 16 x number of samples). Is there anything to require that extensions' block sizes be a multiple of 4 bytes? Need to maintain alignment.
Resolved: The Data Format Specification has been updated to recommend but not require padding. This spec. will require padding.
-
Should KTX support level sizes > 4GB?
Discussion: Users have reported having base levels > 4GB for 3D textures. For this the imageSize field needs to be 64 bits. Loaders on 32-bit systems will have to ensure correct handling of this and check that imageSize <= 4GB before loading.
Resolved: Be future proof and make all image-size related fields 64 bits.
-
Should KTX provide a way to distinguish between rectangle and regular 2D textures?
Discussion: The difference is that unnormalized texel coordinates are used for sampling via a special sampler type in GLSL and, in the case of OpenGL {,ES}, the special TEXTURE_RECTANGLE target is used. If needed this could be supported by a metadata item instructing to use unnormalized texel coordinates.
Resolved: Not at this time. Should the need emerge, a metadata item can be added.
-
Should KTX provide a way to distinguish between 1D textures and buffer textures?
Discussion: The difference is how you use the data in OpenGL. With buffer textures the image data is stored in a buffer object. Note that a TextureView can be used to give a different view of the data so supporting buffer textures probably requires metadata to indicate a preferred view as well as metadata to indicate the data should be loaded in a buffer.
Resolved: Not at this time. Should the need emerge, metadata items can be added.
-
Should KTX drop the gl* fields?
Discussion: Narrowing down and enforcing the valid combinations of glFormat, glInternalFormat and glType is fraught with issues. The spec. could be simplified by dropping them and having only vkFormat. The spec. can include a table showing a standard mapping from the vkFormat value to a glInternalFormat, glFormat and glType combination.
Resolved: Drop the gl* fields. OpenGL and OpenGL ES loaders can include code to do the mapping based on a table that has been added to the spec. Such code is estimated to be about 6 kbytes.
-
Use alphanumeric characters or binary values for component swizzles?
Discussion: Values in the swizzle metadata could be either a character from the set [01rgba] or numeric values corresponding to the VkComponentSwizzle enum values from 0 to 6. In the latter case values could be expressed in binary or as numeric characters. The GL token values have been eliminated from this choice because they are not user friendly.
Resolved: Use alphanumeric characters from the set [01rgba].
-
Is anything needed to support sparse textures?
Discussion: Sparse textures are provided by the GL_ARB_sparse_texture extension and are a standard feature of Vulkan. Are any additional KTX features needed to support them?
Resolved: No. Nothing is seen to be required.
-
Should KTX support metadata for effective use of Vulkan SCALED formats?
Discussion: Vulkan SCALED formats convert int (or uint) values to unnormalized floating point values, equivalent to specifying a value of GL_FALSE for the normalized parameter to glVertexAttribFormat. Generally when using such data, associated scale and bias values are folded into the transformation matrix. Should KTX specify standard metadata for these?
Resolved: No. These formats will not be supported. They are primarily for vertex data and several Vulkan vendors have said they can’t support them as texture formats. Also a DFD cannot distinguish these from int values having the same bit pattern.
-
Should the supercompression scheme be applied per-mip-level?
Discussion: Should each mip level be supercompressed independently or should the scheme, zlib, zstd, etc., be applied to all levels as a unit? The latter may result in slightly smaller size though that is unclear. However it would also mean levels could not be streamed or randomly accessed.
Resolved: Yes. The benefits of streaming and random access outweigh what is expected to be a small increase in size.
-
Should we remove row padding from uncompressed image data?
Discussion: Row padding was added to KTX so that data would have the default GL_UNPACK_ALIGNMENT of 4, which was chosen to help speed up DMA of rows by the GPU. Modern architectures are apparently not sensitive to this as evidenced by Vulkan deliberately omitting any equivalent of GL_UNPACK_ALIGNMENT. Thus an annoying chunk of code is required to upload row-padded images to Vulkan.
Resolved: Remove this and cube padding. Formats that would need padding have texel sizes that are less than 4 bytes so no benefit is obtained by starting cube faces or rows of such images at 4-byte multiples.
-
Should we require content checksums anywhere?
Discussion: Modern transmission mechanisms, e.g., HTTP/2, provide good robustness so checksums are less important than they used to be. Some supercompression schemes have checksums, which may be optional.
Resolved: No. We can rely on modern transmission mechanisms. However, if the supercompression scheme includes a checksum, readers should verify it.
-
Should we use the DFD to indicate the number of components in Basis Universal supercompressed data?
Discussion: Basis Universal compressed data may have 1, 2, 3 or 4 components. The number of components affects the choice of transcode target format. The information could be provided within the supercompression global data or by the DFD. Currently presence of alpha slices, but not necessarily an alpha component, is indicated by a flag in the global data. The number of components is needed by applications that may have no knowledge of the original images.
Resolved: Yes. The supercompression global data gives information about the Basis Universal compressed data not about the images. The DFD contains this information prior to supercompression. It makes sense to preserve it. Implementations will then have a consistent place to query this information.
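Returning to the component swizzle issue above, the relationship between the two candidate encodings can be sketched in Python as follows. The numeric values are those of the VkComponentSwizzle enum in the Vulkan API, where 0 is the identity swizzle and has no character equivalent. The function name parse_swizzle and the assumption that a swizzle is written as exactly four characters are illustrative only.

```python
# Characters from the set [01rgba] mapped to VkComponentSwizzle values:
# ZERO = 1, ONE = 2, R = 3, G = 4, B = 5, A = 6 (IDENTITY = 0 is unused).
VK_COMPONENT_SWIZZLE = {"0": 1, "1": 2, "r": 3, "g": 4, "b": 5, "a": 6}


def parse_swizzle(text: str) -> tuple[int, int, int, int]:
    """Convert a swizzle such as "rgba" to four VkComponentSwizzle values.

    Assumes one character per output component, in r, g, b, a order.
    """
    if len(text) != 4 or any(c not in VK_COMPONENT_SWIZZLE for c in text):
        raise ValueError(f"invalid swizzle: {text!r}")
    return tuple(VK_COMPONENT_SWIZZLE[c] for c in text)
```

For example, `parse_swizzle("rrr1")` describes a luminance-style expansion: the red component replicated to the first three outputs with alpha forced to one.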
-
[KDF14] Khronos Data Format Specification 1.4. Andrew Garrard. The Khronos Group.
-
[OESCPT] GL_OES_compressed_paletted_texture. Aaftab Munshi. The Khronos Group, July 2003.
-
[OPENGL46] The OpenGL® Graphics System, A Specification (Version 4.6 (Core Profile)). Mark Segal, Kurt Akeley; Editor: Jon Leech. The Khronos Group, July 2017.
-
[REGEXP] Standard ECMA-262 5.1 Edition, Section 15.10: RegExp (Regular Expression) Objects. Ecma International, June 2011.
-
[RFC1950] ZLIB Compressed Data Format Specification version 3.3. L. Peter Deutsch, Jean-Loup Gailly. IETF Network Working Group, May 1996.
-
[RFC1951] DEFLATE Compressed Data Format Specification version 1.3. L. Peter Deutsch. IETF Network Working Group, May 1996.
-
[RFC2119] Key words for use in RFCs to Indicate Requirement Levels. S. Bradner. IETF Network Working Group, March 1997.
-
[RFC8478] Zstandard Compression and the application/zstd Media Type. Y. Collet, M. Kucherawy, Ed. Internet Engineering Task Force (IETF), October 2018.
-
[VULKAN] Vulkan® 1.n.p - A Specification. The Khronos Group, February 2025.
-
[VULKAN-DOCS] Vulkan® Documentation site. An alternative view of the Vulkan specification. The Khronos Group, February 2025.
Note: The Vulkan specification is a living document that is updated weekly with corrections, clarifications and newly released extensions. The version as of this writing is 1.4.308. References to the specification do not imply that KTX header field values are limited solely to those in the referenced sections or tables of 1.4.308. These values may be supplemented by extensions or new core versions. They also do not imply that all of the texture types can be loaded in any particular version of OpenGL {,ES} or Vulkan.
-
[RDO] Rate-distortion optimization. The ryg blog, December 18th, 2018.
The KTX cubemap coordinate system in Section 3.6, “faceCount” is directly compatible with the Vulkan and OpenGL cube samplers described by the face selection tables and equations for calculating (s, t) in Cube Map Face Selection and Transformations of [VULKAN-DOCS] and section 8.13 Cube Map Texture Selection of [OPENGL46].
Figure 1, “Cubemap Coordinate System” shows graphically how the cubemap images should be arranged.
If the face orientation is not rd, maintaining compatibility with the cube samplers may require changing the relative positions of faces, e.g. swapping +Y and -Y faces. To keep things simple rd must always be used.
If using a skybox to render the cubemap, the (s, t, r) coordinates passed to the cubemap sampler need to match the KTX cubemap coordinate system, that is left-handed with +Y up, +Z forward and +X on the right.
A widely used object space in OpenGL is right-handed with +Y up, +Z out of the screen and +X to the right. If using this, you can transform your skybox cube coordinates to the necessary left-handed system by multiplying either the X or the Z coordinate by -1. The former places the +Z face in the +Z direction so, if using OpenGL’s default view, that face will be behind you. The latter places the +Z face in the -Z direction so it will be in front of you. Failure to do one of these things will result in the skybox scene being a mirror image of reality, a common error in samples found on the web.
Vulkan apps often use a similar but left-handed object space with +Y down, +Z out of the screen (behind the default view) and +X to the right. To transform these skybox coordinates to the cubemap’s coordinate system, either
-
multiply both Y and Z by -1 to keep +Y up and place the +Z face in -Z direction, or
-
multiply both Y and X by -1 to keep +Y up and place the +Z face in +Z direction.
Failure to do one of these will result in the cubemap top and bottom faces being swapped.
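The sign flips described above can be sketched in Python as follows. This is a minimal illustration operating on plain (x, y, z) tuples; the function names and the plus_z_face_in_front parameter are hypothetical. In both object spaces described above the default view looks down -Z, so "in front" means the +Z face is placed in the -Z direction.

```python
def opengl_skybox_to_cubemap(v, plus_z_face_in_front=True):
    """Map a direction from the right-handed OpenGL object space
    (+Y up, +Z out of the screen) to the left-handed cubemap space."""
    x, y, z = v
    if plus_z_face_in_front:
        return (x, y, -z)   # flip Z: +Z face lands in the -Z direction
    return (-x, y, z)       # flip X: +Z face stays in the +Z direction


def vulkan_skybox_to_cubemap(v, plus_z_face_in_front=True):
    """Map a direction from the left-handed Vulkan-style object space
    (+Y down, +Z out of the screen) to the cubemap space."""
    x, y, z = v
    if plus_z_face_in_front:
        return (x, -y, -z)  # flip Y and Z: +Z face lands in -Z direction
    return (-x, -y, z)      # flip Y and X: +Z face stays in +Z direction
```

With the defaults, the viewing direction (0, 0, -1) maps to (0, 0, 1) in both cases, so the sampler selects the +Z face for what is straight ahead. In the Vulkan-style space, the upward direction (0, -1, 0) maps to (0, 1, 0), which selects the +Y (top) face.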
Caution: This appendix is non-normative.
Caution: Provided mappings for BGR(A) formats are based on non-ES OpenGL specifications. See the relevant OpenGL ES extensions for more options.
Caution: On OpenGL ES 2.0 and WebGL 1.0, the half-float data type is provided via the GL_OES_texture_half_float extension, which defines a different enum name (GL_HALF_FLOAT_OES) and value (0x8D61) than other GL APIs.
Caution: Some vendor-specific extensions (e.g. GL_NV_depth_buffer_float) define custom enum values for symbols used in the ratified specifications.
Mapping of vkFormat values to OpenGL, Direct3D and Metal
-
vkFormat added.
-
OpenGL format information fields removed.
-
Data format descriptor added.
-
Supercompression added.
-
Transcodable format support added.
-
Files always little endian.
-
Swizzle and writer id metadata added.
-
Row and cube padding removed.
-
Mip level alignment (mipPadding) changed to match GPU requirements.
-
Mip level order changed so smallest level is first.
Document Revision | Date | Remark |
---|---|---|
pr-draft1 | 2020-08-01 | |
pr-draft2 | 2020-09-04 | |
0 | 2021-04-18 | |
1 | 2022-12-09 | |
2 | 2023-09-07 | |
3 | 2024-02-20 | |
4 | 2025-02-22 | |
Thanks to Dominic Agoro-Ombaka for designing the KTX logo and icons.
Thanks to Rich Geldreich for inventing transcodable textures and BasisLZ and providing documentation of them.
Thanks to Alexey Knyazev for polishing Rich’s documentation and for enormous help tightening the specification and removing potential conflicts.
Thanks to David Wilkinson for chairing the initial effort.