Name

EXT_float8_e5m2_e4m3

Name Strings

GL_EXT_float_e5m2

GL_EXT_float_e4m3

Contact

Jeff Bolz, NVIDIA (jbolz 'at' nvidia.com)

Contributors

Status

Draft.

Version

Last Modified: June 4, 2025

Revision: 1

Dependencies

This extension can be applied to OpenGL GLSL versions 4.50

(#version 450) and higher.

This extension can be applied to OpenGL ES ESSL versions 3.20

(#version 320) and higher.

This extension is written against the OpenGL Shading Language

Specification, version 4.60.8, dated August 14, 2023.

Overview

This extension introduces two new 8-bit floating-point types, "e4m3" and
"e5m2". Each name gives the number of exponent and mantissa bits in the
type. These types are primarily intended for machine learning, and only a
few operations are supported, including loading from and storing to memory,
conversion to and from other types, and, optionally, cooperative matrix
support.

Mapping to SPIR-V

-----------------

For informational purposes (non-normative), the following is an

expected way for an implementation to map GLSL constructs to SPIR-V

constructs:

    floate5m2_t -> OpTypeFloat 8 FPEncodingFloat8E5M2EXT
    floate4m3_t -> OpTypeFloat 8 FPEncodingFloat8E4M3EXT

    floate5m2BitsToIntEXT,
    floate5m2BitsToUintEXT,
    intBitsToFloate5m2EXT,
    uintBitsToFloate5m2EXT,
    floate4m3BitsToIntEXT,
    floate4m3BitsToUintEXT,
    intBitsToFloate4m3EXT,
    uintBitsToFloate4m3EXT -> OpBitcast

    saturatedConvertEXT -> Op*Convert with SaturatedToLargestFloat8NormalConversionEXT

Modifications to the OpenGL Shading Language Specification, Version 4.60

Including the following lines in a shader can be used to control the

language features described in this extension:

#extension GL_EXT_float_e5m2 : <behavior>

#extension GL_EXT_float_e4m3 : <behavior>

where <behavior> is as specified in section 3.3.

GL_EXT_float_e5m2 must be enabled to use the floate5m2_t scalar and vector types.
GL_EXT_float_e4m3 must be enabled to use the floate4m3_t scalar and vector types.

New preprocessor #defines are added to the OpenGL Shading Language:

#define GL_EXT_float_e5m2 1

#define GL_EXT_float_e4m3 1
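
As an illustration of the directives above, the following compute shader is a
minimal sketch and not part of the extension text. It requires both
extensions and declares one value of each scalar type. The variable names and
the choice of a compute shader are assumptions made for the example.

```glsl
#version 450
// Each extension gates its own family of types, so both are requested here.
#extension GL_EXT_float_e5m2 : require
#extension GL_EXT_float_e4m3 : require

layout(local_size_x = 1) in;

void main()
{
    // Usable only because GL_EXT_float_e5m2 is enabled.
    floate5m2_t wideRange = floate5m2_t(2.0);

    // Usable only because GL_EXT_float_e4m3 is enabled.
    floate4m3_t finerStep = floate4m3_t(0.75);
}
```

A shader that needs only one of the types can enable just the corresponding
extension. The GL_EXT_float_e5m2 and GL_EXT_float_e4m3 macros can be tested
with #ifdef to select a fallback path when an extension is unavailable.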

Modify Section 3.6, Keywords

(add to list of keywords)

floate5m2_t fe5m2vec2 fe5m2vec3 fe5m2vec4

floate4m3_t fe4m3vec2 fe4m3vec3 fe4m3vec4

Modify Section 4.1, Basic Types

Add entries to the table of Transparent Types:

    floate5m2_t    a floating-point scalar with a leading sign bit,
                   5 exponent bits, and 2 mantissa bits
    fe5m2vec2      a two-component floate5m2_t vector
    fe5m2vec3      a three-component floate5m2_t vector
    fe5m2vec4      a four-component floate5m2_t vector
    floate4m3_t    a floating-point scalar with a leading sign bit,
                   4 exponent bits, and 3 mantissa bits
    fe4m3vec2      a two-component floate4m3_t vector
    fe4m3vec3      a three-component floate4m3_t vector
    fe4m3vec4      a four-component floate4m3_t vector

Modify Section 4.1.10, Implicit Conversions

Add the following implicit conversions:

    Type of expression    Can be implicitly converted to

    floate5m2_t           bfloat16_t, float16_t, float32_t, float64_t
    floate4m3_t           bfloat16_t, float16_t, float32_t, float64_t

(and all corresponding vector promotions)
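
The conversion table can be illustrated with a short sketch. It follows the
table as written and assumes that "float" is the 32-bit type listed as
float32_t. The buffer block, its binding, and all variable names are
hypothetical. Whether a particular compiler needs further extensions enabled
for these implicit conversions is not covered here.

```glsl
#version 450
#extension GL_EXT_float_e5m2 : require
#extension GL_EXT_float_e4m3 : require

layout(local_size_x = 1) in;

// Hypothetical output block so the result is observable.
layout(binding = 0) buffer Result { float sum; };

void main()
{
    floate5m2_t a = floate5m2_t(1.5);
    floate4m3_t b = floate4m3_t(0.25);

    // Implicit widening per the table: no constructor call is written.
    float wideA = a;
    float wideB = b;

    // The addition is performed on float, not on the 8-bit types.
    sum = wideA + wideB;
}
```

The table lists only widening conversions, so going from float back to an
8-bit type takes an explicit constructor.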

Modify Section 5.4.1, Conversion and Scalar Constructors

Add constructors for floate5m2_t and floate4m3_t to convert to and from
every other floating-point and integer type. There are no conversions to or
from boolean types.
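
A minimal sketch of the constructors described above, assuming a compute
shader with a hypothetical output block. The literal values and all names
are chosen only for illustration.

```glsl
#version 450
#extension GL_EXT_float_e5m2 : require
#extension GL_EXT_float_e4m3 : require

layout(local_size_x = 1) in;

// Hypothetical output block so the converted values are observable.
layout(binding = 0) buffer Result
{
    float asFloat;
    int   asInt;
    uint  asUint;
};

void main()
{
    // Floating-point and integer sources both have constructors.
    floate4m3_t fromFloat = floate4m3_t(1.75);
    floate4m3_t fromInt   = floate4m3_t(3);
    floate5m2_t fromUint  = floate5m2_t(6u);

    // The reverse direction uses the ordinary constructors of the target type.
    asFloat = float(fromFloat);
    asInt   = int(fromInt);
    asUint  = uint(fromUint);

    // Not provided: bool(fromFloat) or floate4m3_t(true).
}
```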

Modify Section 5.9, Expressions

Arithmetic operators are not supported on floate5m2_t, floate4m3_t, or

vectors of those types.
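
Because arithmetic operators are unavailable on the 8-bit types, a computation
has to be carried out in a wider type. The following sketch shows one way to
do that. The buffer block and its member names are hypothetical.

```glsl
#version 450
#extension GL_EXT_float_e4m3 : require

layout(local_size_x = 1) in;

// Hypothetical block: "scale" is an input, "result" receives the output.
layout(binding = 0) buffer Data
{
    float scale;
    float result;
};

void main()
{
    floate4m3_t x = floate4m3_t(1.5);

    // floate4m3_t y = x * x;   // ill-formed: no arithmetic on floate4m3_t

    // Widen, compute in float, then narrow again with a constructor.
    float wide = float(x) * scale;
    floate4m3_t y = floate4m3_t(wide);

    result = float(y);
}
```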

Modify Chapter 8, Built-In Functions

Add "genFE5M2Type" as an alias for floate5m2_t, fe5m2vec2, fe5m2vec3, fe5m2vec4.

Add "genFE4M3Type" as an alias for floate4m3_t, fe4m3vec2, fe4m3vec3, fe4m3vec4.

Add bitcast functions:

genI8Type floate5m2BitsToIntEXT(genFE5M2Type value);

genU8Type floate5m2BitsToUintEXT(genFE5M2Type value);

genFE5M2Type intBitsToFloate5m2EXT(genI8Type value);

genFE5M2Type uintBitsToFloate5m2EXT(genU8Type value);

genI8Type floate4m3BitsToIntEXT(genFE4M3Type value);

genU8Type floate4m3BitsToUintEXT(genFE4M3Type value);

genFE4M3Type intBitsToFloate4m3EXT(genI8Type value);

genFE4M3Type uintBitsToFloate4m3EXT(genU8Type value);
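
The bitcast functions give access to the raw 8 bits of a value. As a sketch,
the shader below negates a floate5m2_t by flipping its sign bit, which is one
way to work around the absence of arithmetic operators. It assumes that the
"leading sign bit" is bit 7 and that
GL_EXT_shader_explicit_arithmetic_types_int8 is what makes uint8_t usable.
The buffer block and variable names are hypothetical.

```glsl
#version 450
#extension GL_EXT_float_e5m2 : require
// Assumed to be needed for the uint8_t type used by the bitcast functions.
#extension GL_EXT_shader_explicit_arithmetic_types_int8 : require

layout(local_size_x = 1) in;

// Hypothetical output block so the result is observable.
layout(binding = 0) buffer Result { float negated; };

void main()
{
    floate5m2_t x = floate5m2_t(1.5);

    // Reinterpret the 8 bits as an unsigned integer, widened to uint.
    uint bits = uint(floate5m2BitsToUintEXT(x));

    // Flip bit 7, assumed here to be the leading sign bit.
    uint8_t flipped = uint8_t(bits ^ 0x80u);

    // Reinterpret the modified bits as floate5m2_t again.
    floate5m2_t y = uintBitsToFloate5m2EXT(flipped);

    negated = float(y);
}
```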

Add functions to do clamped conversions to the new types. These are
equivalent to constructing the result type from the source value, except
that out-of-range values are replaced with the largest representable normal
value of the same sign.

void saturatedConvertEXT(out genFE5M2Type result, genF16Type value);

void saturatedConvertEXT(out genFE5M2Type result, genBF16Type value);

void saturatedConvertEXT(out genFE5M2Type result, genFType value);

void saturatedConvertEXT(out genFE5M2Type result, genDType value);

void saturatedConvertEXT(out genFE4M3Type result, genF16Type value);

void saturatedConvertEXT(out genFE4M3Type result, genBF16Type value);

void saturatedConvertEXT(out genFE4M3Type result, genFType value);

void saturatedConvertEXT(out genFE4M3Type result, genDType value);

// valid if the result type is constructible from the value type

void saturatedConvertEXT(out coopmat result, coopmat value);
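
A minimal usage sketch for the scalar form, assuming a hypothetical buffer
block that supplies the input and receives the converted value. Note that the
result is returned through the first (out) parameter rather than as a return
value.

```glsl
#version 450
#extension GL_EXT_float_e4m3 : require

layout(local_size_x = 1) in;

// Hypothetical block: "inputValue" is read, "clamped" is written.
layout(binding = 0) buffer Data
{
    float inputValue;
    float clamped;
};

void main()
{
    floate4m3_t r;

    // Same as floate4m3_t(inputValue), except that an out-of-range input
    // yields the largest normal floate4m3_t value of the same sign.
    saturatedConvertEXT(r, inputValue);

    clamped = float(r);
}
```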

Modify Section 8.X, Cooperative Matrix Functions

Add overloads for cooperative matrix load and store functions:

void coopMatLoad(out coopmat m, volatile coherent floate5m2_t[] buf, uint element, uint stride, int matrixLayout);

void coopMatLoad(out coopmat m, volatile coherent fe5m2vec2[] buf, uint element, uint stride, int matrixLayout);

void coopMatLoad(out coopmat m, volatile coherent fe5m2vec4[] buf, uint element, uint stride, int matrixLayout);

void coopMatLoad(out coopmat m, volatile coherent floate4m3_t[] buf, uint element, uint stride, int matrixLayout);

void coopMatLoad(out coopmat m, volatile coherent fe4m3vec2[] buf, uint element, uint stride, int matrixLayout);

void coopMatLoad(out coopmat m, volatile coherent fe4m3vec4[] buf, uint element, uint stride, int matrixLayout);

void coopMatStore(coopmat m, volatile coherent out floate5m2_t[] buf, uint element, uint stride, int matrixLayout);

void coopMatStore(coopmat m, volatile coherent out fe5m2vec2[] buf, uint element, uint stride, int matrixLayout);

void coopMatStore(coopmat m, volatile coherent out fe5m2vec4[] buf, uint element, uint stride, int matrixLayout);

void coopMatStore(coopmat m, volatile coherent out floate4m3_t[] buf, uint element, uint stride, int matrixLayout);

void coopMatStore(coopmat m, volatile coherent out fe4m3vec2[] buf, uint element, uint stride, int matrixLayout);

void coopMatStore(coopmat m, volatile coherent out fe4m3vec4[] buf, uint element, uint stride, int matrixLayout);
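
The overloads above might be used as in the following sketch, which loads a
matrix of floate4m3_t elements from one buffer and stores it to another. It
assumes GL_KHR_cooperative_matrix and GL_KHR_memory_scope_semantics are
available and compiled for a Vulkan target. The 16x16 size, the subgroup
scope, the workgroup size, the bindings, and the buffer names are all
hypothetical, and supported matrix shapes depend on the implementation. An
implementation may also require further extensions for 8-bit buffer storage.

```glsl
#version 450
#extension GL_EXT_float_e4m3 : require
#extension GL_KHR_cooperative_matrix : require
#extension GL_KHR_memory_scope_semantics : require

// Hypothetical workgroup size, assumed to match one subgroup.
layout(local_size_x = 32) in;

layout(binding = 0) buffer Src { floate4m3_t src[]; };
layout(binding = 1) buffer Dst { floate4m3_t dst[]; };

void main()
{
    // Hypothetical 16x16 "A" matrix with floate4m3_t components.
    coopmat<floate4m3_t, gl_ScopeSubgroup, 16, 16, gl_MatrixUseA> m;

    // Start at element 0, with a stride of 16 array elements per row.
    coopMatLoad(m, src, 0, 16, gl_CooperativeMatrixLayoutRowMajor);
    coopMatStore(m, dst, 0, 16, gl_CooperativeMatrixLayoutRowMajor);
}
```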

Modify Chapter 9, Shading Language Grammar for Core Profile

(Add to token list)

FLOATE5M2_T FE5M2VEC2 FE5M2VEC3 FE5M2VEC4

FLOATE4M3_T FE4M3VEC2 FE4M3VEC3 FE4M3VEC4

(Add to type_specifier_non_array)

FLOATE5M2_T FE5M2VEC2 FE5M2VEC3 FE5M2VEC4

FLOATE4M3_T FE4M3VEC2 FE4M3VEC3 FE4M3VEC4

Issues

(1) Why no conversions to/from boolean types?

RESOLVED: glslang uses OpFUnordNotEqual to convert float->bool, and this is

not supported for the new types.

Revision History

Revision 1

- Internal revisions.
