NV_mesh_shader

Name

NV_mesh_shader

Name String

GL_NV_mesh_shader

Contact

Christoph Kubisch, NVIDIA (ckubisch 'at' nvidia.com)
Pat Brown, NVIDIA (pbrown 'at' nvidia.com)

Contributors

Yury Uralsky, NVIDIA
Tyson Smith, NVIDIA
Pyarelal Knowles, NVIDIA

Status

Shipping

Version

Last Modified Date:     September 5, 2019
NVIDIA Revision:        5

Number

OpenGL Extension #527
OpenGL ES Extension #312

Dependencies

This extension is written against the OpenGL 4.5 Specification
(Compatibility Profile), dated June 29, 2017.

OpenGL 4.5 or OpenGL ES 3.2 is required.

This extension requires support for the OpenGL Shading Language (GLSL)
extension "NV_mesh_shader", which can be found at the Khronos Group Github
site here:

    https://github.com/KhronosGroup/GLSL

This extension interacts with ARB_indirect_parameters and OpenGL 4.6.

This extension interacts with NV_command_list.

This extension interacts with ARB_draw_indirect and
NV_vertex_buffer_unified_memory.

This extension interacts with OVR_multiview

Overview

This extension provides a new mechanism allowing applications to use two
new programmable shader types -- the task and mesh shader -- to generate
collections of geometric primitives to be processed by fixed-function
primitive assembly and rasterization logic.  When the task and mesh
shaders are drawn, they replace the standard programmable vertex
processing pipeline, including vertex array attribute fetching, vertex
shader processing, tessellation, and the geometry shader processing.

New Procedures and Functions

  void DrawMeshTasksNV(uint first, uint count);

  void DrawMeshTasksIndirectNV(intptr indirect);

  void MultiDrawMeshTasksIndirectNV(intptr indirect,
                                    sizei drawcount,
                                    sizei stride);

  void MultiDrawMeshTasksIndirectCountNV( intptr indirect,
                                          intptr drawcount,
                                          sizei maxdrawcount,
                                          sizei stride);

New Tokens

Accepted by the <type> parameter of CreateShader and returned by the
<params> parameter of GetShaderiv:

    MESH_SHADER_NV                                      0x9559
    TASK_SHADER_NV                                      0x955A

Accepted by the <pname> parameter of GetIntegerv, GetBooleanv, GetFloatv,
GetDoublev and GetInteger64v:

    MAX_MESH_UNIFORM_BLOCKS_NV                          0x8E60
    MAX_MESH_TEXTURE_IMAGE_UNITS_NV                     0x8E61
    MAX_MESH_IMAGE_UNIFORMS_NV                          0x8E62
    MAX_MESH_UNIFORM_COMPONENTS_NV                      0x8E63
    MAX_MESH_ATOMIC_COUNTER_BUFFERS_NV                  0x8E64
    MAX_MESH_ATOMIC_COUNTERS_NV                         0x8E65
    MAX_MESH_SHADER_STORAGE_BLOCKS_NV                   0x8E66
    MAX_COMBINED_MESH_UNIFORM_COMPONENTS_NV             0x8E67

    MAX_TASK_UNIFORM_BLOCKS_NV                          0x8E68
    MAX_TASK_TEXTURE_IMAGE_UNITS_NV                     0x8E69
    MAX_TASK_IMAGE_UNIFORMS_NV                          0x8E6A
    MAX_TASK_UNIFORM_COMPONENTS_NV                      0x8E6B
    MAX_TASK_ATOMIC_COUNTER_BUFFERS_NV                  0x8E6C
    MAX_TASK_ATOMIC_COUNTERS_NV                         0x8E6D
    MAX_TASK_SHADER_STORAGE_BLOCKS_NV                   0x8E6E
    MAX_COMBINED_TASK_UNIFORM_COMPONENTS_NV             0x8E6F

    MAX_MESH_WORK_GROUP_INVOCATIONS_NV                  0x95A2
    MAX_TASK_WORK_GROUP_INVOCATIONS_NV                  0x95A3

    MAX_MESH_TOTAL_MEMORY_SIZE_NV                       0x9536
    MAX_TASK_TOTAL_MEMORY_SIZE_NV                       0x9537

    MAX_MESH_OUTPUT_VERTICES_NV                         0x9538
    MAX_MESH_OUTPUT_PRIMITIVES_NV                       0x9539

    MAX_TASK_OUTPUT_COUNT_NV                            0x953A

    MAX_DRAW_MESH_TASKS_COUNT_NV                        0x953D

    MAX_MESH_VIEWS_NV                                   0x9557

    MESH_OUTPUT_PER_VERTEX_GRANULARITY_NV               0x92DF
    MESH_OUTPUT_PER_PRIMITIVE_GRANULARITY_NV            0x9543


Accepted by the <pname> parameter of GetIntegeri_v, GetBooleani_v,
GetFloati_v, GetDoublei_v and GetInteger64i_v:

    MAX_MESH_WORK_GROUP_SIZE_NV                         0x953B
    MAX_TASK_WORK_GROUP_SIZE_NV                         0x953C


Accepted by the <pname> parameter of GetProgramiv:

    MESH_WORK_GROUP_SIZE_NV                             0x953E
    TASK_WORK_GROUP_SIZE_NV                             0x953F

    MESH_VERTICES_OUT_NV                                0x9579
    MESH_PRIMITIVES_OUT_NV                              0x957A
    MESH_OUTPUT_TYPE_NV                                 0x957B

Accepted by the <pname> parameter of GetActiveUniformBlockiv:

    UNIFORM_BLOCK_REFERENCED_BY_MESH_SHADER_NV          0x959C
    UNIFORM_BLOCK_REFERENCED_BY_TASK_SHADER_NV          0x959D

Accepted by the <pname> parameter of GetActiveAtomicCounterBufferiv:

    ATOMIC_COUNTER_BUFFER_REFERENCED_BY_MESH_SHADER_NV  0x959E
    ATOMIC_COUNTER_BUFFER_REFERENCED_BY_TASK_SHADER_NV  0x959F

Accepted in the <props> array of GetProgramResourceiv:

    REFERENCED_BY_MESH_SHADER_NV                        0x95A0
    REFERENCED_BY_TASK_SHADER_NV                        0x95A1

Accepted by the <programInterface> parameter of GetProgramInterfaceiv,
GetProgramResourceIndex, GetProgramResourceName, GetProgramResourceiv,
GetProgramResourceLocation, and GetProgramResourceLocationIndex:

    MESH_SUBROUTINE_NV                                  0x957C
    TASK_SUBROUTINE_NV                                  0x957D

    MESH_SUBROUTINE_UNIFORM_NV                          0x957E
    TASK_SUBROUTINE_UNIFORM_NV                          0x957F

Accepted by the <stages> parameter of UseProgramStages:

    MESH_SHADER_BIT_NV                                  0x00000040
    TASK_SHADER_BIT_NV                                  0x00000080

Modifications to the OpenGL 4.5 Specification (Compatibility Profile)

Modify Chapter 3, Dataflow Model, p. 33

(insert at the end of the section after Figure 3.1, p. 35)

Figure 3.2 shows a block diagram of the alternate mesh processing pipeline
of GL.  This pipeline produces a set of output primitives similar to the
primitives produced by the conventional GL vertex processing pipeline.

Work on the mesh pipeline is initiated by the application drawing a
set of mesh tasks via an API command.  If an optional task shader is
active, each task triggers the execution of a task shader work group that
will generate a new set of tasks upon completion.  Each of these spawned
tasks, or each of the original drawn tasks if no task shader is
present, triggers the execution of a mesh shader work group that produces
an output mesh with a variable-sized number of primitives assembled from
vertices in the output mesh.  The primitives from these output meshes are
processed by the rasterization, fragment shader, per-fragment-operations,
and framebuffer pipeline stages in the same manner as primitives produced
from draw calls sent to the conventional vertex processing pipeline
depicted in Figure 3.1.

   Conventional   From Application
     Vertex             |
    Pipeline            v
                   Draw Mesh Tasks     <----- Draw Indirect Buffer
    (Fig 3.1)           |
        |           +---+-----+
        |           |         |
        |           |         |
        |           |    Task Shader ---+
        |           |         |         |
        |           |         v         |
        |           |  Task Generation  |     Image Load/Store
        |           |         |         |     Atomic Counter
        |           +---+-----+         |<--> Shader Storage
        |               |               |     Texture Fetch
        |               v               |     Uniform Block
        |         Mesh Shader ----------+
        |               |               |
        +-------------> +               |
                        |               |
                        v               |
                   Rasterization        |
                        |               |
                        v               |
                  Fragment Shader ------+
                        |
                        v
              Per-Fragment Operations
                        |
                        v
                  Framebuffer

  Figure 3.2, GL Mesh Processing Pipeline


Modify Chapter 7, Programs and Shaders, p. 84

(Change the sentence starting with "Shader stages including vertex shaders")

Shader stages including vertex shaders, tessellation control shaders,
tessellation evaluation shaders, geometry shaders, mesh shaders, task
shaders, fragment shaders, and compute shaders can be created, compiled, and
linked into program objects

(replace the sentence starting with "A single program
  object can contain all of these shaders, or any subset thereof.")

Mesh and Task shaders affect the assembly of primitives from
groups of shader invocations (see chapter X).
A single program object cannot mix mesh and task shader stages
with vertex, tessellation or geometry shader stages. Furthermore
a task shader stage cannot be combined with a fragment shader stage
when the mesh shader stage is omitted. Other combinations as well
as their subsets are possible.

Modify Section 7.1, Shader Objects, p. 85

(add following entries to table 7.1)

    type            | Shader Stage
   =================|===============
   TASK_SHADER_NV   | Task shader
   MESH_SHADER_NV   | Mesh shader

Modify Section 7.3, Program Objects, p.89

(add to the list of reasons why LinkProgram can fail, p. 92)

* <program> contains objects to form either a mesh or task shader (see
  chapter X), and
  - the program also contains objects to form vertex, tessellation
    control, tessellation evaluation, or geometry shaders.

* <program> contains objects to form a task shader (see chapter X), and
  - the program is not separable and contains no objects to form a mesh
    shader.

Modify Section 7.3.1 Program Interfaces, p.96

(add to the list starting with VERTEX_SUBROUTINE, after GEOMETRY_SUBROUTINE)

TASK_SUBROUTINE_NV, MESH_SUBROUTINE_NV,

(add to the list starting with VERTEX_SUBROUTINE_UNIFORM, after
GEOMETRY_SUBROUTINE_UNIFORM)

TASK_SUBROUTINE_UNIFORM_NV, MESH_SUBROUTINE_UNIFORM_NV,

(add to the list of errors for GetProgramInterfaceiv, p 102,
after GEOMETRY_SUBROUTINE_UNIFORM)

TASK_SUBROUTINE_UNIFORM_NV, MESH_SUBROUTINE_UNIFORM_NV,

(modify entries for table 7.2 for GetProgramResourceiv, p. 105)

  Property                          |   Supported Interfaces
  ==================================|=================================
  ARRAY_SIZE                        | ..., TASK_SUBROUTINE_UNIFORM_NV,
                                    | MESH_SUBROUTINE_UNIFORM_NV
  ----------------------------------|-----------------------------
  NUM_COMPATIBLE_SUBROUTINES,       | ..., TASK_SUBROUTINE_UNIFORM_NV,
  COMPATIBLE_SUBROUTINES            | MESH_SUBROUTINE_UNIFORM_NV
  ----------------------------------|-----------------------------
  LOCATION                          |
  ----------------------------------|-----------------------------
  REFERENCED_BY_VERTEX_SHADER, ...  | ATOMIC_COUNTER_BUFFER, ...
  REFERENCED_BY_TASK_SHADER_NV,     |
  REFERENCED_BY_MESH_SHADER_NV      |
  ----------------------------------|-----------------------------

(add to list of the sentence starting with "For the properties
REFERENCED_BY_VERTEX_SHADER", after REFERENCED_BY_GEOMETRY_SHADER, p. 108)

REFERENCED_BY_TASK_SHADER_NV, REFERENCED_BY_MESH_SHADER_NV

(for the description of GetProgramResourceLocation and
GetProgramResourceLocationIndex, add to the list of the sentence
starting with "For GetProgramResourceLocation, programInterface must
be one of UNIFORM,", after GEOMETRY_SUBROUTINE_UNIFORM, p. 114)

TASK_SUBROUTINE_UNIFORM_NV, MESH_SUBROUTINE_UNIFORM_NV,

Modify Section 7.4, Program Pipeline Objects, p. 115

(modify the first paragraph, p. 118, to add new shader stage bits for mesh
 and task shaders)

The bits set in <stages> indicate the program stages for which the program
object named by <program> becomes current.  These stages may include
compute, vertex, tessellation control, tessellation evaluation, geometry,
fragment, mesh, and task shaders, indicated respectively by
COMPUTE_SHADER_BIT, VERTEX_SHADER_BIT, TESS_CONTROL_SHADER_BIT,
TESS_EVALUATION_SHADER_BIT, GEOMETRY_SHADER_BIT, FRAGMENT_SHADER_BIT,
MESH_SHADER_BIT_NV, and TASK_SHADER_BIT_NV, respectively.  The constant
ALL_SHADER_BITS indicates <program> is to be made current for all shader
stages.

(modify the first error in "Errors" for UseProgramStages, p. 118 to allow
 the use of mesh and task shader bits)

  An INVALID_VALUE error is generated if stages is not the special value
  ALL_SHADER_BITS, and has any bits set other than VERTEX_SHADER_BIT,
  COMPUTE_SHADER_BIT, TESS_CONTROL_SHADER_BIT, TESS_EVALUATION_SHADER_BIT,
  GEOMETRY_SHADER_BIT, FRAGMENT_SHADER_BIT, MESH_SHADER_BIT_NV, and
  TASK_SHADER_BIT_NV.


Modify Section 7.6, Uniform Variables, p. 125

(add entries to table 7.4, p. 126)

  Shader Stage         | pname for querying default uniform
                       | block storage, in components
  =====================|=====================================
  Task (see chapter X) | MAX_TASK_UNIFORM_COMPONENTS_NV
  Mesh (see chapter X) | MAX_MESH_UNIFORM_COMPONENTS_NV

(add entries to table 7.5, p. 127)

  Shader Stage         | pname for querying combined uniform
                       | block storage, in components
  =====================|========================================
  Task (see chapter X) | MAX_COMBINED_TASK_UNIFORM_COMPONENTS_NV
  Mesh (see chapter X) | MAX_COMBINED_MESH_UNIFORM_COMPONENTS_NV

(add entries to table 7.7, p. 131)

  pname                                      | prop
  ===========================================|=============================
  UNIFORM_BLOCK_REFERENCED_BY_TASK_SHADER_NV | REFERENCED_BY_TASK_SHADER_NV
  UNIFORM_BLOCK_REFERENCED_BY_MESH_SHADER_NV | REFERENCED_BY_MESH_SHADER_NV

(add entries to table 7.8, p. 132)

  pname                                      | prop
  ===========================================|=============================
  ATOMIC_COUNTER_BUFFER_REFERENCED_-         | REFERENCED_BY_TASK_SHADER_NV
  BY_TASK_SHADER_NV                          |
  -------------------------------------------|-----------------------------
  ATOMIC_COUNTER_BUFFER_REFERENCED_-         | REFERENCED_BY_MESH_SHADER_NV
  BY_MESH_SHADER_NV                          |

(modify the sentence starting with "The limits for vertex" in 7.6.2
Uniform Blocks, p. 136)
... geometry, task, mesh, fragment...
MAX_GEOMETRY_UNIFORM_BLOCKS, MAX_TASK_UNIFORM_BLOCKS_NV, MAX_MESH_UNIFORM_-
BLOCKS_NV, MAX_FRAGMENT_UNIFORM_BLOCKS...

(modify the sentence starting with "The limits for vertex", in
7.7 Atomic Counter Buffers, p. 141)

... geometry, task, mesh, fragment...
MAX_GEOMETRY_ATOMIC_COUNTER_BUFFERS, MAX_TASK_ATOMIC_COUNTER_BUFFERS_NV,
MAX_MESH_ATOMIC_COUNTER_BUFFERS_NV, MAX_FRAGMENT_ATOMIC_COUNTER_BUFFERS, ...


Modify Section 7.8 Shader Buffer Variables and Shader Storage Blocks, p. 142

(modify the sentences starting with "The limits for vertex", p. 143)

... geometry, task, mesh, fragment...
MAX_GEOMETRY_SHADER_STORAGE_BLOCKS, MAX_TASK_SHADER_STORAGE_BLOCKS_NV,
MAX_MESH_SHADER_STORAGE_BLOCKS_NV, MAX_FRAGMENT_SHADER_STORAGE_BLOCKS,...

Modify Section 7.9 Subroutine Uniform Variables, p. 144

(modify table 7.9, p. 145)

  Interface           | Shader Type
  ====================|===============
  TASK_SUBROUTINE_NV  | TASK_SHADER_NV
  MESH_SUBROUTINE_NV  | MESH_SHADER_NV

(modify table 7.10, p. 146)

  Interface                   | Shader Type
  ============================|===============
  TASK_SUBROUTINE_UNIFORM_NV  | TASK_SHADER_NV
  MESH_SUBROUTINE_UNIFORM_NV  | MESH_SHADER_NV


Modify Section 7.13 Shader, Program, and Program Pipeline Queries, p. 157

(add to the list of queries for GetProgramiv, p. 157)

  If <pname> is TASK_WORK_GROUP_SIZE_NV, an array of three integers
containing the local work group size of the task shader
(see chapter X), as specified by its input layout qualifier(s), is returned.
  If <pname> is MESH_WORK_GROUP_SIZE_NV, an array of three integers
containing the local work group size of the mesh shader
(see chapter X), as specified by its input layout qualifier(s), is returned.
  If <pname> is MESH_VERTICES_OUT_NV, the maximum number of vertices the
mesh shader (see chapter X) will output is returned.
  If <pname> is MESH_PRIMITIVES_OUT_NV, the maximum number of primitives
the mesh shader (see chapter X) will output is returned.
  If <pname> is MESH_OUTPUT_TYPE_NV, the mesh shader output type,
which must be one of POINTS, LINES or TRIANGLES, is returned.

(add to the list of errors for GetProgramiv, p. 159)

  An INVALID_OPERATION error is generated if TASK_WORK_-
GROUP_SIZE is queried for a program which has not been linked successfully,
or which does not contain objects to form a task shader.
  An INVALID_OPERATION error is generated if MESH_VERTICES_OUT_NV,
MESH_PRIMITIVES_OUT_NV, MESH_OUTPUT_TYPE_NV, or MESH_WORK_GROUP_SIZE_NV
are queried for a program which has not been linked
successfully, or which does not contain objects to form a mesh shader.


Add new language extending the edits to Section 9.2.8 (Attaching Textures
to a Framebuffer) from the OVR_multiview extension that describe how
various drawing commands are processed for when multiview rendering is
enabled:

When multiview rendering is enabled, the DrawMeshTasks* commands (section
X.6) will not spawn separate task and mesh shader invocations for each
view.  Instead, the primitives produced by each mesh shader local work
group will be processed separately for each view.  For per-vertex and
per-primitive mesh shader outputs not qualified with "perviewNV", the
single value written for each vertex or primitive will be used for the
output when processing each view.  For mesh shader outputs qualified with
"perviewNV", the output is arrayed and the mesh shader is responsible for
writing separate values for each view.  When processing output primitives
for a view numbered <V>, outputs qualified with "perviewNV" will assume
the values for array element <V>.


Modify Section 10.3.11 Indirect Commands in Buffer Objects, p. 400

(after "and to DispatchComputeIndirect (see section 19)" add)

and to DrawMeshTasksIndirectNV, MultiDrawMeshTasksIndirectNV,
MultiDrawMeshTasksIndirectCountNV (see chapter X)

(add following entries to the table 10.7)

  Indirect Command Name               | Indirect Buffer target
  ====================================|========================
  DrawMeshTasksIndirectNV             | DRAW_INDIRECT_BUFFER
  MultiDrawMeshTasksIndirectNV        | DRAW_INDIRECT_BUFFER
  MultiDrawMeshTasksIndirectCountNV   | DRAW_INDIRECT_BUFFER


Modify Section 11.1.3 Shader Execution, p. 437

(add after the first paragraph in section 11.1.3, p 437)

If there is an active program object present for the task or
mesh shader stages, the executable code for these
active programs is used to process incoming work groups (see
chapter X).

(add to the list of constants, 11.1.3.5 Texture Access, p. 441)

* MAX_TASK_TEXTURE_IMAGE_UNITS_NV (for task shaders)

* MAX_MESH_TEXTURE_IMAGE_UNITS_NV (for mesh shaders)

(add to the list of constants, 11.1.3.6 Atomic Counter Access, p. 443)

* MAX_TASK_ATOMIC_COUNTERS_NV (for task shaders)

* MAX_MESH_ATOMIC_COUNTERS_NV (for mesh shaders)

(add to the list of constants, 11.1.3.7 Image Access, p. 444)

* MAX_TASK_IMAGE_UNIFORMS_NV (for task shaders)

* MAX_MESH_IMAGE_UNIFORMS_NV (for mesh shaders)

(add to the list of constants, 11.1.3.8 Shader Storage Buffer Access,
 p. 444)

* MAX_TASK_SHADER_STORAGE_BLOCKS_NV (for task shaders)

* MAX_MESH_SHADER_STORAGE_BLOCKS_NV (for mesh shaders)

(modify the sentence of 11.3.10 Shader Outputs, p. 445)

A vertex and mesh shader can write to ...



Insert a new chapter X before Chapter 13, Fixed-Function Vertex
Post-Processing, p. 505

Chapter X, Programmable Mesh Processing

In addition to the programmable vertex processing pipeline described in
Chapters 10 and 11 [[compatibility profile only:  and the fixed-function
vertex processing pipeline in Chapter 12]], applications may use the mesh
pipeline to generate primitives for rasterization.  The mesh pipeline
generates a collection of meshes using the programmable task and mesh
shaders.  Task and mesh shaders are created as described in section 7.1
using a type parameter of TASK_SHADER_NV and MESH_SHADER_NV, respectively.
They are attached to and used in program objects as described in section
7.3.

Mesh and task shader workloads are formed from groups of work items called
work groups and processed by the executable code for a mesh or task shader
program.  A work group is a collection of shader invocations that execute
the same code, potentially in parallel.  An invocation within a work group
may share data with other members of the same work group through shared
variables (see section 4.3.8, "Shared Variables", of the OpenGL Shading
Language Specification) and issue memory and control barriers to
synchronize with other members of the same work group.

X.1 Task Shader Variables

Task shaders can access uniform variables belonging to the current
program object. Limits on uniform storage and methods for manipulating
uniforms are described in section 7.6.

There is a limit to the total amount of memory consumed by output
variables in a single task shader work group.  This limit, expressed in
basic machine units, may be queried by calling GetIntegerv with the value
MAX_TASK_TOTAL_MEMORY_SIZE_NV.

X.2 Task Shader Outputs

Each task shader work group can define how many mesh work groups
should be generated by writing to gl_TaskCountNV. The maximum
number can be queried by GetIntergev using MAX_TASK_OUTPUT_COUNT_NV.

Furthermore the task work group can output data (qualified with "taskNV")
that can be accessed by to the generated mesh work groups.

X.3 Mesh Shader Variables

Mesh shaders can access uniform variables belonging to the current
program object. Limits on uniform storage and methods for manipulating
uniforms are described in section 7.6.
There is a limit to the total size of all variables declared as shared
as well as output attributes in a single mesh stage. This limit, expressed
in units of basic machine units, may be queried as the value of
MAX_MESH_TOTAL_MEMORY_SIZE_NV.

X.4 Mesh Shader Inputs

When each mesh shader work group runs, its invocations have access to
built-in variables describing the work group and invocation and also the
task shader outputs (qualified with "taskNV") written the task shader that
generated the work group.  When no task shader is active, the mesh shader
has no access to task shader outputs.

X.5 Mesh Shader Outputs

When each mesh shader work group completes, it emits an output mesh
consisting of

* a primitive count, written to the built-in output gl_PrimitiveCountNV;

* a collection of vertex attributes, where each vertex in the mesh has a
  set of built-in and user-defined per-vertex output variables and blocks;

* a collection of per-primitive attributes, where each of the
  gl_PrimitiveCountNV primitives in the mesh has a set of built-in and
  user-defined per-primitive output variables and blocks; and

* an array of vertex index values written to the built-in output array
  gl_PrimitiveIndicesNV, where each output primitive has a set of one,
  two, or three indices that identify the output vertices in the mesh used
  to form the primitive.

This data is used to generate primitives of one of three types. The
supported output primitive types are points (POINTS), lines (LINES), and
triangles (TRIANGLES). The vertices output by the mesh shader are assembled
into points, lines, or triangles based on the output primitive type in the
DrawElements manner described in section 10.4, with the
gl_PrimitiveIndicesNV array content serving as index values, and the
local vertex attribute arrays as vertex arrays.

The output arrays are sized depending on the compile-time provided
values ("max_vertices" and "max_primitives"), which must be below
their appropriate maxima that can be queried via GetIntegerv and
MAX_MESH_OUTPUT_PRIMITIVES_NV as well as MAX_MESH_OUTPUT_VERTICES_NV.

The output attributes are allocated at an implementation-dependent
granularity that can be queried via MESH_OUTPUT_PER_VERTEX_GRANULARITY_NV
and MESH_OUTPUT_PER_PRIMITIVE_GRANULARITY_NV.  The total amount of memory
consumed for per-vertex and per-primitive output variables must not exceed
an implementation-dependent total memory limit that can be queried by
calling GetIntegerv with the enum MAX_MESH_TOTAL_MEMORY_SIZE_NV.  The
memory consumed by the gl_PrimitiveIndicesNV[] array does not count
against this limit.

X.6 Mesh Tasks Drawing Commands

One or more work groups is launched by calling

  void DrawMeshTasksNV( uint first, uint count );

If there is an active program object for the task shader stage,
<count> work groups are processed by the active program for the task
shader stage. If there is no active program object for the task shader
stage, <count> work groups are instead processed by the active
program for the mesh shader stage.  The active program for both shader
stages will be determined in the same manner as the active program for other
pipeline stages, as described in section 7.3. While the individual shader
invocations within a work group are executed as a unit, work groups are
executed completely independently and in unspecified order.
The x component of gl_WorkGroupID of the first active stage  will be within
the range of [<first> , <first + count - 1>]. The y and z component of
gl_WorkGroupID within all stages will be set to zero.

The maximum number of task or mesh shader work groups that
may be dispatched at one time may be determined by calling GetIntegerv
with <target> set to MAX_DRAW_MESH_TASKS_COUNT_NV.

The local work size in each dimension is specified at compile time using
an input layout qualifier in one or more of the task or mesh shaders
attached to the program; see the OpenGL Shading Language Specification for
more information.  After the program has been linked, the local work group
size of the task or mesh shader may be queried by calling GetProgramiv
with <pname> set to TASK_WORK_GROUP_SIZE_NV or MESH_WORK_GROUP_SIZE_NV, as
described in section 7.13.

The maximum size of a task or mesh shader local work group may be
determined by calling GetIntegeri_v with <target> set to
MAX_TASK_WORK_GROUP_SIZE_NV or MAX_MESH_WORK_GROUP_SIZE_NV, and <index>
set to 0, 1, or 2 to retrieve the maximum work size in the X, Y and Z
dimension, respectively.  Furthermore, the maximum number of invocations
in a single local work group (i.e., the product of the three dimensions)
may be determined by calling GetIntegerv with pname set to
MAX_TASK_WORK_GROUP_INVOCATIONS_NV or MAX_MESH_WORK_GROUP_INVOCATIONS_NV.

  Errors

    An INVALID_OPERATION error is generated if there is no active
    program for the mesh shader stage.

    An INVALID_VALUE error is generated if <count> exceeds
    MAX_DRAW_MESH_TASKS_COUNT_NV.


If there is an active program on the task shader stage, each task shader
work group writes a task count to the built-in task shader output
gl_TaskCountNV.  If this count is non-zero upon completion of the task
shader, then gl_TaskCountNV work groups are generated and processed by the
active program for the mesh shader stage.  If this count is zero, no work
groups are generated.  If the count is greater than MAX_TASK_OUTPUT_COUNT_NV
the number of mesh shader work groups generated is undefined.
The built-in variables available to the generated mesh shader work groups
are identical to those that would be generated if DrawMeshTasksNV were
called with no task shader active and with a <count> of gl_TaskCountNV.

The primitives of the mesh are then processed by the pipeline stages
described in subsequent chapters in the same manner as primitives produced
by the conventional vertex processing pipeline described in previous
chapters.

The command

  void DrawMeshTasksIndirectNV(intptr indirect);

  typedef struct {
    uint count;
    uint first;
  } DrawMeshTasksIndirectCommandNV;

is equivalent to calling DrawMeshTasksNV with the parameters sourced from a
a DrawMeshTasksIndirectCommandNV struct stored in the buffer currently
bound to the DRAW_INDIRECT_BUFFER binding at an offset, in basic machine
units, specified by <indirect>.  If the <count> read from the indirect
draw buffer is greater than MAX_DRAW_MESH_TASKS_COUNT_NV, then the results
of this command are undefined.

  Errors

    An INVALID_OPERATION error is generated if there is no active program
    for the mesh shader stage.

    An INVALID_VALUE error is generated if <indirect> is negative or is
    not a multiple of the size, in basic machine units, of uint.

    An INVALID_OPERATION error is generated if the command would source
    data beyond the end of the buffer object.

    An INVALID_OPERATION error is generated if zero is bound to the
    DRAW_INDIRECT_BUFFER binding.

The command

  void MultiDrawMeshTasksIndirectNV(intptr indirect,
                                    sizei drawcount,
                                    sizei stride);

behaves identically to DrawMeshTasksIndirectNV, except that <indirect> is
treated as an array of <drawcount> DrawMeshTasksIndirectCommandNV
structures.    <indirect> contains the offset of the first element of the
array within the buffer currently bound to the DRAW_INDIRECT buffer
binding. <stride> specifies the distance, in basic machine units, between
the elements of the array. If <stride> is zero, the array elements are
treated as tightly packed. <stride> must be a multiple of four, otherwise
an INVALID_VALUE error is generated.

<drawcount> must be positive, otherwise an INVALID_VALUE error will be
generated.

  Errors

    In addition to errors that would be generated by
    DrawMeshTasksIndirect:

    An INVALID_VALUE error is generated if <stride> is neither zero nor a
    multiple of four.

    An INVALID_VALUE error is generated if <stride> is non-zero and less
    than the size of DrawMeshTasksIndirectCommandNV.

    An INVALID_VALUE error is generated if <drawcount> is not positive.

The command

  void MultiDrawMeshTasksIndirectCountNV( intptr indirect,
                                          intptr drawcount,
                                          sizei maxdrawcount,
                                          sizei stride);

behaves similarly to MultiDrawMeshTasksIndirectNV, except that <drawcount>
defines an offset (in bytes) into the buffer object bound to the
PARAMETER_BUFFER_ARB binding point at which a single <sizei> typed value
is stored, which contains the draw count. <maxdrawcount> specifies the
maximum number of draws that are expected to be stored in the buffer.
If the value stored at <drawcount> into the buffer is greater than
<maxdrawcount>, an implementation stop processing draws after
<maxdrawcount> parameter sets.

  Errors

    In addition to errors that would be generated by
    MultiDrawMeshTasksIndirectNV:

    An INVALID_OPERATION error is generated if no buffer is bound to the
    PARAMETER_BUFFER binding point.

    An INVALID_VALUE error is generated if <drawcount> (the offset of the
    memory holding the actual draw count) is not a multiple of four.

    An INVALID_OPERATION error is generated if reading a sizei typed value
    from the buffer bound to the PARAMETER_BUFFER target at the offset
    specified by drawcount would result in an out-of-bounds access.

New Implementation Dependent State

Add to Table 23.43, "Program Object State"

+----------------------------------------------------+-----------+-------------------------+---------------+--------------------------------------------------------+---------+
| Get Value                                          | Type      | Get Command             | Initial Value | Description                                            | Sec.    |
+----------------------------------------------------+-----------+-------------------------+---------------+--------------------------------------------------------+---------+
| TASK_WORK_GROUP_SIZE_NV                            | 3 x Z+    | GetProgramiv            | { 0, ... }    | Local work size of a linked mesh stage                 | 7.13    |
| MESH_WORK_GROUP_SIZE_NV                            | 3 x Z+    | GetProgramiv            | { 0, ... }    | Local work size of a linked task stage                 | 7.13    |
| MESH_VERTICES_OUT_NV                               | Z+        | GetProgramiv            | 0             | max_vertices size of a linked mesh stage               | 7.13    |
| MESH_PRIMITIVES_OUT_NV                             | Z+        | GetProgramiv            | 0             | max_primitives size of a linked mesh stage             | 7.13    |
| MESH_OUTPUT_TYPE_NV                                | Z+        | GetProgramiv            | POINTS        | Primitive output type of a linked mesh stage           | 7.13    |
| UNIFORM_BLOCK_REFERENCED_BY_TASK_SHADER_NV         | B         | GetActiveUniformBlockiv | FALSE         | True if uniform block is referenced by the task stage  | 7.6.2   |
| UNIFORM_BLOCK_REFERENCED_BY_MESH_SHADER_NV         | B         | GetActiveUniformBlockiv | FALSE         | True if uniform block is referenced by the mesh stage  | 7.6.2   |
| ATOMIC_COUNTER_BUFFER_REFERENCED_BY_TASK_SHADER_NV | B         | GetActiveAtomicCounter- | FALSE         | AACB has a counter used by task shaders                | 7.7     |
|                                                    |           | Bufferiv                |               |                                                        |         |
| ATOMIC_COUNTER_BUFFER_REFERENCED_BY_MESH_SHADER_NV | B         | GetActiveAtomicCounter- | FALSE         | AACB has a counter used by mesh shaders                | 7.7     |
|                                                    |           | Bufferiv                |               |                                                        |         |
+----------------------------------------------------+-----------+-------------------------+---------------+--------------------------------------------------------+---------+

Add to Table 23.53, "Program Object Resource State"

+----------------------------------------------------+-----------+-------------------------+---------------+--------------------------------------------------------+---------+
| Get Value                                          | Type      | Get Command             | Initial Value | Description                                            | Sec.    |
+----------------------------------------------------+-----------+-------------------------+---------------+--------------------------------------------------------+---------+
| REFERENCED_BY_TASK_SHADER_NV                       | Z+        | GetProgramResourceiv    | -             | Active resource used by task shader                    |  7.3.1  |
| REFERENCED_BY_MESH_SHADER_NV                       | Z+        | GetProgramResourceiv    | -             | Active resource used by mesh shader                    |  7.3.1  |
+----------------------------------------------------+-----------+-------------------------+---------------+--------------------------------------------------------+---------+

Add to Table 23.67, "Implementation Dependent Values"

+------------------------------------------+-----------+---------------+---------------------+-----------------------------------------------------------------------+--------+
| Get Value                                | Type      | Get Command   | Minimum Value       | Description                                                           | Sec.   |
+------------------------------------------+-----------+---------------+---------------------+-----------------------------------------------------------------------+--------+
| MAX_DRAW_MESH_TASKS_COUNT_NV             | Z+        | GetIntegerv   | 2^16 - 1            | Maximum number of work groups that may be drawn by a single           | X.6    |
|                                          |           |               |                     | draw mesh tasks command                                               |        |
| MESH_OUTPUT_PER_VERTEX_GRANULARITY_NV    | Z+        | GetIntegerv   | -                   | Per-vertex output allocation granularity for mesh shaders             | X.3    |
| MESH_OUTPUT_PER_PRIMITIVE_GRANULARITY_NV | Z+        | GetIntegerv   | -                   | Per-primitive output allocation granularity for mesh shaders          | X.3    |
+------------------------------------------+-----------+---------------+---------------------+-----------------------------------------------------------------------+--------+

Insert Table 23.75, "Implementation Dependent Task Shader Limits"

+-----------------------------------------+-----------+---------------+---------------------+-----------------------------------------------------------------------+----------+
| Get Value                               | Type      | Get Command   | Minimum Value       | Description                                                           | Sec.     |
+-----------------------------------------+-----------+---------------+---------------------+-----------------------------------------------------------------------+----------+
| MAX_TASK_WORK_GROUP_SIZE_NV             | 3 x Z+    | GetIntegeri_v | 32     (x), 1 (y,z) | Maximum local size of a task work group (per dimension)               | X.6      |
| MAX_TASK_WORK_GROUP_INVOCATIONS_NV      | Z+        | GetIntegerv   | 32                  | Maximum total task shader invocations in a single local work group    | X.6      |
| MAX_TASK_UNIFORM_BLOCKS_NV              | Z+        | GetIntegerv   | 12                  | Maximum number of uniform blocks per task program                     | 7.6.2    |
| MAX_TASK_TEXTURE_IMAGE_UNITS_NV         | Z+        | GetIntegerv   | 16                  | Maximum number of texture image units accessible by a task program    | 11.1.3.5 |
| MAX_TASK_ATOMIC_COUNTER_BUFFERS_NV      | Z+        | GetIntegerv   | 8                   | Number of atomic counter buffers accessed by a task program           | 7.7      |
| MAX_TASK_ATOMIC_COUNTERS_NV             | Z+        | GetIntegerv   | 8                   | Number of atomic counters accessed by a task program                  | 11.1.3.6 |
| MAX_TASK_IMAGE_UNIFORMS_NV              | Z+        | GetIntegerv   | 8                   | Number of image variables in task program                             | 11.1.3.7 |
| MAX_TASK_SHADER_STORAGE_BLOCKS_NV       | Z+        | GetIntegerv   | 12                  | Maximum number of storage buffer blocks per task program              | 7.8      |
| MAX_TASK_UNIFORM_COMPONENTS_NV          | Z+        | GetIntegerv   | 512                 | Number of components for task shader uniform variables                | 7.6      |
| MAX_COMBINED_TASK_UNIFORM_COMPONENTS_NV | Z+        | GetIntegerv   | *                   | Number of words for task shader uniform variables in all uniform      | 7.6      |
|                                         |           |               |                     | blocks, including the default                                         |          |
| MAX_TASK_TOTAL_MEMORY_SIZE_NV           | Z+        | GetIntegerv   | 16384               | Maximum total storage size of all variables declared as <shared> and  | X.1      |
|                                         |           |               |                     | <out> in all task shaders linked into a single program object         |          |
| MAX_TASK_OUTPUT_COUNT_NV                | Z+        | GetIntegerv   | 65535               | Maximum number of child mesh work groups a single task shader         | X.2      |
|                                         |           |               |                     | work group can emit                                                   |          |
+-----------------------------------------+-----------+---------------+---------------------+-----------------------------------------------------------------------+----------+

Insert Table 23.76, "Implementation Dependent Mesh Shader Limits",
renumber subsequent tables.

+-----------------------------------------+-----------+---------------+---------------------+-----------------------------------------------------------------------+----------+
| Get Value                               | Type      | Get Command   | Minimum Value       | Description                                                           | Sec.     |
+-----------------------------------------+-----------+---------------+---------------------+-----------------------------------------------------------------------+----------+
| MAX_MESH_WORK_GROUP_SIZE_NV             | 3 x Z+    | GetIntegeri_v | 32     (x), 1 (y,z) | Maximum local size of a mesh work group (per dimension)               | X.6      |
| MAX_MESH_WORK_GROUP_INVOCATIONS_NV      | Z+        | GetIntegerv   | 32                  | Maximum total mesh shader invocations in a single local work group    | X.6      |
| MAX_MESH_UNIFORM_BLOCKS_NV              | Z+        | GetIntegerv   | 12                  | Maximum number of uniform blocks per mesh program                     | 7.6.2    |
| MAX_MESH_TEXTURE_IMAGE_UNITS_NV         | Z+        | GetIntegerv   | 16                  | Maximum number of texture image units accessible by a mesh shader     | 11.1.3.5 |
| MAX_MESH_ATOMIC_COUNTER_BUFFERS_NV      | Z+        | GetIntegerv   | 8                   | Number of atomic counter buffers accessed by a mesh shader            | 7.7      |
| MAX_MESH_ATOMIC_COUNTERS_NV             | Z+        | GetIntegerv   | 8                   | Number of atomic counters accessed by a mesh shader                   | 11.1.3.6 |
| MAX_MESH_IMAGE_UNIFORMS_NV              | Z+        | GetIntegerv   | 8                   | Number of image variables in mesh shaders                             | 11.1.3.7 |
| MAX_MESH_SHADER_STORAGE_BLOCKS_NV       | Z+        | GetIntegerv   | 12                  | Maximum number of storage buffer blocks per task program              | 7.8      |
| MAX_MESH_UNIFORM_COMPONENTS_NV          | Z+        | GetIntegerv   | 512                 | Number of components for mesh shader uniform variables                | 7.6      |
| MAX_COMBINED_MESH_UNIFORM_COMPONENTS_NV | Z+        | GetIntegerv   | *                   | Number of words for mesh shader uniform variables in all uniform      | 7.6      |
|                                         |           |               |                     | blocks, including the default                                         |          |
| MAX_MESH_TOTAL_MEMORY_SIZE_NV           | Z+        | GetIntegerv   | 16384               | Maximum total storage size of all variables declared as <shared> and  | X.3      |
|                                         |           |               |                     | <out> in all mesh shaders linked into a single program object         |          |
| MAX_MESH_OUTPUT_PRIMITIVES_NV           | Z+        | GetIntegerv   | 256                 | Maximum number of primitives a single mesh work group can emit        | X.5      |
| MAX_MESH_OUTPUT_VERTICES_NV             | Z+        | GetIntegerv   | 256                 | Maximum number of vertices a single mesh work group can emit          | X.5      |
| MAX_MESH_VIEWS_NV                       | Z+        | GetIntegerv   | 1                   | Maximum number of multi-view views that can be used in a mesh shader  |          |
+-----------------------------------------+-----------+---------------+---------------------+-----------------------------------------------------------------------+----------+

Interactions with ARB_indirect_parameters and OpenGL 4.6

If none of ARB_indirect_parameters or OpenGL 4.6 are supported, remove the
MultiDrawMeshTasksIndirectCountNV function.

Interactions with NV_command_list

Modify the subsection 10.X.1 State Objects

(add after the first paragraph of the description of the StateCaptureNV
command)

When programs with active mesh or task stages are used, the
base primitive mode must be set to GL_POINTS.

(add to the list of errors)

INVALID_OPERATION is generated if <basicmode> is not GL_POINTS
when the mesh or task shaders are active.

Modify subsection 10.X.2 Drawing with Commands

(add a new paragraph before "None of the commands called by")

When mesh or task shaders are active the DRAW_ARRAYS_COMMAND_NV
must be used to draw mesh tasks. The fields of the
DrawArraysCommandNV will be interpreted as follows:

  DrawMeshTasksNV(cmd->first, cmd->count);

Interactions with ARB_draw_indirect and NV_vertex_buffer_unified_memory

When the ARB_draw_indirect and NV_vertex_buffer_unified_memory extensions
are supported, applications can enable DRAW_INDIRECT_UNIFIED_NV to specify
that indirect draw data are sourced from a pre-programmed memory range.  For
such implementations, we add a paragraph to spec language for
DrawMeshTasksIndirectNV, also inherited by MultiDrawMeshTasksIndirectNV and
MultiDrawMeshTasksIndirectCountNV:

    While DRAW_INDIRECT_UNIFIED_NV is enabled, DrawMeshTasksIndirectNV
    sources its arguments from the address specified by the command
    BufferAddressRange where <pname> is DRAW_INDIRECT_ADDRESS_NV and
    <index> is zero, added to the <indirect> parameter.   If the draw
    indirect address range does not belong to a buffer object that is
    resident at the time of the Draw, undefined results, possibly
    including program termination, may occur.

Additionally, the errors specified for DRAW_INDIRECT_BUFFER accesses for
DrawMeshTasksIndirectNV are modified as follows:

    An INVALID_OPERATION error is generated if DRAW_INDIRECT_UNIFIED_NV is
    disabled and zero is bound to the DRAW_INDIRECT_BUFFER binding.

    An INVALID_OPERATION error is generated if DRAW_INDIRECT_UNIFIED_NV is
    disabled and the command would source data beyond the end of the
    DRAW_INDIRECT_BUFFER binding.

    An INVALID_OPERATION error is generated if DRAW_INDIRECT_UNIFIED_NV is
    enabled and the command would source data beyond the end of the
    DRAW_INDIRECT_ADDRESS_NV buffer address range.

Interactions with OVR_multiview

Modify the new section "9.2.2.2 (Multiview Images)"

(insert a new entry to the list following
 "In this mode there are several restrictions:")

 - in mesh shaders only the appropriate per-view outputs are
   used.

Interactions with OpenGL ES 3.2

If implemented in OpenGL ES, remove all references to
MESH_SUBROUTINE_NV, TASK_SUBROUTINE_NV, MESH_SUBROUTINE_UNIFORM_NV,
TASK_SUBROUTINE_UNIFORM_NV,
ATOMIC_COUNTER_BUFFER_REFERENCED_BY_MESH_SHADER_NV,
ATOMIC_COUNTER_BUFFER_REFERENCED_BY_TASK_SHADER_NV, GetDoublev, GetDoublei_v
and MultiDrawMeshTasksIndirectCountNV.

Modify Section 7.3, Program Objects, p. 71 ES 3.2

(replace the reason why LinkProgram can fail with "program contains objects
to form either a vertex shader or fragment shader", p. 73 ES 3.2)

* <program> contains objects to form either a vertex shader or fragment
  shader but not a mesh shader, and

  - <program> is not separable, and does not contain objects to form both a
    vertex shader and fragment shader.

(add to the list of reasons why LinkProgram can fail, p. 74 ES 3.2)

* program contains objects to form either a mesh or task shader (see
  chapter X) but no fragment shader.

Issues

(1) Should we use a new command to specify work to be processed by task
    and mesh shaders?

  RESOLVED:  Yes.  Using a separate draw call helps to clearly
  differentiate task and mesh shader processing for the existing vertex
  processing performed by the standard OpenGL vertex processing pipeline
  with its vertex, tessellation, and geometry shaders.

(2) What name should we use for the draw calls that spawn task and mesh
shaders?

  RESOLVED:  For basic draws, we use the following command:

    void DrawMeshTasksNV(uint first, uint count);

  The first <first> and <count> parameters specifying a range of mesh task
  numbers to process by the task and/or mesh shaders.

  Since the programming model of mesh and task shaders is very similar to
  that of compute shaders, we considered using an interface similar to
  DispatchCompute(), such as:

    void DrawWorkGroupsNV(uint num_groups_x, uint num_groups_y,
                          uint num_groups_z);

  We ultimately decided to not use such a generic name.  It might be
  useful in the future to give compute shaders the ability to spawn
  "draws" in the future, and it's not clear that the programming model for
  such a design would look anything like mesh and task shaders.

  The existing graphics draw calls DrawArrays() and DrawElements()
  directly or indirectly refer to elements of a vertex array.  Since the
  programming model here spawns generic work that ultimately produces a
  set of (likely connected) output primitives, we use the word "mesh" to
  refer to the output of this pipeline and "tasks" to refer to the fact
  that the draw call is spawning generic work groups to produce such these
  "meshes".

  NOTE:  In order to minimize divergence from the programming model for
  compute shaders, mesh shaders use the same three-dimensional local work
  group concept used by compute shaders.  However, the hardware used for
  task and mesh shaders is more limited and supports only one-dimensional
  work groups.  We decided to only use one "dimension" in the draw call to
  keep the API simple and reflect the limitation.

(3) Should we be able to dispatch a range of work groups that doesn't
    start at zero?

  RESOLVED:  Yes.  When porting application code from using regular vertex
  processing to mesh shader processing, the use of an implicit offset via
  the <first> parameter should be helpful as it is in standard DrawArrays
  calls.  We think it's likely that applications will store information
  about tasks to process in a single array with global task numbers.  In
  this case, the draw call with an offset allows applications to specify a
  range of this array of tasks to process.

(4) Should we support separable program objects with mesh and task
    shaders, where one program provides a task shader and a second
    program provides a mesh shader that interfaces with it?

  RESOLVED:  Yes.  Supporting separable program objects is not difficult
  and may be useful in some cases.  For example, one might use a single
  task shader that could be used for common processing of different types
  of geometry (e.g., evaluating visibililty via a bounding box) while
  using different mesh shaders to generate different types of primitives.

(5) Should we have queryable limits on the total amount of output memory
    consumed by mesh or task shaders?

  RESOLVED:  Yes.  We have implementation-dependent limits on the total
  amount of output memory consumed by mesh and task shaders that can be
  queried using MAX_MESH_TOTAL_MEMORY_SIZE_NV and
  MAX_TASK_TOTAL_MEMORY_SIZE_NV.  For each per-vertex or per-primitive
  output attribute in a mesh shader, memory is allocated separately for
  each vertex or primitive allocated by the shader.  The total number of
  vertices or primitives used for this allocation is determined by taking
  the maximum vertex and primitive counts declared in the mesh shader and
  padding to implementation-dependent granularities that can be queried
  using MESH_OUTPUT_PER_VERTEX_GRANULARITY_NV and
  MESH_OUTPUT_PER_PRIMITIVE_GRANULARITY_NV.

(6) Should we have any MultiDrawMeshTasksIndirectNV, to draw
    multiple sets of mesh tasks in one call?

  RESOLVED:   Yes, we support "multi-draw" APIs to for consistency with
  the standard vertex pipeline.  When using these APIs, each individual
  "draw" has its own structure stored in a buffer object.  If mesh or task
  shaders need to determine which draw is being processed, the built-in
  gl_DrawIDARB can be used for that purpose.

(7) Do we support transform feedback with mesh shaders?

  RESOLVED:  No.  In the initial implementation of this extension, the
  hardware doesn't support it.

(8) When using multi-view (OVR_multiview), how do we broadcast the
    primitive to multiple layers or viewports?

  RESOLVED:  When the OVR_multiview extension is enabled in a vertex
  shader, the layout qualifier:

      layout(num_views = 2) in;

  indicates that the vertex shader should be run separately for two views,
  where the shader can use the built-in input gl_ViewIDOVR to determine
  which view is being processed.  A separate set of primitives is
  generated for each view, and each is rasterized into a separate
  framebuffer layer.

  When the "num_views" layout qualifier for the OVR_multiview extension is
  enabled in a mesh shader, the semantics are slightly different.  Instead
  of running a separate mesh shader invocation for each view, a single
  invocation is generated to process all views.  The view count from the
  layout qualifier indicates the size of the extra array dimension for
  "arrayed" per-vertex and per-primitive outputs qualified with
  "perviewNV".  The set of primitives generated by the mesh shader will be
  broadcast separately to each view.  For per-vertex or per-primitive
  outputs not qualified with "perviewNV", the single value written by the
  mesh shader for each vertex/primitive will be used for each view.  For
  outputs qualified with "perviewNV", each view will use a separate value
  from the corresponding "arrayed" output.

(9) Should we support NV_gpu_program5-style assembly programs for mesh
    and task shaders?

  RESOLVED:  No.  We do provide a GLSL extension, also called
  "GL_NV_mesh_shader".

Also, please refer to issues in the GLSL extension specification.

Revision History

Revision 5 (pdaniell)
- Fix minimum implementation limit of MAX_DRAW_MESH_TASKS_COUNT_NV.

Revision 4 (pknowles)
- Add ES interactions.

Revision 3, January 14, 2019 (pbrown)
- Fix a typo in language prohibiting use of a task shader without a mesh
  shader.

Revision 2, September 17, 2018 (pbrown)
- Prepare specification for publication.

Revision 1 (ckubsich)
- Internal revisions.