Name

    INTEL_shader_atomic_float_minmax

Name Strings

    GL_INTEL_shader_atomic_float_minmax

Contact

    Ian Romanick (ian . d . romanick 'at' intel . com)

Contributors


Status

    In progress

Version

    Last Modified Date: 06/22/2018
    Revision: 4

Number

    TBD

Dependencies

    OpenGL 4.2, OpenGL ES 3.1, ARB_shader_storage_buffer_object, or
    ARB_compute_shader is required.

    This extension is written against version 4.60 of the OpenGL Shading
    Language Specification.

Overview

    This extension provides GLSL built-in functions allowing shaders to
    perform atomic read-modify-write operations to floating-point buffer
    variables and shared variables.  Minimum, maximum, exchange, and
    compare-and-swap are enabled.


New Procedures and Functions

    None.

New Tokens

    None.

IP Status

    None.

Modifications to the OpenGL Shading Language Specification, Version 4.60

    Including the following line in a shader can be used to control the
    language features described in this extension:

      #extension GL_INTEL_shader_atomic_float_minmax : <behavior>

    where <behavior> is as specified in section 3.3.

    New preprocessor #defines are added to the OpenGL Shading Language:

      #define GL_INTEL_shader_atomic_float_minmax   1

Additions to Chapter 8 of the OpenGL Shading Language Specification
(Built-in Functions)

    Modify Section 8.11, "Atomic Memory Functions"

    (add a new row after the existing "atomicMin" table row, p. 179)

        float atomicMin(inout float mem, float data)


        Computes a new value by taking the minimum of the value of data and
        the contents of mem.  If one of these is an IEEE signaling NaN (i.e.,
        a NaN with the most-significant bit of the mantissa cleared), it is
        always considered smaller.  If one of these is an IEEE quiet NaN
        (i.e., a NaN with the most-significant bit of the mantissa set), it is
        always considered larger.  If both are IEEE quiet NaNs or both are
        IEEE signaling NaNs, the result of the comparison is undefined.

    (add a new row after the existing "atomicMax" table row, p. 179)

        float atomicMax(inout float mem, float data)

        Computes a new value by taking the maximum of the value of data and
        the contents of mem.  If one of these is an IEEE signaling NaN (i.e.,
        a NaN with the most-significant bit of the mantissa cleared), it is
        always considered larger.  If one of these is an IEEE quiet NaN (i.e.,
        a NaN with the most-significant bit of the mantissa set), it is always
        considered smaller.  If both are IEEE quiet NaNs or both are IEEE
        signaling NaNs, the result of the comparison is undefined.

    (add to "atomicExchange" table cell, p. 180)

        float atomicExchange(inout float mem, float data)

    (add to "atomicCompSwap" table cell, p. 180)

        float atomicCompSwap(inout float mem, float compare, float data)

Interactions with OpenGL 4.6 and ARB_gl_spirv

    If OpenGL 4.6 or ARB_gl_spirv is supported, then
    SPV_INTEL_shader_atomic_float_minmax must also be supported.

    The AtomicFloatMinmaxINTEL capability is available whenever the OpenGL or
    OpenGL ES implementation supports INTEL_shader_atomic_float_minmax.

Issues

    1) Why call this extension INTEL_shader_atomic_float_minmax?

    RESOLVED: Several other extensions already set the precedent of
    VENDOR_shader_atomic_float and VENDOR_shader_atomic_float64 for extensions
    that enable floating-point atomic operations.  Using that as a base for
    the name seems logical.

    There already exists NV_shader_atomic_float, but the two extensions have
    nearly zero overlap in functionality.  NV_shader_atomic_float adds
    atomicAdd and image atomic operations that currently shipping Intel GPUs
    do not support.  Calling this extension INTEL_shader_atomic_float would
    likely have been confusing.

    Adding something to describe the actual functions added by this extension
    seemed reasonable.  INTEL_shader_atomic_float_compare was considered, but
    that name was deemed insufficiently descriptive.  Calling this
    extension INTEL_shader_atomic_float_min_max_exchange_compswap is right
    out.

    2) What atomic operations should we support for floating-point targets?

    RESOLVED.  Exchange, min, max, and compare-swap make sense, and these are
    all supported by the hardware.  Future extensions may add other functions.

    For buffer variables and shared variables it is not possible to bit-cast
    the memory location in GLSL, so existing integer operations, such as
    atomicOr, cannot be used.  However, the underlying hardware implementation
    can do this by treating the memory as an integer.  It would be possible to
    implement atomicNegate using this technique with atomicXor.  It is unclear
    whether this provides any actual utility.
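
    The sign-flip technique mentioned above can be sketched on the CPU with
    C11 atomics.  This is only an illustration, assuming IEEE binary32
    floats; the names atomic_negate, float_to_bits, and bits_to_float are
    hypothetical and are not part of this extension or of any GLSL API.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: view a float's bits as a 32-bit integer. */
static uint32_t float_to_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Hypothetical helper: view a 32-bit integer's bits as a float. */
static float bits_to_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/*
 * Hypothetical atomicNegate: treat the memory as an integer and atomically
 * XOR the sign bit (bit 31).  Returns the value held before the flip.
 */
static float atomic_negate(_Atomic uint32_t *mem)
{
    uint32_t old = atomic_fetch_xor(mem, UINT32_C(0x80000000));
    return bits_to_float(old);
}
```

    Because only the sign bit changes, the same operation also flips the
    sign of zeros, infinities, and NaNs without altering their payload.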

    3) What should be said about the NaN behavior?

    RESOLVED.  There are several aspects of NaN behavior that should be
    documented in this extension.  However, some of this behavior varies based
    on NaN concepts that do not exist in the GLSL specification.

    * atomicCompSwap performs the comparison as the floating-point equality
      operator (==).  That is, if either 'mem' or 'compare' is NaN, the
      comparison result is always false.

    * atomicMin and atomicMax implement the IEEE specification with respect to
      NaN.  IEEE considers two different kinds of NaN: signaling NaN and quiet
      NaN.  A quiet NaN has the most significant bit of the mantissa set, and
      a signaling NaN does not.  This concept does not exist in SPIR-V,
      Vulkan, or OpenGL.  Let qNaN denote a quiet NaN and sNaN denote a
      signaling NaN.  atomicMin and atomicMax specifically implement:

      - fmin(qNaN, x) = fmin(x, qNaN) = fmax(qNaN, x) = fmax(x, qNaN) = x
      - fmin(sNaN, x) = fmin(x, sNaN) = fmax(sNaN, x) = fmax(x, sNaN) = sNaN
      - fmin(sNaN, qNaN) = fmin(qNaN, sNaN) = fmax(sNaN, qNaN) =
        fmax(qNaN, sNaN) = sNaN
      - fmin(sNaN, sNaN) = sNaN.  This specification does not define which of
        the two arguments is stored.
      - fmax(sNaN, sNaN) = sNaN.  This specification does not define which of
        the two arguments is stored.
      - fmin(qNaN, qNaN) = qNaN.  This specification does not define which of
        the two arguments is stored.
      - fmax(qNaN, qNaN) = qNaN.  This specification does not define which of
        the two arguments is stored.

    Further details can be found in the Skylake Programmer's Reference
    Manuals, available at
    https://01.org/linuxgraphics/documentation/hardware-specification-prms.
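
    The NaN rules listed in this issue can be restated as a small CPU-side
    model.  This is a sketch under stated assumptions, not the hardware
    implementation: values are carried as raw IEEE binary32 bit patterns so
    that signaling NaNs are not quieted in transit, the function names are
    hypothetical, and where the specification leaves the stored argument
    undefined (two NaNs of the same kind) the model makes an arbitrary
    choice.  Signed zeros (issue 4) are not modeled here.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Bit 22 is the most significant mantissa bit of an IEEE binary32 value. */
#define QUIET_BIT UINT32_C(0x00400000)

static bool is_nan(uint32_t u)
{
    return (u & UINT32_C(0x7f800000)) == UINT32_C(0x7f800000) &&
           (u & UINT32_C(0x007fffff)) != 0;
}

static bool is_qnan(uint32_t u) { return is_nan(u) && (u & QUIET_BIT) != 0; }
static bool is_snan(uint32_t u) { return is_nan(u) && (u & QUIET_BIT) == 0; }

static float as_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Bit pattern that the modeled atomicMin (or atomicMax) would store. */
static uint32_t model_minmax(uint32_t mem, uint32_t data, bool is_max)
{
    /* A signaling NaN always wins.  If both are sNaN, which one is stored
     * is undefined; this sketch arbitrarily keeps mem. */
    if (is_snan(mem))  return mem;
    if (is_snan(data)) return data;

    /* A quiet NaN loses to the other operand.  If both are qNaN, which one
     * is stored is undefined; this sketch arbitrarily stores data. */
    if (is_qnan(mem))  return data;
    if (is_qnan(data)) return mem;

    /* Neither operand is NaN: ordinary ordered comparison. */
    if (is_max)
        return as_float(mem) > as_float(data) ? mem : data;
    return as_float(mem) < as_float(data) ? mem : data;
}

/* Comparison used by the modeled atomicCompSwap: floating-point ==,
 * which is false whenever either operand is NaN. */
static bool model_compswap_matches(uint32_t mem, uint32_t compare)
{
    if (is_nan(mem) || is_nan(compare))
        return false;
    return as_float(mem) == as_float(compare);
}
```

    One consequence of the equality rule is that a compare-and-swap loop
    whose 'compare' value is NaN never succeeds, even when 'mem' holds the
    identical bit pattern.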

    4) What about atomicMin and atomicMax with (+0.0, -0.0) or (-0.0, +0.0)
    arguments?

    RESOLVED.  atomicMin should store -0.0, and atomicMax should store +0.0.
    Due to a known issue in shipping Skylake GPUs, the incorrectly signed 0 is
    stored.  This behavior may change in later GPUs.
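
    The intended signed-zero result can be written down as a CPU-side
    sketch.  It models only the resolution stated above (what should be
    stored), not the behavior of shipping Skylake GPUs; NaN inputs are out
    of scope here, and the names intended_min and intended_max are
    hypothetical.

```c
#include <math.h>

/* Intended atomicMin result: when both operands are zero, prefer -0.0. */
static float intended_min(float a, float b)
{
    if (a == 0.0f && b == 0.0f)
        return signbit(a) ? a : b;
    return a < b ? a : b;
}

/* Intended atomicMax result: when both operands are zero, prefer +0.0. */
static float intended_max(float a, float b)
{
    if (a == 0.0f && b == 0.0f)
        return signbit(a) ? b : a;
    return a > b ? a : b;
}
```

    The explicit zero check is needed because +0.0 and -0.0 compare equal,
    so an ordinary less-than or greater-than test cannot tell them apart.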

Revision History

    Rev  Date        Author    Changes
    ---  ----------  --------  ---------------------------------------------
      1  04/19/2018  idr       Initial version
      2  05/05/2018  idr       Describe interactions with the capabilities
                               added by SPV_INTEL_shader_atomic_float_minmax.
      3  05/29/2018  idr       Remove mention of 64-bit float support.
      4  06/22/2018  idr       Resolve issue #2.
                               Add issue #3 (regarding NaN behavior).
                               Add issue #4 (regarding atomicMin(-0, +0)).