342 lines
9.2 KiB
HTML
342 lines
9.2 KiB
HTML
<HTML>
|
|
|
|
<TITLE>Shading Language Support</TITLE>
|
|
|
|
<link rel="stylesheet" type="text/css" href="mesa.css"></head>
|
|
|
|
<BODY>
|
|
|
|
<H1>Shading Language Support</H1>
|
|
|
|
<p>
|
|
This page describes the features and status of Mesa's support for the
|
|
<a href="http://opengl.org/documentation/glsl/" target="_parent">
|
|
OpenGL Shading Language</a>.
|
|
</p>
|
|
|
|
<p>
|
|
Last updated on 15 December 2008.
|
|
</p>
|
|
|
|
<p>
|
|
Contents
|
|
</p>
|
|
<ul>
|
|
<li><a href="#120">GLSL 1.20 support</a>
|
|
<li><a href="#unsup">Unsupported Features</a>
|
|
<li><a href="#notes">Implementation Notes</a>
|
|
<li><a href="#hints">Programming Hints</a>
|
|
<li><a href="#standalone">Stand-alone GLSL Compiler</a>
|
|
<li><a href="#implementation">Compiler Implementation</a>
|
|
<li><a href="#validation">Compiler Validation</a>
|
|
</ul>
|
|
|
|
|
|
|
|
<a name="120">
|
|
<h2>GLSL 1.20 support</h2>
|
|
|
|
<p>
|
|
GLSL version 1.20 is supported in Mesa 7.3.
|
|
Among the features/differences of GLSL 1.20 are:
|
|
<ul>
|
|
<li><code>mat2x3, mat2x4</code>, etc. types and functions
|
|
<li><code>transpose(), outerProduct(), matrixCompMult()</code> functions
|
|
(but untested)
|
|
<li>precision qualifiers (lowp, mediump, highp)
|
|
<li><code>invariant</code> qualifier
|
|
<li><code>array.length()</code> method
|
|
<li><code>float[5] a;</code> array syntax
|
|
<li><code>centroid</code> qualifier
|
|
<li>unsized array constructors
|
|
<li>initializers for uniforms
|
|
<li>const initializers calling built-in functions
|
|
</ul>
|
|
|
|
|
|
|
|
<a name="unsup">
|
|
<h2>Unsupported Features</h2>
|
|
|
|
<p>
|
|
The following features of the shading language are not yet supported
|
|
in Mesa:
|
|
</p>
|
|
|
|
<ul>
|
|
<li>Linking of multiple shaders is not supported
|
|
<li>gl_ClipVertex
|
|
<li>The gl_Color and gl_SecondaryColor varying vars are interpolated
|
|
without perspective correction
|
|
</ul>
|
|
|
|
<p>
|
|
All other major features of the shading language should function.
|
|
</p>
|
|
|
|
|
|
<a name="notes">
|
|
<h2>Implementation Notes</h2>
|
|
|
|
<ul>
|
|
<li>Shading language programs are compiled into low-level programs
|
|
very similar to those of GL_ARB_vertex/fragment_program.
|
|
<li>All vector types (vec2, vec3, vec4, bvec2, etc) currently occupy full
|
|
float[4] registers.
|
|
<li>Float constants and variables are packed so that up to four floats
|
|
can occupy one program parameter/register.
|
|
<li>All function calls are inlined.
|
|
<li>Shaders which use too many registers will not compile.
|
|
<li>The quality of generated code is pretty good, register usage is fair.
|
|
<li>Shader error detection and reporting of errors (InfoLog) is not
|
|
very good yet.
|
|
<li>The ftransform() function doesn't necessarily match the results of
|
|
fixed-function transformation.
|
|
</ul>
|
|
|
|
<p>
|
|
These issues will be addressed/resolved in the future.
|
|
</p>
|
|
|
|
|
|
<a name="hints">
|
|
<h2>Programming Hints</h2>
|
|
|
|
<ul>
|
|
<li>Declare <em>in</em> function parameters as <em>const</em> whenever possible.
|
|
This improves the efficiency of function inlining.
|
|
</li>
|
|
<br>
|
|
<li>To reduce register usage, declare variables within smaller scopes.
|
|
For example, the following code:
|
|
<pre>
|
|
void main()
|
|
{
|
|
vec4 a1, a2, b1, b2;
|
|
gl_Position = expression using a1, a2.
|
|
gl_Color = expression using b1, b2;
|
|
}
|
|
</pre>
|
|
Can be rewritten as follows to use half as many registers:
|
|
<pre>
|
|
void main()
|
|
{
|
|
{
|
|
vec4 a1, a2;
|
|
gl_Position = expression using a1, a2.
|
|
}
|
|
{
|
|
vec4 b1, b2;
|
|
gl_Color = expression using b1, b2;
|
|
}
|
|
}
|
|
</pre>
|
|
Alternately, rather than using several float variables, use
|
|
a vec4 instead. Use swizzling and writemasks to access the
|
|
components of the vec4 as floats.
|
|
</li>
|
|
<br>
|
|
<li>Use the built-in library functions whenever possible.
|
|
For example, instead of writing this:
|
|
<pre>
|
|
float x = 1.0 / sqrt(y);
|
|
</pre>
|
|
Write this:
|
|
<pre>
|
|
float x = inversesqrt(y);
|
|
</pre>
|
|
<li>
|
|
Use ++i when possible as it's more efficient than i++
|
|
</li>
|
|
</ul>
|
|
|
|
|
|
<a name="standalone">
|
|
<h2>Stand-alone GLSL Compiler</h2>
|
|
|
|
<p>
|
|
A unique stand-alone GLSL compiler driver has been added to Mesa.
|
|
<p>
|
|
|
|
<p>
|
|
The stand-alone compiler (like a conventional command-line compiler)
|
|
is a tool that accepts Shading Language programs and emits low-level
|
|
GPU programs.
|
|
</p>
|
|
|
|
<p>
|
|
This tool is useful for:
|
|
<p>
|
|
<ul>
|
|
<li>Inspecting GPU code to gain insight into compilation
|
|
<li>Generating initial GPU code for subsequent hand-tuning
|
|
<li>Debugging the GLSL compiler itself
|
|
</ul>
|
|
|
|
<p>
|
|
After building Mesa, the glslcompiler can be built by manually running:
|
|
</p>
|
|
<pre>
|
|
cd src/mesa/drivers/glslcompiler
|
|
make
|
|
</pre>
|
|
|
|
|
|
<p>
|
|
Here's an example of using the compiler to compile a vertex shader and
|
|
emit GL_ARB_vertex_program-style instructions:
|
|
</p>
|
|
<pre>
|
|
bin/glslcompiler --debug --numbers --fs progs/glsl/CH06-brick.frag.txt
|
|
</pre>
|
|
<p>
|
|
results in:
|
|
</p>
|
|
<pre>
|
|
# Fragment Program/Shader
|
|
0: RCP TEMP[4].x, UNIFORM[2].xxxx;
|
|
1: RCP TEMP[4].y, UNIFORM[2].yyyy;
|
|
2: MUL TEMP[3].xy, VARYING[0], TEMP[4];
|
|
3: MOV TEMP[1], TEMP[3];
|
|
4: MUL TEMP[0].w, TEMP[1].yyyy, CONST[4].xxxx;
|
|
5: FRC TEMP[1].z, TEMP[0].wwww;
|
|
6: SGT.C TEMP[0].w, TEMP[1].zzzz, CONST[4].xxxx;
|
|
7: IF (NE.wwww); # (if false, goto 9);
|
|
8: ADD TEMP[1].x, TEMP[1].xxxx, CONST[4].xxxx;
|
|
9: ENDIF;
|
|
10: FRC TEMP[1].xy, TEMP[1];
|
|
11: SGT TEMP[2].xy, UNIFORM[3], TEMP[1];
|
|
12: MUL TEMP[1].z, TEMP[2].xxxx, TEMP[2].yyyy;
|
|
13: LRP TEMP[0], TEMP[1].zzzz, UNIFORM[0], UNIFORM[1];
|
|
14: MUL TEMP[0].xyz, TEMP[0], VARYING[1].xxxx;
|
|
15: MOV OUTPUT[0].xyz, TEMP[0];
|
|
16: MOV OUTPUT[0].w, CONST[4].yyyy;
|
|
17: END
|
|
</pre>
|
|
|
|
<p>
|
|
Note that some shading language constructs (such as uniform and varying
|
|
variables) aren't expressible in ARB or NV-style programs.
|
|
Therefore, the resulting output is not always legal by definition of
|
|
those program languages.
|
|
</p>
|
|
<p>
|
|
Also note that this compiler driver is still under development.
|
|
Over time, the correctness of the GPU programs, with respect to the ARB
|
|
and NV languagues, should improve.
|
|
</p>
|
|
|
|
|
|
|
|
<a name="implementation">
|
|
<h2>Compiler Implementation</h2>
|
|
|
|
<p>
|
|
The source code for Mesa's shading language compiler is in the
|
|
<code>src/mesa/shader/slang/</code> directory.
|
|
</p>
|
|
|
|
<p>
|
|
The compiler follows a fairly standard design and basically works as follows:
|
|
</p>
|
|
<ul>
|
|
<li>The input string is tokenized (see grammar.c) and parsed
|
|
(see slang_compiler_*.c) to produce an Abstract Syntax Tree (AST).
|
|
The nodes in this tree are slang_operation structures
|
|
(see slang_compile_operation.h).
|
|
The nodes are decorated with symbol table, scoping and datatype information.
|
|
<li>The AST is converted into an Intermediate representation (IR) tree
|
|
(see the slang_codegen.c file).
|
|
The IR nodes represent basic GPU instructions, like add, dot product,
|
|
move, etc.
|
|
The IR tree is mostly a binary tree, but a few nodes have three or four
|
|
children.
|
|
In principle, the IR tree could be executed by doing an in-order traversal.
|
|
<li>The IR tree is traversed in-order to emit code (see slang_emit.c).
|
|
This is also when registers are allocated to store variables and temps.
|
|
<li>In the future, a pattern-matching code generator-generator may be
|
|
used for code generation.
|
|
Programs such as L-BURG (Bottom-Up Rewrite Generator) and Twig look for
|
|
patterns in IR trees, compute weights for subtrees and use the weights
|
|
to select the best instructions to represent the sub-tree.
|
|
<li>The emitted GPU instructions (see prog_instruction.h) are stored in a
|
|
gl_program object (see mtypes.h).
|
|
<li>When a fragment shader and vertex shader are linked (see slang_link.c)
|
|
the varying vars are matched up, uniforms are merged, and vertex
|
|
attributes are resolved (rewriting instructions as needed).
|
|
</ul>
|
|
|
|
<p>
|
|
The final vertex and fragment programs may be interpreted in software
|
|
(see prog_execute.c) or translated into a specific hardware architecture
|
|
(see drivers/dri/i915/i915_fragprog.c for example).
|
|
</p>
|
|
|
|
<h3>Code Generation Options</h3>
|
|
|
|
<p>
|
|
Internally, there are several options that control the compiler's code
|
|
generation and instruction selection.
|
|
These options are seen in the gl_shader_state struct and may be set
|
|
by the device driver to indicate its preferences:
|
|
|
|
<pre>
|
|
struct gl_shader_state
|
|
{
|
|
...
|
|
/** Driver-selectable options: */
|
|
GLboolean EmitHighLevelInstructions;
|
|
GLboolean EmitCondCodes;
|
|
GLboolean EmitComments;
|
|
};
|
|
</pre>
|
|
|
|
<ul>
|
|
<li>EmitHighLevelInstructions
|
|
<br>
|
|
This option controls instruction selection for loops and conditionals.
|
|
If the option is set high-level IF/ELSE/ENDIF, LOOP/ENDLOOP, CONT/BRK
|
|
instructions will be emitted.
|
|
Otherwise, those constructs will be implemented with BRA instructions.
|
|
</li>
|
|
|
|
<li>EmitCondCodes
|
|
<br>
|
|
If set, condition codes (ala GL_NV_fragment_program) will be used for
|
|
branching and looping.
|
|
Otherwise, ordinary registers will be used (the IF instruction will
|
|
examine the first operand's X component and do the if-part if non-zero).
|
|
This option is only relevant if EmitHighLevelInstructions is set.
|
|
</li>
|
|
|
|
<li>EmitComments
|
|
<br>
|
|
If set, instructions will be annoted with comments to help with debugging.
|
|
Extra NOP instructions will also be inserted.
|
|
</br>
|
|
|
|
</ul>
|
|
|
|
|
|
<a name="validation">
|
|
<h2>Compiler Validation</h2>
|
|
|
|
<p>
|
|
A <a href="http://glean.sf.net" target="_parent">Glean</a> test has
|
|
been create to exercise the GLSL compiler.
|
|
</p>
|
|
<p>
|
|
The <em>glsl1</em> test runs over 170 sub-tests to check that the language
|
|
features and built-in functions work properly.
|
|
This test should be run frequently while working on the compiler to catch
|
|
regressions.
|
|
</p>
|
|
<p>
|
|
The test coverage is reasonably broad and complete but additional tests
|
|
should be added.
|
|
</p>
|
|
|
|
|
|
</BODY>
|
|
</HTML>
|