Import Mesa 20.1.7

commit 7fa385c847
parent 76a0a851ab
@@ -36,7 +36,7 @@ depends on the particular driver being used.

 <h2>SHA256 checksum</h2>
 <pre>
-TBD.
+23bed40114b03ad640c95bfe72cc879ed2f941d0d481b77b5204a1fc567fa93c mesa-20.1.6.tar.xz
 </pre>

lib/mesa/docs/relnotes/20.1.7.html (new file, 169 lines)
@@ -0,0 +1,169 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
+<html lang="en">
+<head>
+<meta http-equiv="content-type" content="text/html; charset=utf-8">
+<title>Mesa Release Notes</title>
+<link rel="stylesheet" type="text/css" href="../mesa.css">
+</head>
+<body>
+
+<div class="header">
+<h1>The Mesa 3D Graphics Library</h1>
+</div>
+
+<iframe src="../contents.html"></iframe>
+<div class="content">
+
+<h1>Mesa 20.1.7 Release Notes / 2020-09-02</h1>
+
+<p>
+Mesa 20.1.7 is a bug fix release which fixes bugs found since the 20.1.6 release.
+</p>
+<p>
+Mesa 20.1.7 implements the OpenGL 4.6 API, but the version reported by
+glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
+glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
+Some drivers don't support all the features required in OpenGL 4.6. OpenGL
+4.6 is <strong>only</strong> available if requested at context creation.
+Compatibility contexts may report a lower version depending on each driver.
+</p>
+<p>
+Mesa 20.1.7 implements the Vulkan 1.2 API, but the version reported by
+the apiVersion property of the VkPhysicalDeviceProperties struct
+depends on the particular driver being used.
+</p>
+
+<h2>SHA256 checksum</h2>
+<pre>
+TBD.
+</pre>
+
+
+<h2>New features</h2>
+
+<ul>
+<li>None</li>
+</ul>
+
+<h2>Bug fixes</h2>
+
+<ul>
+<li>Road Redemption certain graphic effects rendered white color</li>
+<li>Intel Vulkan driver crash with alpha-to-coverage</li>
+<li>error: ‘static_assert’ was not declared in this scope</li>
+<li>vulkan/wsi/x11: deadlock with Xwayland when compositor holds multiple buffers</li>
+<li>[RADV/ACO] Death Stranding cause a GPU hung (*ERROR* Waiting for fences timed out!)</li>
+<li>lp_bld_init.c:172:7: error: implicit declaration of function ‘LLVMAddConstantPropagationPass’; did you mean ‘LLVMAddCorrelatedValuePropagationPass’? [-Werror=implicit-function-declaration]</li>
+<li>radv: blitting 3D images with linear filter</li>
+<li>&lt;&lt;MESA crashed&gt;&gt; Array Index Out of Range with Graphicsfuzz application</li>
+<li>Intel Vulkan driver assertion with small xfb buffer</li>
+</ul>
+
+<h2>Changes</h2>
+
+<ul>
+<p>Alejandro Piñeiro (2):</p>
+<li> v3d/packet: fix typo on Set InstanceID/PrimitiveID packet</li>
+<li> v3d: set instance id to 0 at start of tile</li>
+<p></p>
+<p>Alyssa Rosenzweig (6):</p>
+<li> panfrost: Fix blend leak for render targets 5-8</li>
+<li> panfrost: Free hash_to_temp map</li>
+<li> pan/mdg: Free previous liveness</li>
+<li> panfrost: Use memctx for sysvals</li>
+<li> panfrost: Free batch-&gt;dependencies</li>
+<li> pan/mdg: Fix perspective combination</li>
+<p></p>
+<p>Bas Nieuwenhuizen (1):</p>
+<li> radv: Fix 3d blits.</li>
+<p></p>
+<p>Danylo Piliaiev (3):</p>
+<li> glsl: Eliminate out-of-bounds triop_vector_insert</li>
+<li> ir_constant: Return zero on out-of-bounds vector accesses</li>
+<li> glsl: Eliminate assigments to out-of-bounds elements of vector</li>
+<p></p>
+<p>Emil Velikov (1):</p>
+<li> radv: restrict exported symbols with static llvm</li>
+<p></p>
+<p>Eric Engestrom (10):</p>
+<li> docs/relnotes: add sha256 sums to 20.1.6</li>
+<li> .pick_status.json: Update to e94c22429b64f419d9a66f04fa5ecdad33f7f5ef</li>
+<li> .pick_status.json: Mark 9146f596ed1e8854a2a6c9137396a902bc92946c as denominated</li>
+<li> .pick_status.json: Mark da6d0e3facfe0eb5c7db2d75d6992643d929caff as denominated</li>
+<li> .pick_status.json: Mark b5558f2d2aa738d90b9e039144ae3ca69bdf92ca as denominated</li>
+<li> .pick_status.json: Mark c9858fb941ce7e903f608e537b3657c946f86980 as denominated</li>
+<li> .pick_status.json: Mark ee77951714ff4373261befde6e84f592cc1c769c as denominated</li>
+<li> .pick_status.json: Mark 7c226116c6c0793d6d9a7dec52ac7cf54b82b57f as denominated</li>
+<li> .pick_status.json: Mark d7d7687829875e401690219d4a72458fb2bbe4de as denominated</li>
+<li> scons: bump c++ standard to 14 to match meson</li>
+<p></p>
+<p>Jason Ekstrand (5):</p>
+<li> clover/spirv: Don't call llvm::regularizeLlvmForSpirv</li>
+<li> intel/nir: Pass the nir_builder by reference in lower_alpha_to_coverage</li>
+<li> nir: Add a nir_metadata_all enum value</li>
+<li> intel/nir: Rewrite the guts of lower_alpha_to_coverage</li>
+<li> intel/fs: Fix MOV_INDIRECT and BROADCAST of Q types on Gen11+</li>
+<p></p>
+<p>Jonathan Gray (11):</p>
+<li> util: unbreak endian detection on OpenBSD</li>
+<li> util/anon_file: add OpenBSD shm_mkstemp() path</li>
+<li> meson: build with _ISOC11_SOURCE on OpenBSD</li>
+<li> meson: conditionally include -ldl in gbm pkg-config file</li>
+<li> util: futex fixes for OpenBSD</li>
+<li> util/u_thread: include pthread_np.h if found</li>
+<li> anv: use os_get_total_physical_memory()</li>
+<li> util/os_misc: add os_get_available_system_memory()</li>
+<li> anv: use os_get_available_system_memory()</li>
+<li> util/os_misc: os_get_available_system_memory() for OpenBSD</li>
+<li> vulkan: make VK_TIME_DOMAIN_CLOCK_MONOTONIC_RAW_EXT conditional</li>
+<p></p>
+<p>Lionel Landwerlin (3):</p>
+<li> anv: fix transform feedback surface size</li>
+<li> intel/perf: store query symbol name</li>
+<li> intel/perf: fix raw query kernel metric selection</li>
+<p></p>
+<p>Marek Olšák (3):</p>
+<li> st/mesa: don't generate NIR for ARB_vp/fp if NIR is not preferred</li>
+<li> radeonsi: fix tess levels coming as scalar arrays from SPIR-V</li>
+<li> gallivm: fix build on LLVM 12 due to LLVMAddConstantPropagationPass removal</li>
+<p></p>
+<p>Marek Vasut (2):</p>
+<li> etnaviv: Remove etna_resource_get_status()</li>
+<li> etnaviv: Add lock around pending_ctx</li>
+<p></p>
+<p>Nanley Chery (1):</p>
+<li> gallium/dri2: Report correct YUYV and UYVY plane count</li>
+<p></p>
+<p>Pierre Moreau (1):</p>
+<li> clover/spirv: Remove unused tuple header</li>
+<p></p>
+<p>Pierre-Eric Pelloux-Prayer (5):</p>
+<li> mesa/st: introduce PIPE_CAP_NO_CLIP_ON_COPY_TEX</li>
+<li> radeonsi: enable PIPE_CAP_NO_CLIP_ON_COPY_TEX</li>
+<li> ac/llvm: add option to clamp division by zero</li>
+<li> radeonsi,driconf: add clamp_div_by_zero option</li>
+<li> radeonsi: use radeonsi_clamp_div_by_zero for SPECviewperf13, Road Redemption</li>
+<p></p>
+<p>Rhys Perry (1):</p>
+<li> aco: fix non-rtz pack_half_2x16</li>
+<p></p>
+<p>Rob Clark (1):</p>
+<li> freedreno: handle case of shadowing current render target</li>
+<p></p>
+<p>Roman Gilg (2):</p>
+<li> vulkan/wsi/x11: add sent image counter</li>
+<li> vulkan/wsi/x11: wait for acquirable images in FIFO mode</li>
+<p></p>
+<p>Samuel Pitoiset (1):</p>
+<li> nir/algebraic: mark some optimizations with fsat(NaN) as inexact</li>
+<p></p>
+<p>Vinson Lee (1):</p>
+<li> vulkan: Fix memory leaks.</li>
+<p></p>
+<p></p>
+</ul>
+
+</div>
+</body>
+</html>
@@ -701,6 +701,9 @@ static void visit_alu(struct ac_nir_context *ctx, const nir_alu_instr *instr)
          result = emit_intrin_1f_param(&ctx->ac, "llvm.amdgcn.rcp",
                                        ac_to_float_type(&ctx->ac, def_type), src[0]);
       }
+      if (ctx->abi->clamp_div_by_zero)
+         result = ac_build_fmin(&ctx->ac, result,
+                                LLVMConstReal(ac_to_float_type(&ctx->ac, def_type), FLT_MAX));
       break;
    case nir_op_iand:
       result = LLVMBuildAnd(ctx->ac.builder, src[0], src[1], "");
@@ -847,6 +850,9 @@ static void visit_alu(struct ac_nir_context *ctx, const nir_alu_instr *instr)
    case nir_op_frsq:
       result = emit_intrin_1f_param(&ctx->ac, "llvm.amdgcn.rsq",
                                     ac_to_float_type(&ctx->ac, def_type), src[0]);
+      if (ctx->abi->clamp_div_by_zero)
+         result = ac_build_fmin(&ctx->ac, result,
+                                LLVMConstReal(ac_to_float_type(&ctx->ac, def_type), FLT_MAX));
       break;
    case nir_op_frexp_exp:
       src[0] = ac_to_float(&ctx->ac, src[0]);
@@ -186,6 +186,9 @@ struct ac_shader_abi {

    /* Whether bounds checks are required */
    bool robust_buffer_access;
+
+   /* Clamp div by 0 (so it won't produce NaN) */
+   bool clamp_div_by_zero;
 };

 #endif /* AC_SHADER_ABI_H */
@@ -157,6 +157,16 @@ if with_platform_android
   ]
 endif

+# When static linking LLVM, all its symbols are public API.
+# That may cause symbol collision, so explicitly demote everything.
+libvulkan_radeon_ld_args = []
+libvulkan_radeon_link_depends = []
+
+if with_llvm and with_ld_version_script
+  libvulkan_radeon_ld_args += ['-Wl,--version-script', join_paths(meson.current_source_dir(), 'vulkan.sym')]
+  libvulkan_radeon_link_depends += files('vulkan.sym')
+endif
+
 libvulkan_radeon = shared_library(
   'vulkan_radeon',
   [libradv_files, radv_entrypoints, radv_extensions_c, amd_vk_format_table_c, sha1_h, radv_gfx10_format_table_h],
@@ -173,7 +183,8 @@ libvulkan_radeon = shared_library(
   ],
   c_args : [c_vis_args, no_override_init_args, radv_flags],
   cpp_args : [cpp_vis_args, radv_flags],
-  link_args : [ld_args_bsymbolic, ld_args_gc_sections],
+  link_args : [ld_args_bsymbolic, ld_args_gc_sections, libvulkan_radeon_ld_args],
+  link_depends : [libvulkan_radeon_link_depends,],
   install : true,
 )
lib/mesa/src/amd/vulkan/vulkan.sym (new file, 11 lines)
@@ -0,0 +1,11 @@
+{
+	global:
+		vk_icdGetInstanceProcAddr;
+		vk_icdGetPhysicalDeviceProcAddr;
+		vk_icdNegotiateLoaderICDInterfaceVersion;
+
+	local:
+		# When static linking LLVM, all its symbols are public API.
+		# That may cause symbol collision, so explicitly demote everything.
+		*;
+};
@@ -630,11 +630,11 @@
   </packet>

   <packet code="54" name="Set InstanceID" cl="B" min_ver="41">
-    <field name="Instance ID" size="32" start="32" type="uint"/>
+    <field name="Instance ID" size="32" start="0" type="uint"/>
   </packet>

   <packet code="55" name="Set PrimitiveID" cl="B" min_ver="41">
-    <field name="Primitive ID" size="32" start="32" type="uint"/>
+    <field name="Primitive ID" size="32" start="0" type="uint"/>
   </packet>

   <packet code="56" name="Prim List Format">
@@ -136,16 +136,32 @@ vector_deref_visitor::visit_enter(ir_assignment *ir)
          ir->write_mask = (1 << new_lhs->type->vector_elements) - 1;
          ir->set_lhs(new_lhs);
       }
-   } else if (new_lhs->ir_type != ir_type_swizzle) {
-      ir->set_lhs(new_lhs);
-      ir->write_mask = 1 << old_index_constant->get_uint_component(0);
    } else {
-      /* If the "new" LHS is a swizzle, use the set_lhs helper to instead
-       * swizzle the RHS.
-       */
-      unsigned component[1] = { old_index_constant->get_uint_component(0) };
-      ir->set_lhs(new(mem_ctx) ir_swizzle(new_lhs, component, 1));
+      unsigned index = old_index_constant->get_uint_component(0);
+
+      if (index >= new_lhs->type->vector_elements) {
+         /* Section 5.11 (Out-of-Bounds Accesses) of the GLSL 4.60 spec says:
+          *
+          *    In the subsections described above for array, vector, matrix and
+          *    structure accesses, any out-of-bounds access produced undefined
+          *    behavior.... Out-of-bounds writes may be discarded or overwrite
+          *    other variables of the active program.
+          */
+         ir->remove();
+         return visit_continue;
+      }
+
+      if (new_lhs->ir_type != ir_type_swizzle) {
+         ir->set_lhs(new_lhs);
+         ir->write_mask = 1 << index;
+      } else {
+         /* If the "new" LHS is a swizzle, use the set_lhs helper to instead
+          * swizzle the RHS.
+          */
+         unsigned component[1] = { index };
+         ir->set_lhs(new(mem_ctx) ir_swizzle(new_lhs, component, 1));
+      }
    }

    return ir_rvalue_enter_visitor::visit_enter(ir);
 }
@@ -32,7 +32,8 @@ namespace {
 class vector_insert_visitor : public ir_rvalue_visitor {
 public:
    vector_insert_visitor(bool lower_nonconstant_index)
-      : progress(false), lower_nonconstant_index(lower_nonconstant_index)
+      : progress(false), lower_nonconstant_index(lower_nonconstant_index),
+        remove_assignment(false)
    {
       factory.instructions = &factory_instructions;
    }
@@ -43,11 +44,13 @@ public:
    }

    virtual void handle_rvalue(ir_rvalue **rv);
+   virtual ir_visitor_status visit_leave(ir_assignment *expr);

    ir_factory factory;
    exec_list factory_instructions;
    bool progress;
    bool lower_nonconstant_index;
+   bool remove_assignment;
 };

 } /* anonymous namespace */
@@ -68,6 +71,21 @@ vector_insert_visitor::handle_rvalue(ir_rvalue **rv)
    ir_constant *const idx =
       expr->operands[2]->constant_expression_value(factory.mem_ctx);
    if (idx != NULL) {
+      unsigned index = idx->value.u[0];
+
+      if (index >= expr->operands[0]->type->vector_elements) {
+         /* Section 5.11 (Out-of-Bounds Accesses) of the GLSL 4.60 spec says:
+          *
+          *    In the subsections described above for array, vector, matrix and
+          *    structure accesses, any out-of-bounds access produced undefined
+          *    behavior.... Out-of-bounds writes may be discarded or overwrite
+          *    other variables of the active program.
+          */
+         this->remove_assignment = true;
+         this->progress = true;
+         return;
+      }
+
       /* Replace (vector_insert (vec) (scalar) (index)) with a dereference of
        * a new temporary. The new temporary gets assigned as
        *
@@ -136,6 +154,19 @@ vector_insert_visitor::handle_rvalue(ir_rvalue **rv)
       base_ir->insert_before(factory.instructions);
    }
 }
+
+ir_visitor_status
+vector_insert_visitor::visit_leave(ir_assignment *ir)
+{
+   ir_rvalue_visitor::visit_leave(ir);
+
+   if (this->remove_assignment) {
+      ir->remove();
+      this->remove_assignment = false;
+   }
+
+   return visit_continue;
+}

 bool
 lower_vector_insert(exec_list *instructions, bool lower_nonconstant_index)
 {
@@ -433,6 +433,9 @@ u_pipe_screen_get_param_defaults(struct pipe_screen *pscreen,
    case PIPE_CAP_ALPHA_TO_COVERAGE_DITHER_CONTROL:
       return 0;

+   case PIPE_CAP_NO_CLIP_ON_COPY_TEX:
+      return 0;
+
    default:
       unreachable("bad PIPE_CAP_*");
    }
@@ -166,7 +166,7 @@ panfrost_delete_blend_state(struct pipe_context *pipe,
 {
    struct panfrost_blend_state *blend = (struct panfrost_blend_state *) cso;

-   for (unsigned c = 0; c < 4; ++c) {
+   for (unsigned c = 0; c < PIPE_MAX_COLOR_BUFS; ++c) {
       struct panfrost_blend_rt *rt = &blend->rt[c];
       _mesa_hash_table_u64_clear(rt->shaders, panfrost_delete_blend_shader);
    }
@@ -178,6 +178,8 @@ panfrost_free_batch(struct panfrost_batch *batch)
       panfrost_batch_fence_unreference(*dep);
    }

+   util_dynarray_fini(&batch->dependencies);
+
    /* The out_sync fence lifetime is different from the the batch one
     * since other batches might want to wait on a fence of already
     * submitted/signaled batch. All we need to do here is make sure the
@@ -7,5 +7,6 @@ OPT_BOOL(halt_shaders, false, "Halt shaders at the start (will hang)")
 OPT_BOOL(vs_fetch_always_opencode, false,
          "Always open code vertex fetches (less efficient, purely for testing)")
 OPT_BOOL(prim_restart_tri_strips_only, false, "Only enable primitive restart for triangle strips")
+OPT_BOOL(clamp_div_by_zero, false, "Clamp div by zero (x / 0 becomes FLT_MAX instead of NaN)")

 #undef OPT_BOOL
@@ -162,6 +162,7 @@ static int si_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
    case PIPE_CAP_DRAW_INFO_START_WITH_USER_INDICES:
    case PIPE_CAP_ALPHA_TO_COVERAGE_DITHER_CONTROL:
    case PIPE_CAP_MAP_UNSYNCHRONIZED_THREAD_SAFE:
+   case PIPE_CAP_NO_CLIP_ON_COPY_TEX:
       return 1;

    case PIPE_CAP_QUERY_SO_OVERFLOW:
@@ -452,6 +452,7 @@ bool si_nir_build_llvm(struct si_shader_context *ctx, struct nir_shader *nir)
    ctx->abi.inputs = &ctx->inputs[0];
    ctx->abi.clamp_shadow_reference = true;
    ctx->abi.robust_buffer_access = true;
+   ctx->abi.clamp_div_by_zero = ctx->screen->options.clamp_div_by_zero;

    if (ctx->shader->selector->info.properties[TGSI_PROPERTY_CS_LOCAL_SIZE]) {
       assert(gl_shader_stage_is_compute(nir->info.stage));
@@ -513,7 +513,7 @@ static void si_nir_store_output_tcs(struct ac_shader_abi *abi, const struct nir_
 {
    struct si_shader_context *ctx = si_shader_context_from_abi(abi);
    struct si_shader_info *info = &ctx->shader->selector->info;
-   const unsigned component = var->data.location_frac;
+   unsigned component = var->data.location_frac;
    unsigned driver_location = var->data.driver_location;
    LLVMValueRef dw_addr, stride;
    LLVMValueRef buffer, base, addr;
@@ -521,6 +521,12 @@ static void si_nir_store_output_tcs(struct ac_shader_abi *abi, const struct nir_
    bool skip_lds_store;
    bool is_tess_factor = false, is_tess_inner = false;

+   if (var->data.compact) {
+      component += const_index;
+      writemask <<= const_index;
+      const_index = 0;
+   }
+
    driver_location = driver_location / 4;
    ubyte name = info->output_semantic_name[driver_location];
    ubyte index = info->output_semantic_index[driver_location];
@@ -440,6 +440,13 @@ v3d_rcl_emit_generic_per_tile_list(struct v3d_job *job, int layer)
          fmt.primitive_type = LIST_TRIANGLES;
       }

+#if V3D_VERSION >= 41
+      /* PTB assumes that value to be 0, but hw will not set it. */
+      cl_emit(cl, SET_INSTANCEID, set) {
+         set.instance_id = 0;
+      }
+#endif
+
       cl_emit(cl, BRANCH_TO_IMPLICIT_TILE_LIST, branch);

       v3d_rcl_emit_stores(job, cl, layer);
@@ -22,7 +22,6 @@

 #include "invocation.hpp"

-#include <tuple>
 #include <unordered_map>
 #include <unordered_set>
 #include <vector>
@@ -56,114 +56,134 @@
  * 1.0000 1111111111111111
  */
 static nir_ssa_def *
-build_dither_mask(nir_builder b, nir_intrinsic_instr *store_instr)
+build_dither_mask(nir_builder *b, nir_ssa_def *color)
 {
-   nir_ssa_def *alpha =
-      nir_channel(&b, nir_ssa_for_src(&b, store_instr->src[0], 4), 3);
+   assert(color->num_components == 4);
+   nir_ssa_def *alpha = nir_channel(b, color, 3);

    nir_ssa_def *m =
-      nir_f2i32(&b, nir_fmul_imm(&b, nir_fsat(&b, alpha), 16.0));
+      nir_f2i32(b, nir_fmul_imm(b, nir_fsat(b, alpha), 16.0));

    nir_ssa_def *part_a =
-      nir_iand(&b,
-               nir_imm_int(&b, 0xf),
-               nir_ushr(&b,
-                        nir_imm_int(&b, 0xfea80),
-                        nir_iand(&b, m, nir_imm_int(&b, ~3))));
+      nir_iand(b,
+               nir_imm_int(b, 0xf),
+               nir_ushr(b,
+                        nir_imm_int(b, 0xfea80),
+                        nir_iand(b, m, nir_imm_int(b, ~3))));

-   nir_ssa_def *part_b = nir_iand(&b, m, nir_imm_int(&b, 2));
+   nir_ssa_def *part_b = nir_iand(b, m, nir_imm_int(b, 2));

-   nir_ssa_def *part_c = nir_iand(&b, m, nir_imm_int(&b, 1));
+   nir_ssa_def *part_c = nir_iand(b, m, nir_imm_int(b, 1));

-   return nir_ior(&b,
-                  nir_imul_imm(&b, part_a, 0x1111),
-                  nir_ior(&b,
-                          nir_imul_imm(&b, part_b, 0x0808),
-                          nir_imul_imm(&b, part_c, 0x0100)));
+   return nir_ior(b,
+                  nir_imul_imm(b, part_a, 0x1111),
+                  nir_ior(b,
+                          nir_imul_imm(b, part_b, 0x0808),
+                          nir_imul_imm(b, part_c, 0x0100)));
 }

-void
+bool
 brw_nir_lower_alpha_to_coverage(nir_shader *shader)
 {
    assert(shader->info.stage == MESA_SHADER_FRAGMENT);

+   nir_function_impl *impl = nir_shader_get_entrypoint(shader);
+
    /* Bail out early if we don't have gl_SampleMask */
-   bool is_sample_mask = false;
-   nir_foreach_variable(var, &shader->outputs) {
-      if (var->data.location == FRAG_RESULT_SAMPLE_MASK) {
-         is_sample_mask = true;
-         break;
-      }
-   }
-
-   if (!is_sample_mask)
-      return;
-
-   nir_foreach_function(function, shader) {
-      nir_function_impl *impl = function->impl;
-      nir_builder b;
-      nir_builder_init(&b, impl);
-
-      nir_foreach_block(block, impl) {
-         nir_intrinsic_instr *sample_mask_instr = NULL;
-         nir_intrinsic_instr *store_instr = NULL;
-
-         nir_foreach_instr_safe(instr, block) {
-            if (instr->type == nir_instr_type_intrinsic) {
-               nir_intrinsic_instr *intr = nir_instr_as_intrinsic(instr);
-               nir_variable *out = NULL;
-
-               switch (intr->intrinsic) {
-               case nir_intrinsic_store_output:
-                  nir_foreach_variable(var, &shader->outputs) {
-                     int drvloc = var->data.driver_location;
-                     if (nir_intrinsic_base(intr) == drvloc) {
-                        out = var;
-                        break;
-                     }
-                  }
-
-                  if (out->data.mode != nir_var_shader_out)
-                     continue;
-
-                  /* save gl_SampleMask instruction pointer */
-                  if (out->data.location == FRAG_RESULT_SAMPLE_MASK) {
-                     assert(!sample_mask_instr);
-                     sample_mask_instr = intr;
-                  }
-
-                  /* save out_color[0] instruction pointer */
-                  if ((out->data.location == FRAG_RESULT_COLOR ||
-                       out->data.location == FRAG_RESULT_DATA0)) {
-                     nir_src *offset_src = nir_get_io_offset_src(intr);
-                     if (nir_src_is_const(*offset_src) && nir_src_as_uint(*offset_src) == 0) {
-                        assert(!store_instr);
-                        store_instr = intr;
-                     }
-                  }
-                  break;
-               default:
-                  continue;
-               }
-            }
-         }
-
-         if (sample_mask_instr && store_instr) {
-            b.cursor = nir_before_instr(&store_instr->instr);
-            nir_ssa_def *dither_mask = build_dither_mask(b, store_instr);
-
-            /* Combine dither_mask and reorder gl_SampleMask store instruction
-             * after render target 0 store instruction.
-             */
-            nir_instr_remove(&sample_mask_instr->instr);
-            dither_mask = nir_iand(&b, sample_mask_instr->src[0].ssa, dither_mask);
-            nir_instr_insert_after(&store_instr->instr, &sample_mask_instr->instr);
-            nir_instr_rewrite_src(&sample_mask_instr->instr,
-                                  &sample_mask_instr->src[0],
-                                  nir_src_for_ssa(dither_mask));
-         }
-      }
-      nir_metadata_preserve(impl, nir_metadata_block_index |
-                                  nir_metadata_dominance);
-   }
+   const uint64_t outputs_written = shader->info.outputs_written;
+   if (!(outputs_written & BITFIELD64_BIT(FRAG_RESULT_SAMPLE_MASK)) ||
+       !(outputs_written & (BITFIELD64_BIT(FRAG_RESULT_COLOR) |
+                            BITFIELD64_BIT(FRAG_RESULT_DATA0))))
+      goto skip;
+
+   nir_intrinsic_instr *sample_mask_write = NULL;
+   nir_intrinsic_instr *color0_write = NULL;
+   bool sample_mask_write_first = false;
+
+   nir_foreach_block(block, impl) {
+      nir_foreach_instr_safe(instr, block) {
+         if (instr->type != nir_instr_type_intrinsic)
+            continue;
+
+         nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);
+         if (intrin->intrinsic != nir_intrinsic_store_output)
+            continue;
+
+         /* We call nir_lower_io_to_temporaries to lower FS outputs to
+          * temporaries with a copy at the end so this should be the last
+          * block in the shader.
+          */
+         assert(block->cf_node.parent == &impl->cf_node);
+         assert(nir_cf_node_is_last(&block->cf_node));
+
+         /* See store_output in fs_visitor::nir_emit_fs_intrinsic */
+         const unsigned store_offset = nir_src_as_uint(intrin->src[1]);
+         const unsigned driver_location = nir_intrinsic_base(intrin) +
+            SET_FIELD(store_offset, BRW_NIR_FRAG_OUTPUT_LOCATION);
+
+         /* Extract the FRAG_RESULT */
+         const unsigned location =
+            GET_FIELD(driver_location, BRW_NIR_FRAG_OUTPUT_LOCATION);
+
+         if (location == FRAG_RESULT_SAMPLE_MASK) {
+            assert(sample_mask_write == NULL);
+            sample_mask_write = intrin;
+            sample_mask_write_first = (color0_write == NULL);
+         }
+
+         if (location == FRAG_RESULT_COLOR ||
+             location == FRAG_RESULT_DATA0) {
+            assert(color0_write == NULL);
+            color0_write = intrin;
+         }
+      }
+   }
+
+   /* It's possible that shader_info may be out-of-date and the writes to
+    * either gl_SampleMask or the first color value may have been removed.
+    * This can happen if, for instance a nir_ssa_undef is written to the
+    * color value. In that case, just bail and don't do anything rather
+    * than crashing.
+    */
+   if (color0_write == NULL || sample_mask_write == NULL)
+      goto skip;
+
+   /* It's possible that the color value isn't actually a vec4. In this case,
+    * assuming an alpha of 1.0 and letting the sample mask pass through
+    * unaltered seems like the kindest thing to do to apps.
+    */
+   assert(color0_write->src[0].is_ssa);
+   nir_ssa_def *color0 = color0_write->src[0].ssa;
+   if (color0->num_components < 4)
+      goto skip;
+
+   assert(sample_mask_write->src[0].is_ssa);
+   nir_ssa_def *sample_mask = sample_mask_write->src[0].ssa;
+
+   if (sample_mask_write_first) {
+      /* If the sample mask write comes before the write to color0, we need
+       * to move it because it's going to use the value from color0 to
+       * compute the sample mask.
+       */
+      nir_instr_remove(&sample_mask_write->instr);
+      nir_instr_insert(nir_after_instr(&color0_write->instr),
+                       &sample_mask_write->instr);
+   }
+
+   nir_builder b;
+   nir_builder_init(&b, impl);
+
+   /* Combine dither_mask and the gl_SampleMask value */
+   b.cursor = nir_before_instr(&sample_mask_write->instr);
+   nir_ssa_def *dither_mask = build_dither_mask(&b, color0);
+   dither_mask = nir_iand(&b, sample_mask, dither_mask);
+   nir_instr_rewrite_src(&sample_mask_write->instr,
+                         &sample_mask_write->src[0],
+                         nir_src_for_ssa(dither_mask));
+
+   nir_metadata_preserve(impl, nir_metadata_block_index |
+                               nir_metadata_dominance);
+   return true;
+
+skip:
+   nir_metadata_preserve(impl, nir_metadata_all);
+   return false;
 }
|
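The rewritten pass above ANDs a dither mask derived from the color0 write into the gl_SampleMask write, so alpha-to-coverage dithering and the shader's own sample mask both take effect. A rough scalar sketch of that combination (the real `build_dither_mask` emits NIR instructions; the mask formula and helper names here are assumptions for illustration only):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model: derive a coverage mask from the fragment's
 * alpha (alpha-to-coverage dithering) and AND it with the shader's
 * gl_SampleMask write, mirroring the nir_iand in the pass above. */
static uint32_t
model_dither_mask(float alpha, unsigned num_samples)
{
   if (alpha <= 0.0f)
      return 0;
   /* Enable roughly alpha * num_samples samples, low bits first. */
   unsigned enabled = (unsigned)(alpha * num_samples + 0.5f);
   if (enabled > num_samples)
      enabled = num_samples;
   return (enabled >= 32) ? 0xffffffffu : ((1u << enabled) - 1);
}

static uint32_t
model_lower_sample_mask(float alpha, uint32_t gl_sample_mask,
                        unsigned num_samples)
{
   return gl_sample_mask & model_dither_mask(alpha, num_samples);
}
```

With full alpha the shader's mask passes through unchanged; with zero alpha the fragment contributes no samples, whatever gl_SampleMask says.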
@@ -621,6 +621,19 @@ load_oa_metrics(struct gen_perf_config *perf, int fd,
    else
       enumerate_sysfs_metrics(perf);
 
+   /* Select a fallback OA metric. Look for the TestOa metric or use the last
+    * one if not present (on HSW).
+    */
+   for (int i = 0; i < perf->n_queries; i++) {
+      if (perf->queries[i].symbol_name &&
+          strcmp(perf->queries[i].symbol_name, "TestOa") == 0) {
+         perf->fallback_raw_oa_metric = perf->queries[i].oa_metrics_set_id;
+         break;
+      }
+   }
+   if (perf->fallback_raw_oa_metric == 0)
+      perf->fallback_raw_oa_metric = perf->queries[perf->n_queries - 1].oa_metrics_set_id;
+
    return true;
 }
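The loop above implements a simple select-by-name-else-last policy for the fallback metric. A standalone sketch of the same logic, using hypothetical stand-ins for the gen_perf query table (not the real types):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for gen_perf's query entries. */
struct model_query {
   const char *symbol_name;
   uint64_t oa_metrics_set_id;
};

/* Prefer the TestOa metric if present, otherwise fall back to the last
 * entry -- the same selection the hunk above performs at load time. */
static uint64_t
pick_fallback_metric(const struct model_query *queries, int n_queries)
{
   uint64_t fallback = 0;
   for (int i = 0; i < n_queries; i++) {
      if (queries[i].symbol_name &&
          strcmp(queries[i].symbol_name, "TestOa") == 0) {
         fallback = queries[i].oa_metrics_set_id;
         break;
      }
   }
   if (fallback == 0)
      fallback = queries[n_queries - 1].oa_metrics_set_id;
   return fallback;
}
```

Caching this at load time lets `get_metric_id` use a real configuration as its fallback instead of the hardcoded set ID 1.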
@@ -170,6 +170,7 @@ struct gen_perf_query_info {
       GEN_PERF_QUERY_TYPE_PIPELINE,
    } kind;
    const char *name;
+   const char *symbol_name;
    const char *guid;
    struct gen_perf_query_counter *counters;
    int n_counters;
@@ -227,6 +228,11 @@ struct gen_perf_config {
    */
   struct hash_table *oa_metrics_table;
 
+   /* When MDAPI hasn't configured the metric we need to use by the time the
+    * query begins, this OA metric is used as a fallback.
+    */
+   uint64_t fallback_raw_oa_metric;
+
    /* Location of the device's sysfs entry. */
    char sysfs_dev_dir[256];
@@ -667,6 +667,7 @@ def main():
 
         c(".kind = GEN_PERF_QUERY_TYPE_OA,\n")
         c(".name = \"" + set.name + "\",\n")
+        c(".symbol_name = \"" + set.symbol_name + "\",\n")
         c(".guid = \"" + set.hw_config_guid + "\",\n")
 
         c(".counters = {0}_{1}_query_counters,".format(gen.chipset, set.underscore_name))
@@ -423,7 +423,7 @@ get_metric_id(struct gen_perf_config *perf,
    if (!gen_perf_load_metric_id(perf, query->guid,
                                 &raw_query->oa_metrics_set_id)) {
       DBG("Unable to read query guid=%s ID, falling back to test config\n", query->guid);
-      raw_query->oa_metrics_set_id = 1ULL;
+      raw_query->oa_metrics_set_id = perf->fallback_raw_oa_metric;
    } else {
       DBG("Raw query '%s'guid=%s loaded ID: %"PRIu64"\n",
           query->name, query->guid, query->oa_metrics_set_id);
@@ -1076,7 +1076,7 @@ bifrost_compile_shader_nir(nir_shader *nir, panfrost_program *program, unsigned
    bi_optimize_nir(nir);
    nir_print_shader(nir, stdout);
 
-   panfrost_nir_assign_sysvals(&ctx->sysvals, nir);
+   panfrost_nir_assign_sysvals(&ctx->sysvals, ctx, nir);
    program->sysval_count = ctx->sysvals.sysval_count;
    memcpy(program->sysvals, ctx->sysvals.sysvals, sizeof(ctx->sysvals.sysvals[0]) * ctx->sysvals.sysval_count);
    ctx->blend_types = program->blend_types;
@@ -257,8 +257,6 @@ typedef struct compiler_context {
    /* Constants which have been loaded, for later inlining */
    struct hash_table_u64 *ssa_constants;
 
-   /* Mapping of hashes computed from NIR indices to the sequential temp indices ultimately used in MIR */
-   struct hash_table_u64 *hash_to_temp;
    int temp_count;
    int max_hash;
 
@@ -2562,7 +2562,6 @@ midgard_compile_shader_nir(nir_shader *nir, panfrost_program *program, bool is_b
    /* Initialize at a global (not block) level hash tables */
 
    ctx->ssa_constants = _mesa_hash_table_u64_create(NULL);
-   ctx->hash_to_temp = _mesa_hash_table_u64_create(NULL);
 
    /* Lower gl_Position pre-optimisation, but after lowering vars to ssa
    * (so we don't accidentally duplicate the epilogue since mesa/st has
@@ -2598,7 +2597,7 @@ midgard_compile_shader_nir(nir_shader *nir, panfrost_program *program, bool is_b
    /* Assign sysvals and counts, now that we're sure
    * (post-optimisation) */
 
-   panfrost_nir_assign_sysvals(&ctx->sysvals, nir);
+   panfrost_nir_assign_sysvals(&ctx->sysvals, ctx, nir);
    program->sysval_count = ctx->sysvals.sysval_count;
    memcpy(program->sysvals, ctx->sysvals.sysvals, sizeof(ctx->sysvals.sysvals[0]) * ctx->sysvals.sysval_count);
@@ -88,6 +88,7 @@ midgard_opt_combine_projection(compiler_context *ctx, midgard_block *block)
         }
 
         if (!frcp_found) continue;
+        if (frcp_from != ins->src[0]) continue;
         if (frcp_component != COMPONENT_W && frcp_component != COMPONENT_Z) continue;
         if (!mir_single_use(ctx, frcp)) continue;
@@ -30,13 +30,14 @@
  * as such */
 
 static unsigned
-find_or_allocate_temp(compiler_context *ctx, unsigned hash)
+find_or_allocate_temp(compiler_context *ctx, struct hash_table_u64 *map,
+                      unsigned hash)
 {
         if (hash >= SSA_FIXED_MINIMUM)
                 return hash;
 
         unsigned temp = (uintptr_t) _mesa_hash_table_u64_search(
-                        ctx->hash_to_temp, hash + 1);
+                        map, hash + 1);
 
         if (temp)
                 return temp - 1;
@@ -45,7 +46,7 @@ find_or_allocate_temp(compiler_context *ctx, unsigned hash)
         temp = ctx->temp_count++;
         ctx->max_hash = MAX2(ctx->max_hash, hash);
 
-        _mesa_hash_table_u64_insert(ctx->hash_to_temp,
+        _mesa_hash_table_u64_insert(map,
                         hash + 1, (void *) ((uintptr_t) temp + 1));
 
         return temp;
@@ -57,10 +58,10 @@ find_or_allocate_temp(compiler_context *ctx, unsigned hash)
 void
 mir_squeeze_index(compiler_context *ctx)
 {
+        struct hash_table_u64 *map = _mesa_hash_table_u64_create(NULL);
+
         /* Reset */
         ctx->temp_count = 0;
-        /* TODO don't leak old hash_to_temp */
-        ctx->hash_to_temp = _mesa_hash_table_u64_create(NULL);
 
         /* We need to prioritize texture registers on older GPUs so we don't
          * fail RA trying to assign to work registers r0/r1 when a work
@@ -68,14 +69,16 @@ mir_squeeze_index(compiler_context *ctx)
 
         mir_foreach_instr_global(ctx, ins) {
                 if (ins->type == TAG_TEXTURE_4)
-                        ins->dest = find_or_allocate_temp(ctx, ins->dest);
+                        ins->dest = find_or_allocate_temp(ctx, map, ins->dest);
         }
 
         mir_foreach_instr_global(ctx, ins) {
                 if (ins->type != TAG_TEXTURE_4)
-                        ins->dest = find_or_allocate_temp(ctx, ins->dest);
+                        ins->dest = find_or_allocate_temp(ctx, map, ins->dest);
 
                 for (unsigned i = 0; i < ARRAY_SIZE(ins->src); ++i)
-                        ins->src[i] = find_or_allocate_temp(ctx, ins->src[i]);
+                        ins->src[i] = find_or_allocate_temp(ctx, map, ins->src[i]);
         }
 
+        _mesa_hash_table_u64_destroy(map, NULL);
 }
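The change above replaces the leaked `ctx->hash_to_temp` table with a local map that is created, used, and destroyed within one `mir_squeeze_index` call. The renaming trick itself stores `temp + 1` so that a lookup result of 0 means "not seen yet". A minimal sketch, assuming a small dense key space so a plain array can stand in for `_mesa_hash_table_u64`:

```c
#include <assert.h>

/* Sketch of the squeeze renaming: map sparse indices to sequential
 * temps. Values are stored as temp + 1 so 0 can mean "unused", the
 * same convention the real code uses with hash + 1 / temp + 1. */
#define MODEL_MAX_KEYS 1024

struct model_squeeze_map {
   unsigned slots[MODEL_MAX_KEYS]; /* 0 = unused, else temp + 1 */
   unsigned temp_count;
};

static unsigned
model_find_or_allocate_temp(struct model_squeeze_map *map, unsigned hash)
{
   unsigned temp = map->slots[hash];
   if (temp)
      return temp - 1; /* already renamed */

   /* First sighting: hand out the next sequential temp. */
   temp = map->temp_count++;
   map->slots[hash] = temp + 1;
   return temp;
}
```

Because the map lives only for the duration of one squeeze (like the local `map` in the hunk), nothing is leaked between invocations.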
@@ -77,7 +77,7 @@ struct panfrost_sysvals {
 };
 
 void
-panfrost_nir_assign_sysvals(struct panfrost_sysvals *ctx, nir_shader *shader);
+panfrost_nir_assign_sysvals(struct panfrost_sysvals *ctx, void *memctx, nir_shader *shader);
 
 int
 panfrost_sysval_for_instr(nir_instr *instr, nir_dest *dest);
@@ -128,11 +128,13 @@ pan_compute_liveness(
                         _mesa_hash_pointer,
                         _mesa_key_pointer_equal);
 
-        /* Allocate */
+        /* Free any previous liveness, and allocate */
+
+        pan_free_liveness(blocks);
 
         list_for_each_entry(pan_block, block, blocks, link) {
-                block->live_in = rzalloc_array(NULL, uint16_t, temp_count);
-                block->live_out = rzalloc_array(NULL, uint16_t, temp_count);
+                block->live_in = rzalloc_array(block, uint16_t, temp_count);
+                block->live_out = rzalloc_array(block, uint16_t, temp_count);
         }
 
         /* Initialize the work list with the exit block */
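The hunk above fixes two leaks at once: it frees any liveness arrays left over from a previous run before reallocating, and it parents the new arrays to their block (`rzalloc_array(block, ...)`) so freeing the block frees them. A simplified sketch of the free-before-reallocate half, using plain libc allocation and a hypothetical block struct rather than ralloc:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a pan_block that owns its liveness arrays. */
struct model_block {
   uint16_t *live_in;
   uint16_t *live_out;
};

/* Free any previous liveness, then allocate fresh zeroed arrays --
 * the pattern pan_compute_liveness now follows so that calling it
 * repeatedly neither leaks nor keeps stale data. */
static void
model_compute_liveness(struct model_block *b, size_t temp_count)
{
   free(b->live_in);
   free(b->live_out);
   b->live_in = calloc(temp_count, sizeof(uint16_t));
   b->live_out = calloc(temp_count, sizeof(uint16_t));
}
```

In the real code the parenting change additionally means the arrays cannot outlive their block, which libc `free`/`calloc` cannot express; ralloc's hierarchical ownership handles that part.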
@@ -124,10 +124,10 @@ panfrost_nir_assign_sysval_body(struct panfrost_sysvals *ctx, nir_instr *instr)
 }
 
 void
-panfrost_nir_assign_sysvals(struct panfrost_sysvals *ctx, nir_shader *shader)
+panfrost_nir_assign_sysvals(struct panfrost_sysvals *ctx, void *memctx, nir_shader *shader)
 {
         ctx->sysval_count = 0;
-        ctx->sysval_to_id = _mesa_hash_table_u64_create(NULL);
+        ctx->sysval_to_id = _mesa_hash_table_u64_create(memctx);
 
         nir_foreach_function(function, shader) {
                 if (!function->impl) continue;
@@ -637,6 +637,12 @@ TODO: document the other workarounds.
         <application name="Peace, Death!" executable="runner" sha1="5b909f3d21799773370adf084f649848f098234e">
             <option name="radeonsi_sync_compile" value="true" />
         </application>
+        <application name="SPECviewperf13" executable="viewperf">
+            <option name="radeonsi_clamp_div_by_zero" value="true" />
+        </application>
+        <application name="Road Redemption" executable="RoadRedemption.x86_64">
+            <option name="radeonsi_clamp_div_by_zero" value="true" />
+        </application>
     </device>
     <device driver="virtio_gpu">
         <!-- Some Valve games do a final blit to a BRGA_sRGB surface. On a GLES
@@ -137,14 +137,17 @@ static VkResult device_select_CreateInstance(const VkInstanceCreateInfo *pCreate
    PFN_vkCreateInstance fpCreateInstance =
       (PFN_vkCreateInstance)info->GetInstanceProcAddr(NULL, "vkCreateInstance");
    if (fpCreateInstance == NULL) {
+      free(info);
       return VK_ERROR_INITIALIZATION_FAILED;
    }
 
    chain_info->u.pLayerInfo = chain_info->u.pLayerInfo->pNext;
 
    VkResult result = fpCreateInstance(pCreateInfo, pAllocator, pInstance);
-   if (result != VK_SUCCESS)
+   if (result != VK_SUCCESS) {
+      free(info);
       return result;
+   }
 
    for (unsigned i = 0; i < pCreateInfo->enabledExtensionCount; i++) {
       if (!strcmp(pCreateInfo->ppEnabledExtensionNames[i], VK_KHR_GET_PHYSICAL_DEVICE_PROPERTIES_2_EXTENSION_NAME))
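The device_select hunk above is a classic leak fix: once `info` has been allocated, every early-return path must free it, not just the success path. A standalone sketch of the pattern with hypothetical names (the real code allocates layer chain bookkeeping and returns `VkResult`):

```c
#include <assert.h>
#include <stdlib.h>

/* Once a resource is allocated, both the failure return and the success
 * return must release it -- the shape of the fix in the hunk above. */
static int
model_create_with_cleanup(int (*downstream_create)(void))
{
   char *info = malloc(64);
   if (!info)
      return -1;

   int result = downstream_create();
   if (result != 0) {
      free(info); /* this free was the one missing before the fix */
      return result;
   }

   /* ... use info ... */
   free(info);
   return 0;
}

static int model_failing_create(void)    { return 5; }
static int model_succeeding_create(void) { return 0; }
```

Adding braces when the one-line `if` body grows to two statements (as the hunk does for `if (result != VK_SUCCESS)`) is what keeps the `free` and the `return` on the same path.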