From 9787ba43eeaebe2d61c701f27a5b50d095533b9b Mon Sep 17 00:00:00 2001 From: Keith Randall Date: Mon, 10 Aug 2015 13:40:28 -0700 Subject: [PATCH] [dev.ssa] cmd/compile/internal/ssa: Update TODO list Change-Id: Ibcd4c6984c8728fd9ab76e0c7df555984deaf281 Reviewed-on: https://go-review.googlesource.com/13471 Reviewed-by: Josh Bleecher Snyder --- src/cmd/compile/internal/ssa/TODO | 129 +++++++++++------------ src/cmd/compile/internal/ssa/schedule.go | 2 + 2 files changed, 66 insertions(+), 65 deletions(-) diff --git a/src/cmd/compile/internal/ssa/TODO b/src/cmd/compile/internal/ssa/TODO index f77c5ad8f3..9f8225852c 100644 --- a/src/cmd/compile/internal/ssa/TODO +++ b/src/cmd/compile/internal/ssa/TODO @@ -1,71 +1,70 @@ -This is a list of things that need to be worked on. It is by no means complete. +This is a list of things that need to be worked on. It will hopefully +be complete soon. -Allocation -- Allocation of decls in stackalloc. Decls survive if they are - addrtaken or are too large for registerization. +Coverage +-------- +- Floating point numbers +- Complex numbers +- Integer division +- Fat objects (strings/slices/interfaces) vs. Phi +- Defer? +- Closure args +- PHEAP vars -Scheduling - - Make sure loads are scheduled correctly with respect to stores. - Same for flag type values. We can't have more than one value of - mem or flag types live at once. - - Reduce register pressure. Schedule instructions which kill - variables first. +Correctness +----------- +- GC maps +- Write barriers +- Debugging info +- Handle flags register correctly (clobber/spill/restore) +- Proper panic edges from checks & calls (+deferreturn) +- Can/should we move control values out of their basic block? +- Anything to do for the race detector? +- Slicing details (avoid ptr to next object) -Values - - Store *Type instead of Type? Keep an array of used Types in Func - and reference by id? Unify with the type ../gc so we just use a - pointer instead of an interface? - - Recycle dead values instead of using GC to do that. - - A lot of Aux fields are just int64. Add a separate AuxInt field? - If not that, then cache the interfaces that wrap int64s. - - OpStore uses 3 args. Increase the size of argstorage to 3? +Optimizations (better compiled code) +------------------------------------ +- Reduce register pressure in scheduler +- More strength reduction: multiply -> shift/add combos (Worth doing?) +- Strength reduction: constant divides -> multiply +- Expand current optimizations to all bit widths +- Nil/bounds check removal +- Combining nil checks with subsequent load +- Implement memory zeroing with REPSTOSQ and DuffZero +- Implement memory copying with REPMOVSQ and DuffCopy +- Make deadstore work with zeroing +- Branch prediction: Respect hints from the frontend, add our own +- Add a value range propagation pass (for bounds elim & bitwidth reduction) +- Stackalloc: group pointer-containing variables & spill slots together +- Stackalloc: organize values to allow good packing +- Regalloc: use arg slots as the home for arguments (don't copy args to locals) +- Reuse stack slots for noninterfering & compatible values (but see issue 8740) +- (x86) Combine loads into other ops +- (x86) More combining address arithmetic into loads/stores + +Optimizations (better compiler) +------------------------------- +- Smaller Value.Type (int32 or ptr)? Get rid of types altogether? +- Recycle dead Values (and Blocks) explicitly instead of using GC +- OpStore uses 3 args. Increase the size of Value.argstorage to 3? +- Constant cache +- Reuseable slices (e.g. []int of size NumValues()) cached in Func Regalloc - - Make less arch-dependent - - Don't spill everything at every basic block boundary. - - Allow args and return values to be ssa-able. - - Handle 2-address instructions. - - Floating point registers - - Make calls clobber all registers - - Make liveness analysis non-quadratic. - - Handle in-place instructions (like XORQconst) directly: - Use XORQ AX, 1 rather than MOVQ AX, BX; XORQ BX, 1. +-------- +- Make less arch-dependent +- Don't spill everything at every basic block boundary +- Allow args and return values to be ssa-able +- Handle 2-address instructions +- Make calls clobber all registers +- Make liveness analysis non-quadratic +- Materialization of constants -StackAlloc: - - Sort variables so all ptr-containing ones are first (so stack - maps are smaller) - - Reuse stack slots for noninterfering and type-compatible variables - (both AUTOs and spilled Values). But see issue 8740 for what - "type-compatible variables" mean and what DWARF information provides. - -Rewrites - - Strength reduction (both arch-indep and arch-dependent?) - - Start another architecture (arm?) - - 64-bit ops on 32-bit machines - - (MOVLstore x m) - to get rid of most of the MOVLQSX. - - Determine which nil checks can be done implicitly (by faulting) - and which need code generated, and do the code generation. - -Common-Subexpression Elimination - - Make better decision about which value in an equivalence class we should - choose to replace other values in that class. - - Can we move control values out of their basic block? - This would break nilcheckelim as currently implemented, - but it could be replaced by a similar CFG simplication pass. - - Investigate type equality. During SSA generation, should we use n.Type or (say) TypeBool? - Should we get rid of named types in favor of underlying types during SSA generation? - Should we introduce a new type equality routine that is less strict than the frontend's? - -Other - - Write barriers - - For testing, do something more sophisticated than - checkOpcodeCounts. Michael Matloob suggests using a similar - pattern matcher to the rewrite engine to check for certain - expression subtrees in the output. - - Implement memory zeroing with REPSTOSQ and DuffZero - - make deadstore work with zeroing. - - Add a value range propagation optimization pass. - Use it for bounds check elimination and bitwidth reduction. - - Branch prediction: Respect hints from the frontend, add our own. +Future/other +------------ +- Start another architecture (arm?) +- 64-bit ops on 32-bit machines +- Investigate type equality. During SSA generation, should we use n.Type or (say) TypeBool? +- Should we get rid of named types in favor of underlying types during SSA generation? +- Should we introduce a new type equality routine that is less strict than the frontend's? +- Infrastructure for enabling/disabling/configuring passes diff --git a/src/cmd/compile/internal/ssa/schedule.go b/src/cmd/compile/internal/ssa/schedule.go index 8388695fa8..de0b4acbf4 100644 --- a/src/cmd/compile/internal/ssa/schedule.go +++ b/src/cmd/compile/internal/ssa/schedule.go @@ -30,6 +30,8 @@ func schedule(f *Func) { for _, b := range f.Blocks { // Find store chain for block. + // Store chains for different blocks overwrite each other, so + // the calculated store chain is good only for this block. for _, v := range b.Values { if v.Op != OpPhi && v.Type.IsMemory() { for _, w := range v.Args {