mirror of https://github.com/golang/go synced 2024-10-05 20:31:20 -06:00

[dev.ssa] cmd/compile/internal/ssa: Update TODO list

Change-Id: Ibcd4c6984c8728fd9ab76e0c7df555984deaf281
Reviewed-on: https://go-review.googlesource.com/13471
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Keith Randall 2015-08-10 13:40:28 -07:00
parent baf2c3ec4a
commit 9787ba43ee
2 changed files with 66 additions and 65 deletions

@@ -1,71 +1,70 @@
-This is a list of things that need to be worked on. It is by no means complete.
+This is a list of things that need to be worked on. It will hopefully
+be complete soon.
 
-Allocation
-- Allocation of decls in stackalloc. Decls survive if they are
-  addrtaken or are too large for registerization.
+Coverage
+--------
+- Floating point numbers
+- Complex numbers
+- Integer division
+- Fat objects (strings/slices/interfaces) vs. Phi
+- Defer?
+- Closure args
+- PHEAP vars
 
-Scheduling
-- Make sure loads are scheduled correctly with respect to stores.
-  Same for flag type values. We can't have more than one value of
-  mem or flag types live at once.
-- Reduce register pressure. Schedule instructions which kill
-  variables first.
+Correctness
+-----------
+- GC maps
+- Write barriers
+- Debugging info
+- Handle flags register correctly (clobber/spill/restore)
+- Proper panic edges from checks & calls (+deferreturn)
+- Can/should we move control values out of their basic block?
+- Anything to do for the race detector?
+- Slicing details (avoid ptr to next object)
 
-Values
-- Store *Type instead of Type? Keep an array of used Types in Func
-  and reference by id? Unify with the type ../gc so we just use a
-  pointer instead of an interface?
-- Recycle dead values instead of using GC to do that.
-- A lot of Aux fields are just int64. Add a separate AuxInt field?
-  If not that, then cache the interfaces that wrap int64s.
-- OpStore uses 3 args. Increase the size of argstorage to 3?
+Optimizations (better compiled code)
+------------------------------------
+- Reduce register pressure in scheduler
+- More strength reduction: multiply -> shift/add combos (Worth doing?)
+- Strength reduction: constant divides -> multiply
+- Expand current optimizations to all bit widths
+- Nil/bounds check removal
+- Combining nil checks with subsequent load
+- Implement memory zeroing with REPSTOSQ and DuffZero
+- Implement memory copying with REPMOVSQ and DuffCopy
+- Make deadstore work with zeroing
+- Branch prediction: Respect hints from the frontend, add our own
+- Add a value range propagation pass (for bounds elim & bitwidth reduction)
+- Stackalloc: group pointer-containing variables & spill slots together
+- Stackalloc: organize values to allow good packing
+- Regalloc: use arg slots as the home for arguments (don't copy args to locals)
+- Reuse stack slots for noninterfering & compatible values (but see issue 8740)
+- (x86) Combine loads into other ops
+- (x86) More combining address arithmetic into loads/stores
+
+Optimizations (better compiler)
+-------------------------------
+- Smaller Value.Type (int32 or ptr)? Get rid of types altogether?
+- Recycle dead Values (and Blocks) explicitly instead of using GC
+- OpStore uses 3 args. Increase the size of Value.argstorage to 3?
+- Constant cache
+- Reuseable slices (e.g. []int of size NumValues()) cached in Func
 
 Regalloc
-- Make less arch-dependent
-- Don't spill everything at every basic block boundary.
-- Allow args and return values to be ssa-able.
-- Handle 2-address instructions.
-- Floating point registers
+--------
+- Make less arch-dependent
+- Don't spill everything at every basic block boundary
+- Allow args and return values to be ssa-able
+- Handle 2-address instructions
 - Make calls clobber all registers
-- Make liveness analysis non-quadratic.
-- Handle in-place instructions (like XORQconst) directly:
-  Use XORQ AX, 1 rather than MOVQ AX, BX; XORQ BX, 1.
+- Make liveness analysis non-quadratic
+- Materialization of constants
 
-StackAlloc:
-- Sort variables so all ptr-containing ones are first (so stack
-  maps are smaller)
-- Reuse stack slots for noninterfering and type-compatible variables
-  (both AUTOs and spilled Values). But see issue 8740 for what
-  "type-compatible variables" mean and what DWARF information provides.
-
-Rewrites
-- Strength reduction (both arch-indep and arch-dependent?)
-- Start another architecture (arm?)
-- 64-bit ops on 32-bit machines
-- <regwidth ops. For example, x+y on int32s on amd64 needs (MOVLQSX (ADDL x y)).
-  Then add rewrites like (MOVLstore (MOVLQSX x) m) -> (MOVLstore x m)
-  to get rid of most of the MOVLQSX.
-- Determine which nil checks can be done implicitly (by faulting)
-  and which need code generated, and do the code generation.
-
-Common-Subexpression Elimination
-- Make better decision about which value in an equivalence class we should
-  choose to replace other values in that class.
-- Can we move control values out of their basic block?
-  This would break nilcheckelim as currently implemented,
-  but it could be replaced by a similar CFG simplication pass.
-- Investigate type equality. During SSA generation, should we use n.Type or (say) TypeBool?
-  Should we get rid of named types in favor of underlying types during SSA generation?
-  Should we introduce a new type equality routine that is less strict than the frontend's?
-
-Other
-- Write barriers
-- For testing, do something more sophisticated than
-  checkOpcodeCounts. Michael Matloob suggests using a similar
-  pattern matcher to the rewrite engine to check for certain
-  expression subtrees in the output.
-- Implement memory zeroing with REPSTOSQ and DuffZero
-- make deadstore work with zeroing.
-- Add a value range propagation optimization pass.
-  Use it for bounds check elimination and bitwidth reduction.
-- Branch prediction: Respect hints from the frontend, add our own.
+Future/other
+------------
+- Start another architecture (arm?)
+- 64-bit ops on 32-bit machines
+- Investigate type equality. During SSA generation, should we use n.Type or (say) TypeBool?
+- Should we get rid of named types in favor of underlying types during SSA generation?
+- Should we introduce a new type equality routine that is less strict than the frontend's?
+- Infrastructure for enabling/disabling/configuring passes
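
Note on the TODO item "Strength reduction: constant divides -> multiply": the sketch below is only an illustration of the general technique, not code from this commit or from the compiler. It replaces an unsigned 32-bit division by the constant 3 with a 64-bit multiply and a shift. The function name divBy3 and the choice of divisor are assumptions made for the example.

```go
package main

import "fmt"

// divBy3 is a hypothetical helper: it computes x/3 for a uint32
// without a divide instruction. 0xAAAAAAAB is ceil(2^33 / 3);
// multiplying in 64 bits and shifting right by 33 gives the exact
// quotient for every 32-bit input.
func divBy3(x uint32) uint32 {
	return uint32((uint64(x) * 0xAAAAAAAB) >> 33)
}

func main() {
	// Print the strength-reduced result next to the ordinary division.
	for _, x := range []uint32{0, 1, 2, 3, 100, 4294967295} {
		fmt.Println(x, divBy3(x), x/3)
	}
}
```

A compiler pass would pick the multiplier and shift per divisor and bit width; this example hard-codes one case to keep the idea visible.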

@@ -30,6 +30,8 @@ func schedule(f *Func) {
 	for _, b := range f.Blocks {
 		// Find store chain for block.
+		// Store chains for different blocks overwrite each other, so
+		// the calculated store chain is good only for this block.
 		for _, v := range b.Values {
 			if v.Op != OpPhi && v.Type.IsMemory() {
 				for _, w := range v.Args {
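
Note on the added comment: the toy model below is a sketch of why a store chain computed for one block may not be valid for another. It is not the compiler's code. The Value type, the storeChain function, and the nextMem slice indexed by value ID are all assumptions made for the illustration; the hunk above does not show how the chain is actually stored.

```go
package main

import "fmt"

// Value is a hypothetical stand-in for an SSA value.
type Value struct {
	ID    int
	IsMem bool // value has memory type
	IsPhi bool // value is a phi
	Args  []*Value
}

// storeChain records, for each memory-typed argument of a non-phi
// memory value in the block, which value follows it in the chain.
// nextMem is shared across blocks and indexed by value ID (assumed).
func storeChain(block []*Value, nextMem []*Value) {
	for _, v := range block {
		if !v.IsPhi && v.IsMem {
			for _, w := range v.Args {
				if w.IsMem {
					nextMem[w.ID] = v
				}
			}
		}
	}
}

func main() {
	m0 := &Value{ID: 0, IsMem: true}                     // incoming memory state
	s1 := &Value{ID: 1, IsMem: true, Args: []*Value{m0}} // store in block A
	s2 := &Value{ID: 2, IsMem: true, Args: []*Value{m0}} // store in block B

	nextMem := make([]*Value, 3)
	storeChain([]*Value{s1}, nextMem)
	fmt.Println(nextMem[0].ID) // prints 1: the chain as seen from block A

	storeChain([]*Value{s2}, nextMem)
	fmt.Println(nextMem[0].ID) // prints 2: block B overwrote block A's entry
}
```

Because both blocks consume the same incoming memory value, the second call replaces the entry written by the first, so the result has to be used before moving on to the next block.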