mirror of https://github.com/golang/go, synced 2024-10-05 20:31:20 -06:00
[dev.ssa] cmd/compile/internal/ssa: Update TODO list
Change-Id: Ibcd4c6984c8728fd9ab76e0c7df555984deaf281
Reviewed-on: https://go-review.googlesource.com/13471
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
parent baf2c3ec4a
commit 9787ba43ee
@@ -1,71 +1,70 @@
-This is a list of things that need to be worked on. It is by no means complete.
+This is a list of things that need to be worked on. It will hopefully
+be complete soon.
 
-Allocation
-- Allocation of decls in stackalloc. Decls survive if they are
-  addrtaken or are too large for registerization.
+Coverage
+--------
+- Floating point numbers
+- Complex numbers
+- Integer division
+- Fat objects (strings/slices/interfaces) vs. Phi
+- Defer?
+- Closure args
+- PHEAP vars
 
-Scheduling
-- Make sure loads are scheduled correctly with respect to stores.
-  Same for flag type values. We can't have more than one value of
-  mem or flag types live at once.
-- Reduce register pressure. Schedule instructions which kill
-  variables first.
+Correctness
+-----------
+- GC maps
+- Write barriers
+- Debugging info
+- Handle flags register correctly (clobber/spill/restore)
+- Proper panic edges from checks & calls (+deferreturn)
+- Can/should we move control values out of their basic block?
+- Anything to do for the race detector?
+- Slicing details (avoid ptr to next object)
 
-Values
-- Store *Type instead of Type? Keep an array of used Types in Func
-  and reference by id? Unify with the type ../gc so we just use a
-  pointer instead of an interface?
-- Recycle dead values instead of using GC to do that.
-- A lot of Aux fields are just int64. Add a separate AuxInt field?
-  If not that, then cache the interfaces that wrap int64s.
-- OpStore uses 3 args. Increase the size of argstorage to 3?
+Optimizations (better compiled code)
+------------------------------------
+- Reduce register pressure in scheduler
+- More strength reduction: multiply -> shift/add combos (Worth doing?)
+- Strength reduction: constant divides -> multiply
+- Expand current optimizations to all bit widths
+- Nil/bounds check removal
+- Combining nil checks with subsequent load
+- Implement memory zeroing with REPSTOSQ and DuffZero
+- Implement memory copying with REPMOVSQ and DuffCopy
+- Make deadstore work with zeroing
+- Branch prediction: Respect hints from the frontend, add our own
+- Add a value range propagation pass (for bounds elim & bitwidth reduction)
+- Stackalloc: group pointer-containing variables & spill slots together
+- Stackalloc: organize values to allow good packing
+- Regalloc: use arg slots as the home for arguments (don't copy args to locals)
+- Reuse stack slots for noninterfering & compatible values (but see issue 8740)
+- (x86) Combine loads into other ops
+- (x86) More combining address arithmetic into loads/stores
+
+Optimizations (better compiler)
+-------------------------------
+- Smaller Value.Type (int32 or ptr)? Get rid of types altogether?
+- Recycle dead Values (and Blocks) explicitly instead of using GC
+- OpStore uses 3 args. Increase the size of Value.argstorage to 3?
+- Constant cache
+- Reuseable slices (e.g. []int of size NumValues()) cached in Func
 
 Regalloc
-- Make less arch-dependent
-- Don't spill everything at every basic block boundary.
-- Allow args and return values to be ssa-able.
-- Handle 2-address instructions.
-- Floating point registers
+--------
+- Make less arch-dependent
+- Don't spill everything at every basic block boundary
+- Allow args and return values to be ssa-able
+- Handle 2-address instructions
 - Make calls clobber all registers
-- Make liveness analysis non-quadratic.
-- Handle in-place instructions (like XORQconst) directly:
-  Use XORQ AX, 1 rather than MOVQ AX, BX; XORQ BX, 1.
+- Make liveness analysis non-quadratic
+- Materialization of constants
 
-StackAlloc:
-- Sort variables so all ptr-containing ones are first (so stack
-  maps are smaller)
-- Reuse stack slots for noninterfering and type-compatible variables
-  (both AUTOs and spilled Values). But see issue 8740 for what
-  "type-compatible variables" mean and what DWARF information provides.
-
-Rewrites
-- Strength reduction (both arch-indep and arch-dependent?)
-- Start another architecture (arm?)
-- 64-bit ops on 32-bit machines
-- <regwidth ops. For example, x+y on int32s on amd64 needs (MOVLQSX (ADDL x y)).
-  Then add rewrites like (MOVLstore (MOVLQSX x) m) -> (MOVLstore x m)
-  to get rid of most of the MOVLQSX.
-- Determine which nil checks can be done implicitly (by faulting)
-  and which need code generated, and do the code generation.
-
-Common-Subexpression Elimination
-- Make better decision about which value in an equivalence class we should
-  choose to replace other values in that class.
-- Can we move control values out of their basic block?
-  This would break nilcheckelim as currently implemented,
-  but it could be replaced by a similar CFG simplication pass.
-- Investigate type equality. During SSA generation, should we use n.Type or (say) TypeBool?
-  Should we get rid of named types in favor of underlying types during SSA generation?
-  Should we introduce a new type equality routine that is less strict than the frontend's?
-
-Other
-- Write barriers
-- For testing, do something more sophisticated than
-  checkOpcodeCounts. Michael Matloob suggests using a similar
-  pattern matcher to the rewrite engine to check for certain
-  expression subtrees in the output.
-- Implement memory zeroing with REPSTOSQ and DuffZero
-- make deadstore work with zeroing.
-- Add a value range propagation optimization pass.
-  Use it for bounds check elimination and bitwidth reduction.
-- Branch prediction: Respect hints from the frontend, add our own.
+Future/other
+------------
+- Start another architecture (arm?)
+- 64-bit ops on 32-bit machines
+- Investigate type equality. During SSA generation, should we use n.Type or (say) TypeBool?
+- Should we get rid of named types in favor of underlying types during SSA generation?
+- Should we introduce a new type equality routine that is less strict than the frontend's?
+- Infrastructure for enabling/disabling/configuring passes
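Note on the "Strength reduction: constant divides -> multiply" item: the following is a minimal sketch of the general technique, not code from this commit or from the compiler. The function name divBy3 is hypothetical, and the example only covers unsigned 32-bit division by the constant 3. It relies on the well-known fact that 0xAAAAAAAB equals ceil(2^33 / 3).

```go
package main

import "fmt"

// divBy3 is a hypothetical illustration of replacing a divide by a
// constant with a multiply and a shift. 0xAAAAAAAB is ceil(2^33 / 3),
// so (x * 0xAAAAAAAB) >> 33 equals x / 3 for every uint32 x.
// The product is formed in 64 bits so it cannot overflow.
func divBy3(x uint32) uint32 {
	return uint32((uint64(x) * 0xAAAAAAAB) >> 33)
}

func main() {
	// Print the multiply-based result next to the ordinary quotient.
	for _, x := range []uint32{0, 1, 2, 3, 100, 1<<32 - 1} {
		fmt.Println(x, divBy3(x), x/3)
	}
}
```

A compiler pass would presumably pick the magic constant and shift per divisor and bit width; the sketch hard-codes one case to keep the arithmetic easy to check.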
@@ -30,6 +30,8 @@ func schedule(f *Func) {
 
 	for _, b := range f.Blocks {
 		// Find store chain for block.
+		// Store chains for different blocks overwrite each other, so
+		// the calculated store chain is good only for this block.
 		for _, v := range b.Values {
 			if v.Op != OpPhi && v.Type.IsMemory() {
 				for _, w := range v.Args {
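Note on the comment added in this hunk: one plausible reading is that the store-chain table is indexed by the memory value being consumed, so two blocks that both consume the same incoming memory state write to the same slot. The sketch below illustrates that reading only. Value, Block, storeChain and the field names are simplified, hypothetical stand-ins, not the real types in cmd/compile/internal/ssa.

```go
package main

import "fmt"

// Value is a hypothetical, stripped-down stand-in for an SSA value.
type Value struct {
	ID    int
	Name  string
	IsMem bool // true if the value produces a memory state
	IsPhi bool
	Args  []*Value
}

// Block is a hypothetical basic block holding a list of values.
type Block struct {
	Values []*Value
}

// storeChain records, for each memory state m consumed in blk, the
// memory-producing value in blk that consumes it: next[m.ID] = consumer.
// The table is shared across calls, so a later block can overwrite
// entries written for an earlier one.
func storeChain(blk *Block, next []*Value) {
	for _, v := range blk.Values {
		if v.IsPhi || !v.IsMem {
			continue
		}
		for _, w := range v.Args {
			if w.IsMem {
				next[w.ID] = v
			}
		}
	}
}

func main() {
	// m0 is the incoming memory state; two sibling blocks each store to it.
	m0 := &Value{ID: 0, Name: "m0", IsMem: true}
	s1 := &Value{ID: 1, Name: "s1", IsMem: true, Args: []*Value{m0}}
	s2 := &Value{ID: 2, Name: "s2", IsMem: true, Args: []*Value{m0}}
	blockA := &Block{Values: []*Value{s1}}
	blockB := &Block{Values: []*Value{s2}}

	next := make([]*Value, 3)
	storeChain(blockA, next)
	fmt.Println("after block A:", next[m0.ID].Name)
	storeChain(blockB, next)
	fmt.Println("after block B:", next[m0.ID].Name)
}
```

Under these assumptions the entry for m0 points at s1 after the first call and at s2 after the second, which is why the table can be trusted only for the block that was just processed.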