This CL implements the remainder of the zero-copy string->[]byte
conversion optimization initially attempted in go.dev/cl/520395, but
fixes the tracking of mutations due to ODEREF/ODOTPTR assignments, and
adds more comprehensive tests that I should have included originally.
However, this CL also keeps it behind the -d=zerocopy flag. The next
CL will enable it by default (for easier rollback).
Updates #2205.
Change-Id: Ic330260099ead27fc00e2680a59c6ff23cb63c2b
Reviewed-on: https://go-review.googlesource.com/c/go/+/520599
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
This CL extends escape analysis in two ways.
First, we already optimize directly called closures. For example,
given:
var x int // already stack allocated today
p := func() *int { return &x }()
we don't need to move x to the heap, because we can statically track
where &x flows. This CL extends the same idea to work for indirectly
called closures too, as long as we know everywhere that they're
called. For example:
var x int // stack allocated after this CL
f := func() *int { return &x }
p := f()
This will allow a subsequent CL to move the generation of go/defer
wrappers earlier.
Second, this CL adds tracking to detect when pointer values flow to
the pointee operand of an indirect assignment statement (i.e., flows
to p in "*p = x") or to builtins that modify memory (append, copy,
clear). This isn't utilized in the current CL, but a subsequent CL
will make use of it to better optimize string->[]byte conversions.
Updates #2205.
Change-Id: I610f9c531e135129c947684833e288ce64406f35
Reviewed-on: https://go-review.googlesource.com/c/go/+/520259
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
When -m=N (where N > 1) is in effect, include a note in the trace
output if a given function is considered "big" during inlining
analysis, since this causes the inliner to be less aggressive. If a
small change to a large function happens to nudge it over the large
function threshold, it can be confusing for developers, thus it's
probably worth including this info in the remark output.
Change-Id: Id31a1b76371ab1ef9265ba28a377f97b0247d0a7
Reviewed-on: https://go-review.googlesource.com/c/go/+/460317
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Than McIntosh <thanm@google.com>
Reviewed-by: Keith Randall <khr@google.com>
This CL refactors mkinlcall by extracting the core InlinedCallExpr
construction code into a new "oldInline" function, and adds a new
"NewInline" hook point that can be overriden with a new inliner
implementation that only needs to worry about the details of
constructing the InlinedCallExpr.
It also moves the delayretvars optimization check into CanInline, so
it's performed just once per inlinable function rather than once for
each inlined call.
Finally, it skips printing the function body about to be inlined (and
updates the couple of regress tests that expected this output). We
already report the inline body as it was saved, and this diagnostic is
only applicable to the current inliner, which clones existing function
body instances. In the unified IR inliner, we'll directly construct
inline bodies from the serialized representation.
Change-Id: Ibdbe617da83c07665dcbda402cc8d4d4431dde2f
Reviewed-on: https://go-review.googlesource.com/c/go/+/323290
Trust: Matthew Dempsky <mdempsky@google.com>
Trust: Dan Scales <danscales@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Dan Scales <danscales@google.com>
The existing implementation outputs inline cost iff function cannot be inlined with Debug['m'] > 1, the cost info is also useful if the function is inlineable.
Fixes#36780
Change-Id: Ic96f6baf96aee25fb4b33d31d4d644dc2310e536
Reviewed-on: https://go-review.googlesource.com/c/go/+/216778
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
This information is redundant with the position information already
provided. Also, no other -m diagnostics print out function name.
While here, report parameter leak diagnostics against the parameter
declaration position rather than the function, and use Warnl for
"moved to heap" messages.
Test cases updated programmatically by removing the first word from
every "no match for" error emitted by run.go:
go run run.go |& \
sed -E -n 's/^(.*):(.*): no match for `([^ ]* (.*))` in:$/\1!\2!\3!\4/p' | \
while IFS='!' read -r fn line before after; do
before=$(echo "$before" | sed 's/[.[\*^$()+?{|]/\\&/g')
after=$(echo "$after" | sed -E 's/(\&|\\)/\\&/g')
fn=$(find . -name "${fn}" | head -1)
sed -i -E -e "${line}s/\"${before}\"/\"${after}\"/" "${fn}"
done
Passes toolstash-check.
Change-Id: I6e02486b1409e4a8dbb2b9b816d22095835426b5
Reviewed-on: https://go-review.googlesource.com/c/go/+/195040
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
If we're compiling a large function, be more picky about how big
the function we're inlining is. If the function is >5000 nodes,
we lower the inlining threshold from a cost of 80 to 20.
Turns out reflect.Value's cost is exactly 80. That's the function
at issue in #26546.
20 was chosen as a proxy for "inlined body is smaller than the call would be".
Simple functions still get inlined, like this one at cost 7:
func ifaceIndir(t *rtype) bool {
return t.kind&kindDirectIface == 0
}
5000 nodes was chosen as the big function size. Here are all the
5000+ node (~~1000+ lines) functions in the stdlib:
5187 cmd/internal/obj/arm (*ctxt5).asmout
6879 cmd/internal/obj/s390x (*ctxtz).asmout
6567 cmd/internal/obj/ppc64 (*ctxt9).asmout
9643 cmd/internal/obj/arm64 (*ctxt7).asmout
5042 cmd/internal/obj/x86 (*AsmBuf).doasm
8768 cmd/compile/internal/ssa rewriteBlockAMD64
8878 cmd/compile/internal/ssa rewriteBlockARM
8344 cmd/compile/internal/ssa rewriteValueARM64_OpARM64OR_20
7916 cmd/compile/internal/ssa rewriteValueARM64_OpARM64OR_30
5427 cmd/compile/internal/ssa rewriteBlockARM64
5126 cmd/compile/internal/ssa rewriteValuePPC64_OpPPC64OR_50
6152 cmd/compile/internal/ssa rewriteValuePPC64_OpPPC64OR_60
6412 cmd/compile/internal/ssa rewriteValuePPC64_OpPPC64OR_70
6486 cmd/compile/internal/ssa rewriteValuePPC64_OpPPC64OR_80
6534 cmd/compile/internal/ssa rewriteValuePPC64_OpPPC64OR_90
6534 cmd/compile/internal/ssa rewriteValuePPC64_OpPPC64OR_100
6534 cmd/compile/internal/ssa rewriteValuePPC64_OpPPC64OR_110
6675 cmd/compile/internal/gc typecheck1
5433 cmd/compile/internal/gc walkexpr
14070 cmd/vendor/golang.org/x/arch/arm64/arm64asm decodeArg
There are a lot more smaller (~1000 node) functions in the stdlib.
The function in #26546 has 12477 nodes.
At some point it might be nice to have a better heuristic for "inlined
body is smaller than the call", a non-cliff way to scale down the cost
as the function gets bigger, doing cheaper inlined calls first, etc.
All that can wait for another release. I'd like to do this CL for
1.11.
Fixes#26546
Update #17566
Change-Id: Idda13020e46ec2b28d79a17217f44b189f8139ac
Reviewed-on: https://go-review.googlesource.com/125516
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>