mirror of
https://github.com/golang/go
synced 2024-11-23 16:20:04 -07:00
83bfed916b
8 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
Austin Clements
|
2010189407 |
runtime: remove legacy eager write barrier
Now that the buffered write barrier is implemented for all architectures, we can remove the old eager write barrier implementation. This CL removes the implementation from the runtime, support in the compiler for calling it, and updates some compiler tests that relied on the old eager barrier support. It also makes sure that all of the useful comments from the old write barrier implementation still have a place to live. Fixes #22460. Updates #21640 since this fixes the layering concerns of the write barrier (but not the other things in that issue). Change-Id: I580f93c152e89607e0a72fe43370237ba97bae74 Reviewed-on: https://go-review.googlesource.com/92705 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Rick Hudson <rlh@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> |
||
Austin Clements
|
7e343134d3 |
cmd/compile: compiler support for buffered write barrier
This CL implements the compiler support for calling the buffered write barrier added by the previous CL. Since the buffered write barrier is only implemented on amd64 right now, this still supports the old, eager write barrier as well. There's little overhead to supporting both and this way a few tests in test/fixedbugs that expect to have liveness maps at write barrier calls can easily opt-in to the old, eager barrier. This significantly improves the performance of the write barrier: name old time/op new time/op delta WriteBarrier-12 73.5ns ±20% 19.2ns ±27% -73.90% (p=0.000 n=19+18) It also reduces the size of binaries because the write barrier call is more compact: name old object-bytes new object-bytes delta Template 398k ± 0% 393k ± 0% -1.14% (p=0.008 n=5+5) Unicode 208k ± 0% 206k ± 0% -1.00% (p=0.008 n=5+5) GoTypes 1.18M ± 0% 1.15M ± 0% -2.00% (p=0.008 n=5+5) Compiler 4.05M ± 0% 3.88M ± 0% -4.26% (p=0.008 n=5+5) SSA 8.25M ± 0% 8.11M ± 0% -1.59% (p=0.008 n=5+5) Flate 228k ± 0% 224k ± 0% -1.83% (p=0.008 n=5+5) GoParser 295k ± 0% 284k ± 0% -3.62% (p=0.008 n=5+5) Reflect 1.00M ± 0% 0.99M ± 0% -0.70% (p=0.008 n=5+5) Tar 339k ± 0% 333k ± 0% -1.67% (p=0.008 n=5+5) XML 404k ± 0% 395k ± 0% -2.10% (p=0.008 n=5+5) [Geo mean] 704k 690k -2.00% name old exe-bytes new exe-bytes delta HelloSize 1.05M ± 0% 1.04M ± 0% -1.55% (p=0.008 n=5+5) https://perf.golang.org/search?q=upload:20171027.1 (Amusingly, this also reduces compiler allocations by 0.75%, which, combined with the better write barrier, speeds up the compiler overall by 2.10%. See the perf link.) It slightly improves the performance of most of the go1 benchmarks and improves the performance of the x/benchmarks: name old time/op new time/op delta BinaryTree17-12 2.40s ± 1% 2.47s ± 1% +2.69% (p=0.000 n=19+19) Fannkuch11-12 2.95s ± 0% 2.95s ± 0% +0.21% (p=0.000 n=20+19) FmtFprintfEmpty-12 41.8ns ± 4% 41.4ns ± 2% -1.03% (p=0.014 n=20+20) FmtFprintfString-12 68.7ns ± 2% 67.5ns ± 1% -1.75% (p=0.000 n=20+17) FmtFprintfInt-12 79.0ns ± 3% 77.1ns ± 1% -2.40% (p=0.000 n=19+17) FmtFprintfIntInt-12 127ns ± 1% 123ns ± 3% -3.42% (p=0.000 n=20+20) FmtFprintfPrefixedInt-12 152ns ± 1% 150ns ± 1% -1.02% (p=0.000 n=18+17) FmtFprintfFloat-12 211ns ± 1% 209ns ± 0% -0.99% (p=0.000 n=20+16) FmtManyArgs-12 500ns ± 0% 496ns ± 0% -0.73% (p=0.000 n=17+20) GobDecode-12 6.44ms ± 1% 6.53ms ± 0% +1.28% (p=0.000 n=20+19) GobEncode-12 5.46ms ± 0% 5.46ms ± 1% ~ (p=0.550 n=19+20) Gzip-12 220ms ± 1% 216ms ± 0% -1.75% (p=0.000 n=19+19) Gunzip-12 38.8ms ± 0% 38.6ms ± 0% -0.30% (p=0.000 n=18+19) HTTPClientServer-12 79.0µs ± 1% 78.2µs ± 1% -1.01% (p=0.000 n=20+20) JSONEncode-12 11.9ms ± 0% 11.9ms ± 0% -0.29% (p=0.000 n=20+19) JSONDecode-12 52.6ms ± 0% 52.2ms ± 0% -0.68% (p=0.000 n=19+20) Mandelbrot200-12 3.69ms ± 0% 3.68ms ± 0% -0.36% (p=0.000 n=20+20) GoParse-12 3.13ms ± 1% 3.18ms ± 1% +1.67% (p=0.000 n=19+20) RegexpMatchEasy0_32-12 73.2ns ± 1% 72.3ns ± 1% -1.19% (p=0.000 n=19+18) RegexpMatchEasy0_1K-12 241ns ± 0% 239ns ± 0% -0.83% (p=0.000 n=17+16) RegexpMatchEasy1_32-12 68.6ns ± 1% 69.0ns ± 1% +0.47% (p=0.015 n=18+16) RegexpMatchEasy1_1K-12 364ns ± 0% 361ns ± 0% -0.67% (p=0.000 n=16+17) RegexpMatchMedium_32-12 104ns ± 1% 103ns ± 1% -0.79% (p=0.001 n=20+15) RegexpMatchMedium_1K-12 33.8µs ± 3% 34.0µs ± 2% ~ (p=0.267 n=20+19) RegexpMatchHard_32-12 1.64µs ± 1% 1.62µs ± 2% -1.25% (p=0.000 n=19+18) RegexpMatchHard_1K-12 49.2µs ± 0% 48.7µs ± 1% -0.93% (p=0.000 n=19+18) Revcomp-12 391ms ± 5% 396ms ± 7% ~ (p=0.154 n=19+19) Template-12 63.1ms ± 0% 59.5ms ± 0% -5.76% (p=0.000 n=18+19) TimeParse-12 307ns ± 0% 306ns ± 0% -0.39% (p=0.000 n=19+17) TimeFormat-12 325ns ± 0% 323ns ± 0% -0.50% (p=0.000 n=19+19) [Geo mean] 47.3µs 46.9µs -0.67% https://perf.golang.org/search?q=upload:20171026.1 name old time/op new time/op delta Garbage/benchmem-MB=64-12 2.25ms ± 1% 2.20ms ± 1% -2.31% (p=0.000 n=18+18) HTTP-12 12.6µs ± 0% 12.6µs ± 0% -0.72% (p=0.000 n=18+17) JSON-12 11.0ms ± 0% 11.0ms ± 1% -0.68% (p=0.000 n=17+19) https://perf.golang.org/search?q=upload:20171026.2 Updates #14951. Updates #22460. Change-Id: Id4c0932890a1d41020071bec73b8522b1367d3e7 Reviewed-on: https://go-review.googlesource.com/73712 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com> |
||
David Chase
|
9c066bab64 |
cmd/compile: mark temps with new AutoTemp flag, and use it.
This is an extension of https://go-review.googlesource.com/c/31662/ to mark all the temporaries, not just the ssa-generated ones. Before-and-after ls -l `go tool -n compile` shows a 3% reduction in size (or rather, a prior 3% inflation for failing to filter temps out properly.) Replaced name-dependent "is it a temp?" tests with calls to *Node.IsAutoTmp(), which depends on AutoTemp. Also replace calls to istemp(n) with n.IsAutoTmp(), to reduce duplication and clean up function name space. Generated temporaries now come with a "." prefix to avoid (apparently harmless) clashes with legal Go variable names. Fixes #17644. Fixes #17240. Change-Id: If1417f29c79a7275d7303ddf859b51472890fd43 Reviewed-on: https://go-review.googlesource.com/32255 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com> |
||
Austin Clements
|
c39918a049 |
cmd/compile: disable various write barrier optimizations
Several of our current write barrier elision optimizations are invalid with the hybrid barrier. Eliding the hybrid barrier requires that *both* the current and new pointer be already shaded and, since we don't have the flow analysis to figure out anything about the slot's current value, for now we have to just disable several of these optimizations. This has a slight impact on binary size. On linux/amd64, the go tool binary increases by 0.7% and the compile binary increases by 1.5%. It also has a slight impact on performance, as one would expect. We'll win some of this back in subsequent commits. name old time/op new time/op delta BinaryTree17-12 2.38s ± 1% 2.40s ± 1% +0.82% (p=0.000 n=18+20) Fannkuch11-12 2.84s ± 1% 2.70s ± 0% -4.97% (p=0.000 n=18+18) FmtFprintfEmpty-12 44.2ns ± 1% 46.4ns ± 2% +4.89% (p=0.000 n=16+18) FmtFprintfString-12 131ns ± 0% 134ns ± 1% +2.05% (p=0.000 n=12+19) FmtFprintfInt-12 114ns ± 1% 117ns ± 1% +3.26% (p=0.000 n=19+20) FmtFprintfIntInt-12 176ns ± 1% 181ns ± 1% +3.25% (p=0.000 n=20+20) FmtFprintfPrefixedInt-12 185ns ± 1% 190ns ± 1% +2.77% (p=0.000 n=19+18) FmtFprintfFloat-12 249ns ± 1% 254ns ± 1% +1.71% (p=0.000 n=18+20) FmtManyArgs-12 747ns ± 1% 743ns ± 1% -0.58% (p=0.000 n=19+18) GobDecode-12 6.57ms ± 1% 6.61ms ± 0% +0.73% (p=0.000 n=19+20) GobEncode-12 5.58ms ± 1% 5.60ms ± 0% +0.27% (p=0.001 n=18+18) Gzip-12 223ms ± 1% 223ms ± 1% ~ (p=0.351 n=19+20) Gunzip-12 37.9ms ± 0% 37.9ms ± 1% ~ (p=0.095 n=16+20) HTTPClientServer-12 77.8µs ± 1% 78.5µs ± 1% +0.97% (p=0.000 n=19+20) JSONEncode-12 14.8ms ± 1% 14.8ms ± 1% ~ (p=0.079 n=20+19) JSONDecode-12 53.7ms ± 1% 54.2ms ± 1% +0.92% (p=0.000 n=20+19) Mandelbrot200-12 3.81ms ± 1% 3.81ms ± 0% ~ (p=0.916 n=19+18) GoParse-12 3.19ms ± 1% 3.19ms ± 1% ~ (p=0.175 n=20+19) RegexpMatchEasy0_32-12 71.9ns ± 1% 70.6ns ± 1% -1.87% (p=0.000 n=19+20) RegexpMatchEasy0_1K-12 946ns ± 0% 944ns ± 0% -0.22% (p=0.000 n=19+16) RegexpMatchEasy1_32-12 67.3ns ± 2% 66.8ns ± 1% -0.72% (p=0.008 n=20+20) RegexpMatchEasy1_1K-12 374ns ± 1% 384ns ± 1% +2.69% (p=0.000 n=18+20) RegexpMatchMedium_32-12 107ns ± 1% 107ns ± 1% ~ (p=1.000 n=20+20) RegexpMatchMedium_1K-12 34.3µs ± 1% 34.6µs ± 1% +0.90% (p=0.000 n=20+20) RegexpMatchHard_32-12 1.78µs ± 1% 1.80µs ± 1% +1.45% (p=0.000 n=20+19) RegexpMatchHard_1K-12 53.6µs ± 0% 54.5µs ± 1% +1.52% (p=0.000 n=19+18) Revcomp-12 417ms ± 5% 391ms ± 1% -6.42% (p=0.000 n=16+19) Template-12 61.1ms ± 1% 64.2ms ± 0% +5.07% (p=0.000 n=19+20) TimeParse-12 302ns ± 1% 305ns ± 1% +0.90% (p=0.000 n=18+18) TimeFormat-12 319ns ± 1% 315ns ± 1% -1.25% (p=0.000 n=18+18) [Geo mean] 54.0µs 54.3µs +0.58% name old time/op new time/op delta XGarbage-12 2.24ms ± 2% 2.28ms ± 1% +1.68% (p=0.000 n=18+17) XHTTP-12 11.4µs ± 1% 11.6µs ± 2% +1.63% (p=0.000 n=18+18) XJSON-12 11.6ms ± 0% 12.5ms ± 0% +7.84% (p=0.000 n=18+17) Updates #17503. Change-Id: I1899f8e35662971e24bf692b517dfbe2b533c00c Reviewed-on: https://go-review.googlesource.com/31572 Reviewed-by: Keith Randall <khr@golang.org> |
||
Austin Clements
|
ad5fd2872f |
test: simplify fixedbugs/issue15747.go
The error check patterns in this test are more complex than necessary because f2 gets inlined into f1. This behavior isn't important to the test, so disable inlining of f2 and simplify the error check patterns. Change-Id: Ia8aee92a52f9217ad71b89b2931494047e8d2185 Reviewed-on: https://go-review.googlesource.com/31132 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org> |
||
Keith Randall
|
ca4089ad62 |
cmd/compile: args no longer live until end-of-function
We're dropping this behavior in favor of runtime.KeepAlive. Implement runtime.KeepAlive as an intrinsic. Update #15843 Change-Id: Ib60225bd30d6770ece1c3c7d1339a06aa25b1cbc Reviewed-on: https://go-review.googlesource.com/28310 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com> |
||
David Chase
|
5b9ff11c3d |
cmd/compile: ppc64le working, not optimized enough
This time with the cherry-pick from the proper patch of the old CL. Stack size increased. Corrected NaN-comparison glitches. Marked g register as clobbered by calls. Fixed shared libraries. live_ssa.go still disabled because of differences. Presumably turning on more optimization will fix both the stack size and the live_ssa.go glitches. Enhanced debugging output for shared libs test. Rebased onto master. Updates #16010. Change-Id: I40864faf1ef32c118fb141b7ef8e854498e6b2c4 Reviewed-on: https://go-review.googlesource.com/27159 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com> |
||
Russ Cox
|
b6dc3e6f66 |
cmd/compile: fix liveness computation for heap-escaped parameters
The liveness computation of parameters generally was never
correct, but forcing all parameters to be live throughout the
function covered up that problem. The new SSA back end is
too clever: even though it currently keeps the parameter values live
throughout the function, it may find optimizations that mean
the current values are not written back to the original parameter
stack slots immediately or ever (for example if a parameter is set
to nil, SSA constant propagation may replace all later uses of the
parameter with a constant nil, eliminating the need to write the nil
value back to the stack slot), so the liveness code must now
track the actual operations on the stack slots, exposing these
problems.
One small problem in the handling of arguments is that nodarg
can return ONAME PPARAM nodes with adjusted offsets, so that
there are actually multiple *Node pointers for the same parameter
in the instruction stream. This might be possible to correct, but
not in this CL. For now, we fix this by using n.Orig instead of n
when considering PPARAM and PPARAMOUT nodes.
The major problem in the handling of arguments is general
confusion in the liveness code about the meaning of PPARAM|PHEAP
and PPARAMOUT|PHEAP nodes, especially as contrasted with PAUTO|PHEAP.
The difference between these two is that when a local variable "moves"
to the heap, it's really just allocated there to start with; in contrast,
when an argument moves to the heap, the actual data has to be copied
there from the stack at the beginning of the function, and when a
result "moves" to the heap the value in the heap has to be copied
back to the stack when the function returns
This general confusion is also present in the SSA back end.
The PHEAP bit worked decently when I first introduced it 7 years ago (!)
in
|