We're using bufio to batch reads of /dev/urandom to 4k, but we weren't
doing the same on newer platforms with getrandom/getentropy. Since the
overhead is the same for these -- one syscall -- we should batch reads
of these into the same 4k buffer. While we're at it, we can simplify a
lot of the constant dispersal.
This also adds a new test case to make sure the buffering works as
desired.
Change-Id: I7297d4aa795c00712e6484b841cef8650c2be4ef
Reviewed-on: https://go-review.googlesource.com/c/go/+/370894
Reviewed-by: Filippo Valsorda <valsorda@google.com>
Run-TryBot: Jason Donenfeld <Jason@zx2c4.com>
Auto-Submit: Jason Donenfeld <Jason@zx2c4.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
The x + (-y) => x - y rule is hitted 75 times while building stage 3 and tools
and make the linux-amd64 go binary 0.007% smaller.
It transform:
NEG AX
ADD BX, AX
Into:
SUB BX, AX
Which is 2X faster (assuming this assembly in a vacum).
The x ^ (-1) => ^x rule is not hitted in the toolchain.
It transforms:
XOR $-1, AX
Into:
NOT AX
Which is more compact as it doesn't encode the immediate.
Cache usage aside, this does not affect performance
(assuming this assembly in a vacum).
On my ryzen 3600, with some surrouding code, this randomly might be 2X faster,
I guess this has to do with loading the immediate into a temporary register.
Combined to an other rule that already exists it also rewrite manual two's
complement negation from:
XOR $-1, AX
INC AX
Into:
NEG AX
Which is 2X faster.
The other rules just eliminates similar trivial cases and help constants
folding.
This should generalise to other architectures.
Change-Id: Ia1e51b340622e7ed88e5d856f3b1aa424aa039de
GitHub-Last-Rev: ce35ff2efd
GitHub-Pull-Request: golang/go#52395
Reviewed-on: https://go-review.googlesource.com/c/go/+/400714
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Change-Id: If2887f84b3d14fac3c059fc5bad4186ec9d69d0f
Reviewed-on: https://go-review.googlesource.com/c/go/+/401077
Reviewed-by: Joseph Tsai <joetsai@digital-static.net>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Even though the change in the behavior of 'runtime.GOROOT' was
not actually due to a change in the runtime package proper, I
suspect that users who notice it will look for the release note
in that section, not the 'cmd/go' section.
Fixes#51461.
Change-Id: I271752968d4152a7fdf3e170537e3072bf87ce86
Reviewed-on: https://go-review.googlesource.com/c/go/+/400814
Reviewed-by: Ian Lance Taylor <iant@google.com>
All APIs in the now-deprecated io/ioutil package have a direct
replacement in either the io or os package with the same signature,
with the notable exception of ioutil.ReadDir, as os.ReadDir has a
slightly different signature with fs.DirEntry rather than fs.FileInfo.
New code can easily make use of []fs.DirEntry directly,
but existing code may need to continue using []fs.FileInfo for backwards
compatibility reasons. For instance, I had a bit of code that exposed
the slice as a public API, like:
return ioutil.ReadDir(name)
It took me a couple of minutes to figure out what the exact equivalent
in terms of os.ReadDir would be, and a code sample would have helped.
Add one for future reference.
For #42026.
For #51927.
Change-Id: I76d46cd7d68fc609c873821755fdcfc299ffd56c
Reviewed-on: https://go-review.googlesource.com/c/go/+/399854
Reviewed-by: Bryan Mills <bcmills@google.com>
Run-TryBot: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
The linker performs a global analysis of all nosplit call chains to
check they fit in the stack space ensured by splittable functions.
That analysis has two problems right now:
1. It's inefficient. It performs a top-down analysis, starting with
every nosplit function and the nosplit stack limit and walking *down*
the call graph to compute how much stack remains at every call. As a
result, it visits the same functions over and over, often with
different remaining stack depths. This approach is historical: this
check was originally written in C and this approach avoided the need
for any interesting data structures.
2. If some call chain is over the limit, it only reports a single call
chain. As a result, if the check does fail, you often wind up playing
whack-a-mole by guessing where the problem is in the one chain, trying
to reduce the stack size, and then seeing if the link works or reports
a different path.
This CL completely rewrites the nosplit stack check. It now uses a
bottom-up analysis, computing the maximum stack height required by
every function's call tree. This visits every function exactly once,
making it much more efficient. It uses slightly more heap space for
intermediate storage, but still very little in the scheme of the
overall link. For example, when linking cmd/go, the new algorithm
virtually eliminates the time spent in this pass, and reduces overall
link time:
│ before │ after │
│ sec/op │ sec/op vs base │
Dostkcheck 7.926m ± 4% 1.831m ± 6% -76.90% (p=0.000 n=20)
TotalTime 301.3m ± 1% 296.4m ± 3% -1.62% (p=0.040 n=20)
│ before │ after │
│ B/op │ B/op vs base │
Dostkcheck 40.00Ki ± 0% 212.15Ki ± 0% +430.37% (p=0.000 n=20)
Most of this time is spent analyzing the runtime, so for larger
binaries, the total time saved is roughly the same, and proportionally
less of the overall link.
If the new implementation finds an error, it redoes the analysis,
switching to preferring quality of error reporting over performance.
For error reporting, it computes stack depths top-down (like the old
algorithm), and reports *all* paths that are over the stack limit,
presented as a tree for compactness. For example, this is the output
from a simple test case from test/nosplit with two over-limit paths
from f1:
main.f1: nosplit stack overflow
main.f1
grows 768 bytes, calls main.f2
grows 56 bytes, calls main.f4
grows 48 bytes
80 bytes over limit
grows 768 bytes, calls main.f3
grows 104 bytes
80 bytes over limit
While we're here, we do a few nice cleanups:
- We add a debug output flag, which will be useful for understanding
what our nosplit chains look like and which ones are close to
running over.
- We move the implementation out of the fog of lib.go to its own file.
- The implementation is generally more Go-like and less C-like.
Change-Id: If1ab31197f5215475559b93695c44a01bd16e276
Reviewed-on: https://go-review.googlesource.com/c/go/+/398176
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
The nosplit test was originally written when the stack limit was a
mere 128 bytes. Now it's much larger, but rather than rewriting all of
the tests, we apply a hack to just add the extra space into the stack
frames of the existing tests.
Unfortunately, we add it in the wrong place. The extra space should be
added just once per chain of nosplit functions, but instead we add it
to every frame that appears first on a line in the test's little
script language. This means that for tests like
start 0 call f1
f1 16 nosplit call f2
f2 16 nosplit call f3
f3 16 nosplit call f4
f4 16 nosplit call f5
f5 16 nosplit call f6
f6 16 nosplit call f7
f7 16 nosplit call f8
f8 16 nosplit call end
end 1000
REJECT
we add 672 bytes to *every* frame, meaning that we wind up way over
the stack limit by the end of the stanza, rather than just a little as
originally intended.
Fix this by instead adding the extra space to the first nosplit
function in a stanza. This isn't perfect either, since we could have a
nosplit -> split -> nosplit chain, but it's the best we can do without
a graph analysis.
Change-Id: Ibf156c68fe3eb1b64a438115f4a17f1a6c7e2bd1
Reviewed-on: https://go-review.googlesource.com/c/go/+/398174
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Storing this information in the Arch eliminates some code duplication
between the compiler and linker. This information is entirely
determined by the Arch, so the current approach of attaching it to an
entire Ctxt is a little silly. This will also make it easier to use
this information from tests.
The next CL will be a rote refactoring to eliminate the
Ctxt.FixedFrameSize methods.
Change-Id: I315c524fa66a0ea99f63ae5a2a6fdc367d843bad
Reviewed-on: https://go-review.googlesource.com/c/go/+/400818
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
When linking a PIE binary with the internal linker, TOC relative
relocations need to be generated. Update trampolines to indirect
call using R12 to more closely match the AIX/ELFv2 regardless of
buildmode, and work with position-indepdent code.
Likewise, update the check for offseting R_CALLPOWER relocs to
make a local call. It should be checking ldr.AttrExternal, not
ldr.IsExternal. This offset should not be adjusted for external
(non-go) object files, it is handled when ELF reloc are translated
into go relocs.
And, update trampoline tests to verify these are generated correctly
and produce a working binary using -buildmode=pie on ppc64le.
Fixes#52337
Change-Id: I8a2dea06c3237bdf0e87888b56a17b6c4c99a7de
Reviewed-on: https://go-review.googlesource.com/c/go/+/400234
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
This allows the caller to decide whether MapIter should be
stack allocated or heap allocated based on whether it escapes.
In most cases, it does not escape and thus removes the utility
of MapIter.Reset (#46293). In fact, use of sync.Pool with MapIter
and calling MapIter.Reset is likely to be slower.
Change-Id: Ic93e7d39e5dd4c83e7fca9e0bdfbbcd70777f0e1
Reviewed-on: https://go-review.googlesource.com/c/go/+/400675
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
These timeouts are empirically sometimes (but rarely) too short on
slower builders, and at any rate if this test fails “for real” we'll
want a goroutine dump in order to debug it anyway. A goroutine dump is
exactly what we get if we let the test time out on its own.
Fixes#52414.
Change-Id: Id2dd3839977bd8a41f296d67d1cccbf068fd73f4
Reviewed-on: https://go-review.googlesource.com/c/go/+/400816
Run-TryBot: Bryan Mills <bcmills@google.com>
Auto-Submit: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Endlineno is lost when we call "genericSubst" to create the new
instantiation of the generic function. This will cause "readFuncLines"
to fail to read the target function.
To fix this issue, as @mdempsky pointed out, add the line in
cmd/compile/internal/noder/stencil.go:
newf.Endlineno = gf.Endlineno
Fixes#51988
Change-Id: Ib408e4ed0ceb68df8dedda4fb551309e8385aada
Reviewed-on: https://go-review.googlesource.com/c/go/+/399057
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
This improves performance for memclr for sizes >= 64 and < 512 by
unrolling the loop to clear 64 bytes at a time, whereas before it was
doing 32 bytes.
On a power9, the improvement is:
Memclr/64 6.07ns ± 0% 5.17ns ± 0% -14.86% (p=1.000 n=1+1)
Memclr/256 11.8ns ± 0% 8.3ns ± 0% -30.10% (p=1.000 n=1+1)
GoMemclr/64 5.58ns ± 0% 5.02ns ± 0% -10.04% (p=1.000 n=1+1)
GoMemclr/256 12.0ns ± 0% 8.8ns ± 0% -26.62% (p=1.000 n=1+1)
Change-Id: I929389ae9e50128cba81e0c412e7ba431da7facc
Reviewed-on: https://go-review.googlesource.com/c/go/+/399895
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
actualReader and tc.expectedReader are reflect.Type instances,
not the actual objects.
Change-Id: I7c9cfa489e3297b94c603b62bad1ed84bd207057
GitHub-Last-Rev: d581402375
GitHub-Pull-Request: golang/go#52339
Reviewed-on: https://go-review.googlesource.com/c/go/+/400235
Reviewed-by: Damien Neil <dneil@google.com>
Run-TryBot: Damien Neil <dneil@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
This allows the result of Type to be computed much faster.
Performance:
old new delta
1.76ns 0.66ns -62.27%
Change-Id: Ie007fd175aaa41b2f67c71fa2a34ab8d292dd0e0
Reviewed-on: https://go-review.googlesource.com/c/go/+/400335
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
As is already done in strings package.
Change-Id: Ia45e6443ddf6beac5e70a1cc493119030e173139
GitHub-Last-Rev: 1174c25035
GitHub-Pull-Request: golang/go#52348
Reviewed-on: https://go-review.googlesource.com/c/go/+/400239
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
This code was trying to iterate codepoints, but didn't reslice the string,
so it was reading the first codepoint over and over, if the string length was
not a multiple of the first codepoint length, this would cause to overshoot
past the end of the string.
This was a latent bug introduced in CL 384265 but was revealed to
Ngolo-fuzzing in OSS-Fuzz in CL 397277.
Fixes#52353
Change-Id: I13f0352e6ad13a42878927f3b1c18c58360dd40c
GitHub-Last-Rev: 424f6cfad1
GitHub-Pull-Request: golang/go#52356
Reviewed-on: https://go-review.googlesource.com/c/go/+/400240
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
No test because we already have a test in the syscall package.
The issue reports 1 failure per 100,000 iterations, which is rare enough
that our builders won't catch the problem.
Fixes#52226
Change-Id: I17633ff6cf676b6d575356186dce42cdacad0746
Reviewed-on: https://go-review.googlesource.com/c/go/+/400315
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
There are some symbol attributes that are encoded in the object
file. Currently, they are lost when cloning a symbol to external.
Copy them over.
Also delete CopyAttributes as it is no longer called anywhere.
Change-Id: I1497e3223a641704bf35aa3e904dd0eda2f8ec3e
Reviewed-on: https://go-review.googlesource.com/c/go/+/400574
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
No separate test because this makes no difference for valid PE files.
Fixes#52350
Change-Id: I2aa011a4e8b34cb08052222e94c52627ebe99fbf
Reviewed-on: https://go-review.googlesource.com/c/go/+/400378
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
The current implementation, although more succinct, relies on a runtime
lookup to a "constant" unexported map (which also needs to be
initialized at runtime).
The proposed implementation is able to be optimized by the compiler at
build-time, resulting in *much* more efficient instructions.
Additionally, unused string literals may even be removed altogether
from the generated binary in some cases.
This change is fully backwards-compatible behavior-wise with the
existing implementation.
Change-Id: I36450320aacff5b322195820552f2831d4fecd52
GitHub-Last-Rev: e2058f132e
GitHub-Pull-Request: golang/go#49811
Reviewed-on: https://go-review.googlesource.com/c/go/+/367201
Run-TryBot: Damien Neil <dneil@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: Damien Neil <dneil@google.com>
Refuse to create certificates with negative serial numbers, as they
are explicitly disallowed by RFC 5280.
We still allow parsing certificates with negative serial numbers,
because in the past there were buggy CA implementations which would
produce them (although there are currently *no* trusted certificates
that have this issue). We may want to revisit this decision if we can
find metrics about the prevalence of this issue in enterprise settings.
Change-Id: I131262008db99b6354f542f335abc68775a2d6d0
Reviewed-on: https://go-review.googlesource.com/c/go/+/400494
Reviewed-by: Damien Neil <dneil@google.com>
Reviewed-by: Roland Shoemaker <roland@golang.org>
Run-TryBot: Roland Shoemaker <roland@golang.org>
Auto-Submit: Roland Shoemaker <roland@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Fixes#52239
Change-Id: I08b75e613e3c976855e39d01a6757d94e4207bf8
Reviewed-on: https://go-review.googlesource.com/c/go/+/399155
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Damien Neil <dneil@google.com>
Run-TryBot: Damien Neil <dneil@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
This adds a straight-forward implementation of the functionality.
A more performant version could be added that unrolls the loop
as is done in google.golang.org/protobuf/encoding/protowire,
but usages that demand high performance can use that package instead.
Fixes#51644
Change-Id: I9d3b615a60cdff47e5200e7e5d2276adf4c93783
Reviewed-on: https://go-review.googlesource.com/c/go/+/400176
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Run-TryBot: Joseph Tsai <joetsai@digital-static.net>
TryBot-Result: Gopher Robot <gobot@golang.org>
In issue #17671, there are a endless loop if printing
the panic value panics, CL 30358 has fixed that.
As issue #52257 pointed out, above change should not
discard the value from panic while panicking.
With this CL, when we recover from a panic in error.Error()
or stringer.String(), and the recovered value is string,
then we can print it normally.
Fixes#52257
Change-Id: Icfcc4a1a390635de405eea04904b4607ae9e3055
Reviewed-on: https://go-review.googlesource.com/c/go/+/399874
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
The noopt builder is broken, because with -N we get two OpSB opcodes
(one for the function as a whole, one introduced by the jumptable
rewrite rule), and they fight each other for a register.
Without -N, the two OpSB get CSEd, so optimized builds are ok.
Maybe we fix regalloc to deal with this case, but it's simpler
(and maybe more correct?) to disable jump tables with -N.
Change-Id: I75c87f12de6262955d1df787f47c53de976f8a5f
Reviewed-on: https://go-review.googlesource.com/c/go/+/400455
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Don't create certificates that have serial numbers that are longer
than 20 octets (when encoded), since these are explicitly disallowed
by RFC 5280.
Change-Id: I292b7001f45bed0971b2d519b6de26f0b90860ae
Reviewed-on: https://go-review.googlesource.com/c/go/+/400377
Reviewed-by: Roland Shoemaker <roland@golang.org>
Run-TryBot: Roland Shoemaker <roland@golang.org>
Auto-Submit: Roland Shoemaker <roland@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Damien Neil <dneil@google.com>
Reorganize the way we rewrite expression switches on strings, so that
jump tables are naturally used for the outer switch on the string length.
The changes to the prove pass in this CL are required so as to not repeat
the test for string length in each case.
name old time/op new time/op delta
SwitchStringPredictable 2.28ns ± 9% 2.08ns ± 5% -9.04% (p=0.000 n=10+10)
SwitchStringUnpredictable 10.5ns ± 1% 9.5ns ± 1% -9.08% (p=0.000 n=9+10)
Update #5496
Update #34381
Change-Id: Ie6846b1dd27f3e472f7c30dfcc598c68d440b997
Reviewed-on: https://go-review.googlesource.com/c/go/+/395714
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@google.com>
So that the inliner knows all the other cases are dead and doesn't
accumulate any cost for them.
The canonical case for this is switching on runtime.GOOS, which occurs
several places in the stdlib.
Fixes#50253
Change-Id: I44823aaebb6c1b03c9b0c12d10086db81954350f
Reviewed-on: https://go-review.googlesource.com/c/go/+/399694
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Performance is kind of hard to exactly quantify.
One big difference between jump tables and the old binary search
scheme is that there's only 1 branch statement instead of O(n) of
them. That can be both a blessing and a curse, and can make evaluating
jump tables very hard to do.
The single branch can become a choke point for the hardware branch
predictor. A branch table jump must fit all of its state in a single
branch predictor entry (technically, a branch target predictor entry).
With binary search that predictor state can be spread among lots of
entries. In cases where the case selection is repetitive and thus
predictable, binary search can perform better.
The big win for a jump table is that it doesn't consume so much of the
branch predictor's resources. But that benefit is essentially never
observed in microbenchmarks, because the branch predictor can easily
keep state for all the binary search branches in a microbenchmark. So
that benefit is really hard to measure.
So predictable switch microbenchmarks are ~useless - they will almost
always favor the binary search scheme. Fully unpredictable switch
microbenchmarks are better, as they aren't lying to us quite so
much. In a perfectly unpredictable situation, a jump table will expect
to incur 1-1/N branch mispredicts, where a binary search would incur
lg(N)/2 of them. That makes the crossover point at about N=4. But of
course switches in real programs are seldom fully unpredictable, so
we'll use a higher crossover point.
Beyond the branch predictor, jump tables tend to execute more
instructions per switch but have no additional instructions per case,
which also argues for a larger crossover.
As far as code size goes, with this CL cmd/go has a slightly smaller
code segment and a slightly larger overall size (from the jump tables
themselves which live in the data segment).
This is a case where some FDO (feedback-directed optimization) would
be really nice to have. #28262
Some large-program benchmarks might help make the case for this
CL. Especially if we can turn on branch mispredict counters so we can
see how much using jump tables can free up branch prediction resources
that can be gainfully used elsewhere in the program.
name old time/op new time/op delta
Switch8Predictable 1.89ns ± 2% 1.27ns ± 3% -32.58% (p=0.000 n=9+10)
Switch8Unpredictable 9.33ns ± 1% 7.50ns ± 1% -19.60% (p=0.000 n=10+9)
Switch32Predictable 2.20ns ± 2% 1.64ns ± 1% -25.39% (p=0.000 n=10+9)
Switch32Unpredictable 10.0ns ± 2% 7.6ns ± 2% -24.04% (p=0.000 n=10+10)
Fixes#5496
Update #34381
Change-Id: I3ff56011d02be53f605ca5fd3fb96b905517c34f
Reviewed-on: https://go-review.googlesource.com/c/go/+/357330
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Check for insane statement list attribute values when
constructing LineReader's for a compilation unit.
Fixes#52354.
Change-Id: Icb5298db31f6c5fe34c44e0ed4fe277a7cd676b9
Reviewed-on: https://go-review.googlesource.com/c/go/+/400255
Run-TryBot: Than McIntosh <thanm@google.com>
Auto-Submit: Than McIntosh <thanm@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Name the arguments in a way that is more self-describing.
Many code editor tools show a snippet of the function and
its arguments. However, "x" and "y" are not helpful in determining
which is the sign and which is the magnitude,
short of reading the documentation itself.
Name the sign argument as "sign" to be explicit.
This follows the same naming convention as IsInf.
Change-Id: Ie3055009e475f96c92d5ea7bfe9828eed908c78b
Reviewed-on: https://go-review.googlesource.com/c/go/+/400177
Run-TryBot: Joseph Tsai <joetsai@digital-static.net>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
We should prefer a constant shift op to a X shift op.
That way we don't have to materialize the constant to shift by.
Should fix GOAMD64=v3 builder
Change-Id: I56b45d2940c959382b970e3f962ed4a09cc2a239
Reviewed-on: https://go-review.googlesource.com/c/go/+/400254
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Wayne Zuo <wdvxdr@golangcn.org>
Run-TryBot: Dmitri Shuralyov <dmitshur@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
This offR accumulation isn't used and some really similar code is done
later in the Load results block.
Change-Id: I2f77a7bfd568e7e5eb9fc519e7c552401b3af9b8
GitHub-Last-Rev: 2c91e5c898
GitHub-Pull-Request: golang/go#52316
Reviewed-on: https://go-review.googlesource.com/c/go/+/400094
Run-TryBot: Keith Randall <khr@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Otherwise we panic if either pool is nil.
Change-Id: I8598e3c0f3a5294135f1c330e319128d552ebb67
Reviewed-on: https://go-review.googlesource.com/c/go/+/399161
Reviewed-by: Damien Neil <dneil@google.com>
Run-TryBot: Roland Shoemaker <roland@golang.org>
Auto-Submit: Roland Shoemaker <roland@golang.org>
Reviewed-by: Roland Shoemaker <roland@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
In CreateCertificate, if there are no extensions, don't include the
extensions SEQUENCE in the encoded certificate.
Why, you might ask, does the encoding/asn1 tag 'optional' not do
the same thing as 'omitempty'? Good question, no clue, fixing that
would probably break things in horrific ways.
Fixes#52319
Change-Id: I84fdd5ff3e4e0b0a59e3bf86e7439753b1e1477b
Reviewed-on: https://go-review.googlesource.com/c/go/+/399827
Run-TryBot: Roland Shoemaker <roland@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Damien Neil <dneil@google.com>
Reviewed-by: Roland Shoemaker <roland@golang.org>
Auto-Submit: Roland Shoemaker <roland@golang.org>
This allows memmove and memclr to be invoked using the new
register ABI on riscv64.
Change-Id: I3308d52e06547836cffcc533740fe535624e78d8
Reviewed-on: https://go-review.googlesource.com/c/go/+/361975
Run-TryBot: mzh <mzh@golangcn.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>