1
0
mirror of https://github.com/golang/go synced 2024-11-23 09:20:05 -07:00
Commit Graph

55012 Commits

Author SHA1 Message Date
qmuntal
3732a17806 cmd/go,internal/platform: enable pie buildmode for windows/arm64
This CL adds windows/arm64 to the list of ports that supports PIE
build mode. It is probably an oversight that this port is not marked
as pie-capable because windows/arm64 only supports PIE build mode.

Fixes #56872

Change-Id: I2bdd3ac207280f47ddcf8c2582f13025dafb9278
Reviewed-on: https://go-review.googlesource.com/c/go/+/452415
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-11-21 16:49:48 +00:00
Than McIntosh
f0331c524e testing: skip flaky TestRaiseException on windows-amd64-2012-*
Modify skip rule for TestRaiseException to trigger on both the base
builder (windows-amd64-2012) and the newcc canary builder
(windows-amd64-2012-newcc).

Updates #49681.

Change-Id: I132f9ddd102666b68ad04cc661fdcc2cd841051a
Reviewed-on: https://go-review.googlesource.com/c/go/+/451294
Auto-Submit: Than McIntosh <thanm@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Than McIntosh <thanm@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
2022-11-21 16:27:03 +00:00
Filippo Valsorda
335e7647f5 crypto/internal/bigmod: add amd64 assembly core
With this change, we are down to 1.2x the running time of the previous
variable time implementation.

name                    old time/op  new time/op    delta
DecryptPKCS1v15/2048-4  1.37ms ± 0%    1.61ms ± 0%    +17.54%  (p=0.000 n=18+10)
DecryptPKCS1v15/3072-4  3.99ms ± 1%    5.46ms ± 1%    +36.64%  (p=0.000 n=20+10)
DecryptPKCS1v15/4096-4  8.95ms ± 1%   12.04ms ± 0%    +34.53%  (p=0.000 n=20+10)
EncryptPKCS1v15/2048-4  9.24µs ± 7%  223.34µs ± 0%  +2317.67%  (p=0.000 n=20+9)
DecryptOAEP/2048-4      1.38ms ± 1%    1.62ms ± 0%    +17.31%  (p=0.000 n=20+10)
EncryptOAEP/2048-4      11.5µs ± 6%   225.4µs ± 0%  +1851.82%  (p=0.000 n=20+10)
SignPKCS1v15/2048-4     1.38ms ± 0%    1.68ms ± 0%    +21.25%  (p=0.000 n=20+9)
VerifyPKCS1v15/2048-4   8.75µs ±11%  221.94µs ± 0%  +2435.02%  (p=0.000 n=20+9)
SignPSS/2048-4          1.39ms ± 1%    1.68ms ± 0%    +21.18%  (p=0.000 n=20+10)
VerifyPSS/2048-4        11.1µs ± 8%   224.7µs ± 0%  +1917.03%  (p=0.000 n=20+8)

Change-Id: I2a91ba99fcd0f86f2b5191d17170da755d7c4690
Reviewed-on: https://go-review.googlesource.com/c/go/+/452095
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Filippo Valsorda <filippo@golang.org>
Run-TryBot: Filippo Valsorda <filippo@golang.org>
Reviewed-by: Roland Shoemaker <roland@golang.org>
2022-11-21 16:19:43 +00:00
Filippo Valsorda
08f2091ce0 crypto/ecdsa: use bigmod and nistec instead of math/big and crypto/elliptic
Ignoring custom curves, this makes the whole package constant-time.
There is a slight loss in performance for P-384 and P-521 because bigmod
is slower than math/big (but P-256 has an assembly scalar field
inversion, so doesn't use bigmod for anything big).

name                old time/op    new time/op    delta
Sign/P256-8           19.2µs ± 2%    19.1µs ± 2%     ~     (p=0.268 n=9+10)
Sign/P384-8            166µs ± 3%     188µs ± 2%  +13.52%  (p=0.000 n=10+10)
Sign/P521-8            337µs ± 2%     359µs ± 2%   +6.46%  (p=0.000 n=10+10)
Verify/P256-8         58.1µs ± 2%    58.1µs ± 2%     ~     (p=0.971 n=10+10)
Verify/P384-8          484µs ± 2%     569µs ±12%  +17.65%  (p=0.000 n=10+10)
Verify/P521-8         1.03ms ± 4%    1.14ms ± 2%  +11.02%  (p=0.000 n=10+10)
GenerateKey/P256-8    12.4µs ±12%    12.0µs ± 2%     ~     (p=0.063 n=10+10)
GenerateKey/P384-8     129µs ±18%     119µs ± 2%     ~     (p=0.190 n=10+10)
GenerateKey/P521-8     241µs ± 2%     240µs ± 2%     ~     (p=0.436 n=10+10)

name                old alloc/op   new alloc/op   delta
Sign/P256-8           3.08kB ± 0%    2.47kB ± 0%  -19.77%  (p=0.000 n=10+10)
Sign/P384-8           6.16kB ± 0%    2.64kB ± 0%  -57.16%  (p=0.000 n=10+10)
Sign/P521-8           7.87kB ± 0%    3.01kB ± 0%  -61.80%  (p=0.000 n=10+10)
Verify/P256-8         1.29kB ± 1%    0.48kB ± 0%  -62.69%  (p=0.000 n=10+10)
Verify/P384-8         2.49kB ± 1%    0.64kB ± 0%  -74.25%  (p=0.000 n=10+10)
Verify/P521-8         3.31kB ± 0%    0.96kB ± 0%  -71.02%  (p=0.000 n=7+10)
GenerateKey/P256-8      720B ± 0%      920B ± 0%  +27.78%  (p=0.000 n=10+10)
GenerateKey/P384-8      921B ± 0%     1120B ± 0%  +21.61%  (p=0.000 n=9+10)
GenerateKey/P521-8    1.30kB ± 0%    1.44kB ± 0%  +10.45%  (p=0.000 n=10+10)

name                old allocs/op  new allocs/op  delta
Sign/P256-8             45.0 ± 0%      33.0 ± 0%  -26.67%  (p=0.000 n=10+10)
Sign/P384-8             69.0 ± 0%      34.0 ± 0%  -50.72%  (p=0.000 n=10+10)
Sign/P521-8             71.0 ± 0%      35.0 ± 0%  -50.70%  (p=0.000 n=10+10)
Verify/P256-8           23.0 ± 0%      10.0 ± 0%  -56.52%  (p=0.000 n=10+10)
Verify/P384-8           43.0 ± 0%      14.0 ± 0%  -67.44%  (p=0.000 n=10+10)
Verify/P521-8           45.0 ± 0%      14.0 ± 0%  -68.89%  (p=0.000 n=7+10)
GenerateKey/P256-8      13.0 ± 0%      14.0 ± 0%   +7.69%  (p=0.000 n=10+10)
GenerateKey/P384-8      16.0 ± 0%      17.0 ± 0%   +6.25%  (p=0.000 n=10+10)
GenerateKey/P521-8      16.5 ± 3%      17.0 ± 0%   +3.03%  (p=0.033 n=10+10)

Change-Id: I4e074ef039b0f7ffbc436a4cdbe4ef90c647018d
Reviewed-on: https://go-review.googlesource.com/c/go/+/353849
Auto-Submit: Filippo Valsorda <filippo@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Filippo Valsorda <filippo@golang.org>
Reviewed-by: Roland Shoemaker <roland@golang.org>
2022-11-21 16:19:34 +00:00
Filippo Valsorda
d7812ab380 crypto/internal/bigmod: move nat implementation out of crypto/rsa
This will let us reuse it in crypto/ecdsa for the NIST scalar fields.

The main change in API is around encoding and decoding. The SetBytes +
ExpandFor sequence was hacky: SetBytes could produce a bigger size than
the modulus if leading zeroes in the top byte overflowed the limb
boundary, so ExpandFor had to check for and tolerate that. Also, the
caller was responsible for checking that the overflow was actually all
zeroes (which we weren't doing, exposing a crasher in decryption and
signature verification) and then for checking that the result was less
than the modulus. Instead, make SetBytes take a modulus and return an
error if the value overflows. Same with Bytes: we were always allocating
based on Size before FillBytes anyway, so now Bytes takes a modulus.
Finally, SetBig was almost only used for moduli, so replaced
NewModulusFromNat and SetBig with NewModulusFromBig.

Moved the constant-time bitLen to math/big.Int.BitLen. It's slower, but
BitLen is primarily used in cryptographic code, so it's safer this way.

Change-Id: Ibaf7f36d80695578cb80484167d82ce1aa83832f
Reviewed-on: https://go-review.googlesource.com/c/go/+/450055
Auto-Submit: Filippo Valsorda <filippo@golang.org>
Run-TryBot: Filippo Valsorda <filippo@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Roland Shoemaker <roland@golang.org>
2022-11-21 16:19:15 +00:00
Filippo Valsorda
831c6509cc crypto/ed25519: implement Ed25519ctx and Ed25519ph with context
This is missing a test for Ed25519ph with context, since the RFC doesn't
provide one.

Fixes #31804

Change-Id: I20947374c51c6b22fb2835317d00edf816c9a2d2
Reviewed-on: https://go-review.googlesource.com/c/go/+/404274
Auto-Submit: Filippo Valsorda <filippo@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Filippo Valsorda <filippo@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Roland Shoemaker <roland@golang.org>
2022-11-21 15:23:39 +00:00
Paul E. Murphy
8614c525b3 crypto/aes: On ppc64le, use better instructions when available
Several operations emulate instructions available on power9. Use
the GOPPC64_power9 macro provided by the compiler to select the
native instructions if the minimum cpu requirements are met.

Likewise rework the LXSDX_BE to simplify usage when overriding
it. It is only used in one place.

All three configurations are tested via CI.

On POWER9:

pkg:crypto/cipher goos:linux goarch:ppc64le
AESCBCEncrypt1K   949MB/s ± 0%   957MB/s ± 0%  +0.83%
AESCBCDecrypt1K  1.82GB/s ± 0%  1.99GB/s ± 0%  +8.93%
pkg:crypto/aes goos:linux goarch:ppc64le
Encrypt          1.01GB/s ± 0%  1.05GB/s ± 0%  +4.36%
Decrypt           987MB/s ± 0%  1024MB/s ± 0%  +3.77%

Change-Id: I56d0eb845647dd3c43bcad71eb281b499e1d1789
Reviewed-on: https://go-review.googlesource.com/c/go/+/449116
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Auto-Submit: Paul Murphy <murp@ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Joedian Reid <joedian@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
2022-11-21 15:06:26 +00:00
Than McIntosh
cf93b25366 cmd/link: revise DLL import symbol handling
This patch reworks the handling of DLL import symbols in the PE host
object loader to ensure that the Go linker can deal with them properly
during internal linking.

Prior to this point the strategy was to immediately treat an import
symbol reference of the form "__imp__XXX" as if it were a reference to
the corresponding DYNIMPORT symbol XXX, except for certain special
cases. This worked for the most part, but ran into problems in
situations where the target ("XXX") wasn't a previously created
DYNIMPORT symbol (and when these problems happened, the root cause was
not always easy to see).

The new strategy is to not do any renaming or forwarding immediately,
but to delay handling until host object loading is complete. At that
point we make a scan through the newly introduced text+data sections
looking at the relocations that target import symbols, forwarding
the references to the corresponding DYNIMPORT sym where appropriate
and where there are direct refs to the DYNIMPORT syms, tagging them
for stub generation later on.

Updates #35006.
Updates #53540.

Change-Id: I2d42b39141ae150a9f82ecc334001749ae8a3b4a
Reviewed-on: https://go-review.googlesource.com/c/go/+/451738
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Than McIntosh <thanm@google.com>
2022-11-19 23:11:11 +00:00
Than McIntosh
771a98d6b1 misc/cgo/testcshared: handle unsuffixed dlltool path
Adapt the testcshared tests to handle the case where the path output
by invoking

  gcc -print-prog-name=dlltool

is a path lacking the final ".exe" suffix (this seems to be what clang
is doing); tack it on before using if this is the case.

Updates #35006.
Updates #53540.

Change-Id: I04fb7b9fc90677880b1ced4a4ad2a8867a3f5f86
Reviewed-on: https://go-review.googlesource.com/c/go/+/451816
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Run-TryBot: Than McIntosh <thanm@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2022-11-19 23:10:07 +00:00
Than McIntosh
bda0235466 cmd/link: add capturehostobjs debugging flag
Add a new debugging flag "-capturehostobjs" that instructs the linker
to capture copies of all object files loaded in during the host object
loading portion of CGO internal linking. The intent is to make it
easier to analyze the objects after the fact (as opposed to having to
dig around inside archives, which can be a "find needle in haystack"
exercise).

Change-Id: I7023a5b72b1b899ea9b3bd6501f069d1f21bbaf0
Reviewed-on: https://go-review.googlesource.com/c/go/+/451737
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2022-11-19 22:39:44 +00:00
Than McIntosh
8205d83fe2 cmd/link: improved host archive debug trace output
When ctxt.Debugvlog > 1, produce additional trace output to describe
which object files are being pulled out of host archive libraries and
why they were pulled (e.g. which symbol had a reference to something
in a library). Intended to make it easier to debug problems with cgo
internal linking.

Change-Id: Icd64aff244b9145162a00cb51642ef32f26adfba
Reviewed-on: https://go-review.googlesource.com/c/go/+/451736
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Than McIntosh <thanm@google.com>
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-11-19 22:39:30 +00:00
Filippo Valsorda
58a2db181b crypto/rsa: allocate nats on the stack for RSA 2048
With a small tweak and the help of the inliner, we preallocate enough
nat backing space to do RSA-2048 on the stack.

We keep the length of the preallocated slices at zero so they don't
silently mask missing expandFor calls.

Surprisingly enough, this doesn't move the CPU benchmark needle much,
but probably reduces GC pressure on larger applications.

name                    old time/op    new time/op    delta
DecryptPKCS1v15/2048-8    1.25ms ± 0%    1.22ms ± 1%   -1.68%  (p=0.000 n=10+9)
DecryptPKCS1v15/3072-8    3.78ms ± 0%    3.73ms ± 1%   -1.33%  (p=0.000 n=9+10)
DecryptPKCS1v15/4096-8    8.62ms ± 0%    8.45ms ± 1%   -1.98%  (p=0.000 n=8+10)
EncryptPKCS1v15/2048-8     140µs ± 1%     136µs ± 0%   -2.43%  (p=0.000 n=9+9)
DecryptOAEP/2048-8        1.25ms ± 0%    1.24ms ± 0%   -0.83%  (p=0.000 n=8+10)
EncryptOAEP/2048-8         140µs ± 0%     137µs ± 0%   -1.82%  (p=0.000 n=8+10)
SignPKCS1v15/2048-8       1.29ms ± 0%    1.29ms ± 1%     ~     (p=0.574 n=8+8)
VerifyPKCS1v15/2048-8      139µs ± 0%     136µs ± 0%   -2.12%  (p=0.000 n=9+10)
SignPSS/2048-8            1.30ms ± 0%    1.28ms ± 0%   -0.96%  (p=0.000 n=8+10)
VerifyPSS/2048-8           140µs ± 0%     137µs ± 0%   -1.99%  (p=0.000 n=10+8)

name                    old alloc/op   new alloc/op   delta
DecryptPKCS1v15/2048-8    15.0kB ± 0%     0.5kB ± 0%  -96.58%  (p=0.000 n=10+10)
DecryptPKCS1v15/3072-8    24.6kB ± 0%     3.3kB ± 0%  -86.74%  (p=0.000 n=10+10)
DecryptPKCS1v15/4096-8    38.9kB ± 0%     4.5kB ± 0%  -88.50%  (p=0.000 n=10+10)
EncryptPKCS1v15/2048-8    18.0kB ± 0%     1.2kB ± 0%  -93.48%  (p=0.000 n=10+10)
DecryptOAEP/2048-8        15.2kB ± 0%     0.7kB ± 0%  -95.10%  (p=0.000 n=10+10)
EncryptOAEP/2048-8        18.2kB ± 0%     1.4kB ± 0%  -92.29%  (p=0.000 n=10+10)
SignPKCS1v15/2048-8       21.9kB ± 0%     0.8kB ± 0%  -96.50%  (p=0.000 n=10+10)
VerifyPKCS1v15/2048-8     17.7kB ± 0%     0.9kB ± 0%  -94.85%  (p=0.000 n=10+10)
SignPSS/2048-8            22.3kB ± 0%     1.2kB ± 0%  -94.77%  (p=0.000 n=10+10)
VerifyPSS/2048-8          17.9kB ± 0%     1.1kB ± 0%  -93.75%  (p=0.000 n=10+10)

name                    old allocs/op  new allocs/op  delta
DecryptPKCS1v15/2048-8       124 ± 0%         3 ± 0%  -97.58%  (p=0.000 n=10+10)
DecryptPKCS1v15/3072-8       140 ± 0%         9 ± 0%  -93.57%  (p=0.000 n=10+10)
DecryptPKCS1v15/4096-8       158 ± 0%         9 ± 0%  -94.30%  (p=0.000 n=10+10)
EncryptPKCS1v15/2048-8      80.0 ± 0%       7.0 ± 0%  -91.25%  (p=0.000 n=10+10)
DecryptOAEP/2048-8           130 ± 0%         9 ± 0%  -93.08%  (p=0.000 n=10+10)
EncryptOAEP/2048-8          86.0 ± 0%      13.0 ± 0%  -84.88%  (p=0.000 n=10+10)
SignPKCS1v15/2048-8          162 ± 0%         4 ± 0%  -97.53%  (p=0.000 n=10+10)
VerifyPKCS1v15/2048-8       79.0 ± 0%       6.0 ± 0%  -92.41%  (p=0.000 n=10+10)
SignPSS/2048-8               167 ± 0%         9 ± 0%  -94.61%  (p=0.000 n=10+10)
VerifyPSS/2048-8            84.0 ± 0%      11.0 ± 0%  -86.90%  (p=0.000 n=10+10)

Change-Id: I511a2f5f6f596bbec68a0a411e83a9d04080d72a
Reviewed-on: https://go-review.googlesource.com/c/go/+/445021
Run-TryBot: Filippo Valsorda <filippo@golang.org>
Reviewed-by: Joedian Reid <joedian@golang.org>
Reviewed-by: Roland Shoemaker <roland@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-11-19 16:50:07 +00:00
Filippo Valsorda
72d2c4c635 crypto/rsa: use R*R multiplication to get into the Montgomery domain
This is faster than the current code because computing RR involves
one more shiftIn and using it involves an extra multiplication, but each
exponentiation was doing montgomeryRepresentation twice, once for x and
once for 1, and now they share the RR precomputation.

More importantly, it allows precomputing the value and attaching it to
the private key in a future CL.

name                    old time/op  new time/op  delta
DecryptPKCS1v15/2048-8  1.46ms ± 0%  1.40ms ± 7%   -3.69%  (p=0.003 n=10+9)
DecryptPKCS1v15/3072-8  4.23ms ± 0%  4.13ms ± 4%   -2.36%  (p=0.004 n=9+9)
DecryptPKCS1v15/4096-8  9.42ms ± 0%  9.08ms ± 3%   -3.69%  (p=0.000 n=9+10)
EncryptPKCS1v15/2048-8   221µs ± 0%   137µs ± 1%  -37.91%  (p=0.000 n=9+10)
DecryptOAEP/2048-8      1.46ms ± 0%  1.39ms ± 1%   -4.97%  (p=0.000 n=9+10)
EncryptOAEP/2048-8       221µs ± 0%   138µs ± 0%  -37.71%  (p=0.000 n=8+10)
SignPKCS1v15/2048-8     1.68ms ± 0%  1.53ms ± 1%   -8.85%  (p=0.000 n=9+10)
VerifyPKCS1v15/2048-8    220µs ± 0%   137µs ± 1%  -37.84%  (p=0.000 n=9+10)
SignPSS/2048-8          1.68ms ± 0%  1.52ms ± 1%   -9.16%  (p=0.000 n=8+8)
VerifyPSS/2048-8         234µs ±12%   138µs ± 1%  -40.87%  (p=0.000 n=10+9)

Change-Id: I6c650bad9019765d793fd37a529ca186cf1eeef7
Reviewed-on: https://go-review.googlesource.com/c/go/+/445019
Reviewed-by: Roland Shoemaker <roland@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Filippo Valsorda <filippo@golang.org>
2022-11-19 16:49:53 +00:00
Filippo Valsorda
5aa6313e58 crypto/rsa: precompute moduli
This change adds some private fields to PrecomputedValues.

If applications were for some reason manually computing the
PrecomputedValues, which they can't do anymore, things will still work
but revert back to the unoptimized path.

name                    old time/op  new time/op  delta
DecryptPKCS1v15/2048-8  1.40ms ± 0%  1.24ms ± 0%  -10.98%  (p=0.000 n=10+8)
DecryptPKCS1v15/3072-8  4.14ms ± 0%  3.78ms ± 1%   -8.55%  (p=0.000 n=10+10)
DecryptPKCS1v15/4096-8  9.09ms ± 0%  8.62ms ± 0%   -5.20%  (p=0.000 n=9+8)
EncryptPKCS1v15/2048-8   139µs ± 0%   138µs ± 0%     ~     (p=0.436 n=9+9)
DecryptOAEP/2048-8      1.40ms ± 0%  1.25ms ± 0%  -11.01%  (p=0.000 n=9+9)
EncryptOAEP/2048-8       139µs ± 0%   139µs ± 0%     ~     (p=0.315 n=10+10)
SignPKCS1v15/2048-8     1.53ms ± 0%  1.29ms ± 0%  -15.93%  (p=0.000 n=9+10)
VerifyPKCS1v15/2048-8    138µs ± 0%   138µs ± 0%     ~     (p=0.052 n=10+10)
SignPSS/2048-8          1.54ms ± 0%  1.29ms ± 0%  -15.89%  (p=0.000 n=9+9)
VerifyPSS/2048-8         139µs ± 0%   139µs ± 0%     ~     (p=0.442 n=8+8)

Change-Id: I843c468db96aa75b18ddff17cec3eadfb579cd0e
Reviewed-on: https://go-review.googlesource.com/c/go/+/445020
Reviewed-by: Joedian Reid <joedian@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Filippo Valsorda <filippo@golang.org>
Reviewed-by: Roland Shoemaker <roland@golang.org>
2022-11-19 16:48:51 +00:00
Filippo Valsorda
ee5ccc9d4a crypto/rsa: deprecate and de-optimize multi-prime RSA
I have never encountered multi-prime RSA in the wild. A GitHub-wide
search reveals exactly two explicit uses of it (and a couple of tools
that leave the number configurable but defaulting to two).

https://github.com/decred/tumblebit/blob/31898baea/puzzle/puzzlekey.go#L38
https://github.com/carl-mastrangelo/pixur/blob/95d4a4208/tools/genkeys/genkeys.go#L13

Multi-prime RSA has a slight performance advantage, but has limited
compatibility and the number of primes must be chosen carefully based on
the key size to avoid security issues. It also requires a completely
separate and rarely used private key operation code path, which if buggy
or incorrect would leak the private key.

Mark it as deprecated, and remove the dedicated CRT optimization,
falling back instead to the slower but safer non-CRT fallback.

Change-Id: Iba95edc044fcf9b37bc1f4bb59c6ea273975837f
Reviewed-on: https://go-review.googlesource.com/c/go/+/445017
Reviewed-by: Roland Shoemaker <roland@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Filippo Valsorda <filippo@golang.org>
2022-11-19 16:48:39 +00:00
Lúcás Meier
8a81fdf165 crypto/rsa: replace big.Int for encryption and decryption
Infamously, big.Int does not provide constant-time arithmetic, making
its use in cryptographic code quite tricky. RSA uses big.Int
pervasively, in its public API, for key generation, precomputation, and
for encryption and decryption. This is a known problem. One mitigation,
blinding, is already in place during decryption. This helps mitigate the
very leaky exponentiation operation. Because big.Int is fundamentally
not constant-time, it's unfortunately difficult to guarantee that
mitigations like these are completely effective.

This patch removes the use of big.Int for encryption and decryption,
replacing it with an internal nat type instead. Signing and verification
are also affected, because they depend on encryption and decryption.

Overall, this patch degrades performance by 55% for private key
operations, and 4-5x for (much faster) public key operations.
(Signatures do both, so the slowdown is worse than decryption.)

name                    old time/op  new time/op    delta
DecryptPKCS1v15/2048-8  1.50ms ± 0%    2.34ms ± 0%    +56.44%  (p=0.000 n=8+10)
DecryptPKCS1v15/3072-8  4.40ms ± 0%    6.79ms ± 0%    +54.33%  (p=0.000 n=10+9)
DecryptPKCS1v15/4096-8  9.31ms ± 0%   15.14ms ± 0%    +62.60%  (p=0.000 n=10+10)
EncryptPKCS1v15/2048-8  8.16µs ± 0%  355.58µs ± 0%  +4258.90%  (p=0.000 n=10+9)
DecryptOAEP/2048-8      1.50ms ± 0%    2.34ms ± 0%    +55.68%  (p=0.000 n=10+9)
EncryptOAEP/2048-8      8.51µs ± 0%  355.95µs ± 0%  +4082.75%  (p=0.000 n=10+9)
SignPKCS1v15/2048-8     1.51ms ± 0%    2.69ms ± 0%    +77.94%  (p=0.000 n=10+10)
VerifyPKCS1v15/2048-8   7.25µs ± 0%  354.34µs ± 0%  +4789.52%  (p=0.000 n=9+9)
SignPSS/2048-8          1.51ms ± 0%    2.70ms ± 0%    +78.80%  (p=0.000 n=9+10)
VerifyPSS/2048-8        8.27µs ± 1%  355.65µs ± 0%  +4199.39%  (p=0.000 n=10+10)

Keep in mind that this is without any assembly at all, and that further
improvements are likely possible. I think having a review of the logic
and the cryptography would be a good idea at this stage, before we
complicate the code too much through optimization.

The bulk of the work is in nat.go. This introduces two new types: nat,
representing natural numbers, and modulus, representing moduli used in
modular arithmetic.

A nat has an "announced size", which may be larger than its "true size",
the number of bits needed to represent this number. Operations on a nat
will only ever leak its announced size, never its true size, or other
information about its value. The size of a nat is always clear based on
how its value is set. For example, x.mod(y, m) will make the announced
size of x match that of m, since x is reduced modulo m.

Operations assume that the announced size of the operands match what's
expected (with a few exceptions). For example, x.modAdd(y, m) assumes
that x and y have the same announced size as m, and that they're reduced
modulo m.

Nats are represented over unsatured bits.UintSize - 1 bit limbs. This
means that we can't reuse the assembly routines for big.Int, which use
saturated bits.UintSize limbs. The advantage of unsaturated limbs is
that it makes Montgomery multiplication faster, by needing fewer
registers in a hot loop. This makes exponentiation faster, which
consists of many Montgomery multiplications.

Moduli use nat internally. Unlike nat, the true size of a modulus always
matches its announced size. When creating a modulus, any zero padding is
removed. Moduli will also precompute constants when created, which is
another reason why having a separate type is desirable.

Updates #20654

Co-authored-by: Filippo Valsorda <filippo@golang.org>
Change-Id: I73b61f87d58ab912e80a9644e255d552cbadcced
Reviewed-on: https://go-review.googlesource.com/c/go/+/326012
Run-TryBot: Filippo Valsorda <filippo@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Roland Shoemaker <roland@golang.org>
Reviewed-by: Joedian Reid <joedian@golang.org>
2022-11-19 16:48:07 +00:00
Filippo Valsorda
5f60f844be crypto/ecdsa,crypto/x509: add encoding paths for NIST crypto/ecdh keys
Fixes #56088
Updates #52221

Change-Id: Id2f806a116100a160be7daafc3e4c0be2acdd6a9
Reviewed-on: https://go-review.googlesource.com/c/go/+/450816
Run-TryBot: Filippo Valsorda <filippo@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Joedian Reid <joedian@golang.org>
Reviewed-by: Roland Shoemaker <roland@golang.org>
2022-11-19 16:45:10 +00:00
Joel Sing
e84ce0802d runtime: change tfork behaviour to unbreak openbsd/mips64
Currently, tfork on openbsd/mips64 returns the thread ID on success and
a negative error number on error. In CL#447175, newosproc was changed
to assume that a non-zero value is an error - return zero on success to
match this expectation.

Change-Id: I955efad49b149146165eba3d05fe40ba75caa098
Reviewed-on: https://go-review.googlesource.com/c/go/+/451257
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Joedian Reid <joedian@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Joel Sing <joel@sing.id.au>
2022-11-19 03:33:26 +00:00
Damien Neil
f4f8397fed net/http: deflake TestIssue4191_InfiniteGetTimeout
This test exercises the case where a net.Conn error occurs while
writing a response body. It injects an error by setting a timeout
on the Conn. If this timeout expires before response headers are
written, the test fails. The test attempts to recover from this
failure by extending the timeout and retrying.

Set the timeout after the response headers are removed, and
remove the retry loop.

Fixes #56274.

Change-Id: I293f8bedb7b20a21d14f43ea9bb48fc56b59441c
Reviewed-on: https://go-review.googlesource.com/c/go/+/452175
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Run-TryBot: Damien Neil <dneil@google.com>
2022-11-19 01:19:55 +00:00
Damien Neil
c6cdfd88c7 net/http: direct server logs to test output in tests
Set a logger in newClientServerTest that directs the server
log output to the testing.T's log, so log output gets properly
associated with the test that caused it.

Change-Id: I13686ca35c3e21adae16b2fc37ce36daea3df9d5
Reviewed-on: https://go-review.googlesource.com/c/go/+/452075
Run-TryBot: Damien Neil <dneil@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
2022-11-19 01:19:45 +00:00
Dmitri Shuralyov
4f0d3bcd6d net/http: regenerate h2_bundle.go
Done with:

	go generate -run=bundle std

After CL 452096 updated the x/net version.

Change-Id: I1c1cd76d4ec9e14f45dc66c945c74e41ff689a30
Reviewed-on: https://go-review.googlesource.com/c/go/+/452195
Reviewed-by: Roland Shoemaker <roland@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
Run-TryBot: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2022-11-18 23:57:13 +00:00
Roland Shoemaker
04d6aa6514 crypto/x509: implement SetFallbackRoots
Adds a method which allows users to set a fallback certificate pool for
usage during verification if the system certificate pool is empty.

Updates #43958

Change-Id: I279dd2f753743bce19790f2ae29f063c89c9359d
Reviewed-on: https://go-review.googlesource.com/c/go/+/449235
Run-TryBot: Roland Shoemaker <roland@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Roland Shoemaker <roland@golang.org>
Reviewed-by: Damien Neil <dneil@google.com>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
2022-11-18 23:57:10 +00:00
Filippo Valsorda
c8244489cc all: update golang.org/x/crypto to 2c476679df9a
To pick up CL 451515.

This CL also updates x/net because x/crypto's dependency was bumped
while tagging v0.3.0.

Done by
        go get -d golang.org/x/crypto@2c476679df9a
        go mod tidy
        go mod vendor

Change-Id: I432a04586be3784b1027aa9b62d86c0df6d4a97e
Reviewed-on: https://go-review.googlesource.com/c/go/+/452096
Reviewed-by: Roland Shoemaker <roland@golang.org>
Auto-Submit: Filippo Valsorda <filippo@golang.org>
Reviewed-by: Damien Neil <dneil@google.com>
Run-TryBot: Filippo Valsorda <filippo@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-11-18 22:32:44 +00:00
David Chase
ea2c27fe82 cmd/compile: package-annotate structs when error would be ambiguous
Before emitting a "wanted Foo but got Bar" message for an interface
type match failure, check that Foo and Bar are different.  If they
are not, add package paths to first unexported struct field seen,
because that is the cause (a cause, there could be more than one).

Replicated in go/types.

Added tests to go/types and cmd/compile/internal/types2

Fixes #54258.

Change-Id: Ifc2b2067d62fe2138996972cdf3b6cb7ca0ed456
Reviewed-on: https://go-review.googlesource.com/c/go/+/422914
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
2022-11-18 21:48:06 +00:00
Michael Matloob
dccc58e1b9 cmd/go: don't report non-go files in CompiledGoFiles
We save non-go files in the cached srcfiles file because we want the
non-go files for vet, but we shouldn't report them in CompiledGoFiles.
Filter them out before adding them to CompiledGoFiles.

Fixes #28749

Change-Id: I889d4bbf8c4ec1348584a62ef5e4f8b3f05e97da
Reviewed-on: https://go-review.googlesource.com/c/go/+/451285
Run-TryBot: Michael Matloob <matloob@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Matloob <matloob@golang.org>
2022-11-18 21:12:24 +00:00
Michael Matloob
7161fc737d cmd/go/internal/script: check lack of error for non-waiting cmds
In the script engine, if a command does not return a Wait function and
it succeeds, we won't call checkStatus. That means that commands that
don't have a wait function, have a "!" indicating that they are
supposed to fail, and then succeed will spuriously not fail the script
engine test even they were supposed to fail but didn't.

Change-Id: Ic88c3cdd628064d48f14a8a4a2e97cded48890fb
Reviewed-on: https://go-review.googlesource.com/c/go/+/451284
Reviewed-by: Michael Matloob <matloob@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Matloob <matloob@golang.org>
2022-11-18 21:02:24 +00:00
Damien Neil
6fc1f4f906 doc/go1.20: add release notes for net package
For #50101
For #51152
For #53482
For #55301
For #56515

Change-Id: I11edeb4be0a7f80fb72fd7680a3407d081f83b8b
Reviewed-on: https://go-review.googlesource.com/c/go/+/451420
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Damien Neil <dneil@google.com>
Run-TryBot: Damien Neil <dneil@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
2022-11-18 20:22:20 +00:00
Damien Neil
f263d9cd93 net: fix typo in ControlContext parameter names
Change-Id: I35fcfb2d8cafadca36cffeebe0858973895946d7
Reviewed-on: https://go-review.googlesource.com/c/go/+/451419
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Damien Neil <dneil@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Damien Neil <dneil@google.com>
2022-11-18 19:13:54 +00:00
Gabor Tanz
f64c2a2ce5 crypto/tls: add CertificateVerificationError to tls handshake
Fixes #48152

Change-Id: I503f088edeb5574fd5eb5905bff7c3c23b2bc8fc
GitHub-Last-Rev: 2b0e982f3f
GitHub-Pull-Request: golang/go#56686
Reviewed-on: https://go-review.googlesource.com/c/go/+/449336
Run-TryBot: Roland Shoemaker <roland@golang.org>
Auto-Submit: Roland Shoemaker <roland@golang.org>
Reviewed-by: Damien Neil <dneil@google.com>
Reviewed-by: Julie Qiu <julieqiu@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Roland Shoemaker <roland@golang.org>
2022-11-18 18:50:57 +00:00
Michael Matloob
fd00c14bf1 cmd/go: replace 'directory .' with 'current directory' in some errors
To make the error clearer

Fixes #56697

Change-Id: Idfb5e8704d1bfc64bd0a09d5b553086d9ba5ac33
Reviewed-on: https://go-review.googlesource.com/c/go/+/451295
Reviewed-by: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Bryan Mills <bcmills@google.com>
Reviewed-by: Joedian Reid <joedian@golang.org>
2022-11-18 18:09:53 +00:00
cui fliter
b2faff18ce all: add missing periods in comments
Change-Id: I69065f8adf101fdb28682c55997f503013a50e29
Reviewed-on: https://go-review.googlesource.com/c/go/+/449757
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Joedian Reid <joedian@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Joedian Reid <joedian@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
2022-11-18 17:59:44 +00:00
Keith Randall
893964b972 runtime,cmd/link: increase stack guard space when building with -race
More stuff to do = more stack needed. Bump up the guard space when
building with the race detector.

Fixes #54291

Change-Id: I701bc8800507921bed568047d35b8f49c26e7df7
Reviewed-on: https://go-review.googlesource.com/c/go/+/451217
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2022-11-18 16:26:25 +00:00
Joel Sing
e18d07ddc5 runtime: optimise memmove on riscv64
Implement a more optimised memmove on riscv64, where up to 64 bytes are moved
per loop after achieving alignment. In the unaligned case, memory is moved at
up to 8 bytes per loop.

This also avoids doing unaligned loads and stores, which results in kernel
traps and a significant performance penality.

Fixes #48248.

name                               old speed      new speed        delta
Memmove/1-4                        31.3MB/s _ 0%    26.6MB/s _ 0%    -14.95%  (p=0.000 n=3+3)
Memmove/2-4                        50.6MB/s _ 1%    42.6MB/s _ 0%    -15.75%  (p=0.000 n=3+3)
Memmove/3-4                        64.5MB/s _ 1%    53.4MB/s _ 2%    -17.11%  (p=0.001 n=3+3)
Memmove/4-4                        74.9MB/s _ 0%    99.2MB/s _ 0%    +32.55%  (p=0.000 n=3+3)
Memmove/5-4                        82.3MB/s _ 0%    99.0MB/s _ 1%    +20.29%  (p=0.000 n=3+3)
Memmove/6-4                        88.2MB/s _ 0%   102.3MB/s _ 1%    +15.87%  (p=0.000 n=3+3)
Memmove/7-4                        93.4MB/s _ 0%   102.0MB/s _ 0%     +9.18%  (p=0.000 n=3+3)
Memmove/8-4                         188MB/s _ 3%     188MB/s _ 6%       ~     (p=0.964 n=3+3)
Memmove/9-4                         182MB/s _ 6%     163MB/s _ 1%       ~     (p=0.069 n=3+3)
Memmove/10-4                        177MB/s _ 0%     149MB/s _ 4%    -15.93%  (p=0.012 n=3+3)
Memmove/11-4                        171MB/s _ 6%     148MB/s _ 0%    -13.65%  (p=0.045 n=3+3)
Memmove/12-4                        166MB/s _ 5%     209MB/s _ 0%    +26.12%  (p=0.009 n=3+3)
Memmove/13-4                        170MB/s _ 1%     188MB/s _ 4%    +10.76%  (p=0.039 n=3+3)
Memmove/14-4                        158MB/s _ 0%     185MB/s _ 0%    +17.13%  (p=0.000 n=3+3)
Memmove/15-4                        166MB/s _ 0%     175MB/s _ 0%     +5.38%  (p=0.000 n=3+3)
Memmove/16-4                        320MB/s _ 6%     343MB/s _ 0%       ~     (p=0.149 n=3+3)
Memmove/32-4                        493MB/s _ 5%     628MB/s _ 1%    +27.51%  (p=0.008 n=3+3)
Memmove/64-4                        706MB/s _ 0%    1132MB/s _ 0%    +60.32%  (p=0.000 n=3+3)
Memmove/128-4                       837MB/s _ 1%    1623MB/s _ 1%    +93.96%  (p=0.000 n=3+3)
Memmove/256-4                       960MB/s _ 0%    2070MB/s _ 6%   +115.68%  (p=0.003 n=3+3)
Memmove/512-4                      1.04GB/s _ 0%    2.55GB/s _ 0%   +146.05%  (p=0.000 n=3+3)
Memmove/1024-4                     1.08GB/s _ 0%    2.76GB/s _ 0%   +155.62%  (p=0.000 n=3+3)
Memmove/2048-4                     1.10GB/s _ 0%    2.90GB/s _ 1%   +164.31%  (p=0.000 n=3+3)
Memmove/4096-4                     1.11GB/s _ 0%    2.98GB/s _ 0%   +169.77%  (p=0.000 n=3+3)
MemmoveOverlap/32-4                 443MB/s _ 0%     500MB/s _ 0%    +12.81%  (p=0.000 n=3+3)
MemmoveOverlap/64-4                 635MB/s _ 0%     908MB/s _ 0%    +42.92%  (p=0.000 n=3+3)
MemmoveOverlap/128-4                789MB/s _ 0%    1423MB/s _ 0%    +80.28%  (p=0.000 n=3+3)
MemmoveOverlap/256-4                925MB/s _ 0%    1941MB/s _ 0%   +109.86%  (p=0.000 n=3+3)
MemmoveOverlap/512-4               1.01GB/s _ 2%    2.37GB/s _ 0%   +134.86%  (p=0.000 n=3+3)
MemmoveOverlap/1024-4              1.06GB/s _ 0%    2.68GB/s _ 1%   +151.67%  (p=0.000 n=3+3)
MemmoveOverlap/2048-4              1.09GB/s _ 0%    2.89GB/s _ 0%   +164.82%  (p=0.000 n=3+3)
MemmoveOverlap/4096-4              1.11GB/s _ 0%    3.01GB/s _ 0%   +171.30%  (p=0.000 n=3+3)
MemmoveUnalignedDst/1-4            24.1MB/s _ 1%    21.3MB/s _ 0%    -11.76%  (p=0.000 n=3+3)
MemmoveUnalignedDst/2-4            41.6MB/s _ 1%    35.9MB/s _ 0%    -13.72%  (p=0.000 n=3+3)
MemmoveUnalignedDst/3-4            54.0MB/s _ 0%    45.5MB/s _ 2%    -15.76%  (p=0.004 n=3+3)
MemmoveUnalignedDst/4-4            63.9MB/s _ 1%    81.6MB/s _ 0%    +27.70%  (p=0.000 n=3+3)
MemmoveUnalignedDst/5-4            69.4MB/s _ 6%    84.8MB/s _ 0%    +22.08%  (p=0.015 n=3+3)
MemmoveUnalignedDst/6-4            77.8MB/s _ 2%    89.0MB/s _ 0%    +14.53%  (p=0.004 n=3+3)
MemmoveUnalignedDst/7-4            83.0MB/s _ 0%    90.7MB/s _ 1%     +9.30%  (p=0.000 n=3+3)
MemmoveUnalignedDst/8-4            6.97MB/s _ 2%  127.73MB/s _ 0%  +1732.57%  (p=0.000 n=3+3)
MemmoveUnalignedDst/9-4            7.81MB/s _ 1%  125.41MB/s _ 0%  +1506.45%  (p=0.000 n=3+3)
MemmoveUnalignedDst/10-4           8.59MB/s _ 2%  123.52MB/s _ 0%  +1337.43%  (p=0.000 n=3+3)
MemmoveUnalignedDst/11-4           9.23MB/s _ 6%  119.81MB/s _ 4%  +1197.55%  (p=0.000 n=3+3)
MemmoveUnalignedDst/12-4           10.3MB/s _ 0%   155.9MB/s _ 7%  +1416.08%  (p=0.001 n=3+3)
MemmoveUnalignedDst/13-4           10.9MB/s _ 3%   155.1MB/s _ 0%  +1321.26%  (p=0.000 n=3+3)
MemmoveUnalignedDst/14-4           11.4MB/s _ 5%   151.0MB/s _ 0%  +1229.37%  (p=0.000 n=3+3)
MemmoveUnalignedDst/15-4           12.6MB/s _ 0%   147.0MB/s _ 0%  +1066.39%  (p=0.000 n=3+3)
MemmoveUnalignedDst/16-4           7.17MB/s _ 0%  184.33MB/s _ 5%  +2470.90%  (p=0.001 n=3+3)
MemmoveUnalignedDst/32-4           7.26MB/s _ 0%  252.00MB/s _ 2%  +3371.12%  (p=0.000 n=3+3)
MemmoveUnalignedDst/64-4           7.25MB/s _ 2%  306.37MB/s _ 1%  +4125.75%  (p=0.000 n=3+3)
MemmoveUnalignedDst/128-4          7.32MB/s _ 1%  338.03MB/s _ 1%  +4517.85%  (p=0.000 n=3+3)
MemmoveUnalignedDst/256-4          7.31MB/s _ 0%  361.06MB/s _ 0%  +4841.47%  (p=0.000 n=3+3)
MemmoveUnalignedDst/512-4          7.35MB/s _ 0%  373.55MB/s _ 0%  +4982.36%  (p=0.000 n=3+3)
MemmoveUnalignedDst/1024-4         7.33MB/s _ 0%  379.00MB/s _ 2%  +5068.18%  (p=0.000 n=3+3)
MemmoveUnalignedDst/2048-4         7.31MB/s _ 2%  383.05MB/s _ 0%  +5142.47%  (p=0.000 n=3+3)
MemmoveUnalignedDst/4096-4         7.35MB/s _ 1%  385.97MB/s _ 1%  +5151.25%  (p=0.000 n=3+3)
MemmoveUnalignedDstOverlap/32-4    9.43MB/s _ 0%  233.72MB/s _ 0%  +2377.56%  (p=0.000 n=3+3)
MemmoveUnalignedDstOverlap/64-4    8.13MB/s _ 3%  288.77MB/s _ 0%  +3451.91%  (p=0.000 n=3+3)
MemmoveUnalignedDstOverlap/128-4   7.77MB/s _ 0%  326.62MB/s _ 3%  +4103.65%  (p=0.000 n=3+3)
MemmoveUnalignedDstOverlap/256-4   7.28MB/s _ 6%  357.24MB/s _ 0%  +4804.85%  (p=0.000 n=3+3)
MemmoveUnalignedDstOverlap/512-4   7.44MB/s _ 0%  363.63MB/s _ 7%  +4787.54%  (p=0.001 n=3+3)
MemmoveUnalignedDstOverlap/1024-4  7.37MB/s _ 0%  383.17MB/s _ 0%  +5101.40%  (p=0.000 n=3+3)
MemmoveUnalignedDstOverlap/2048-4  7.29MB/s _ 2%  387.69MB/s _ 0%  +5215.68%  (p=0.000 n=3+3)
MemmoveUnalignedDstOverlap/4096-4  7.18MB/s _ 5%  389.22MB/s _ 0%  +5320.84%  (p=0.000 n=3+3)
MemmoveUnalignedSrc/1-4            24.2MB/s _ 0%    21.4MB/s _ 1%    -11.70%  (p=0.001 n=3+3)
MemmoveUnalignedSrc/2-4            41.7MB/s _ 0%    36.0MB/s _ 0%    -13.71%  (p=0.000 n=3+3)
MemmoveUnalignedSrc/3-4            52.1MB/s _ 6%    46.4MB/s _ 1%       ~     (p=0.074 n=3+3)
MemmoveUnalignedSrc/4-4            60.4MB/s _ 0%    76.4MB/s _ 0%    +26.39%  (p=0.000 n=3+3)
MemmoveUnalignedSrc/5-4            71.2MB/s _ 1%    84.7MB/s _ 0%    +18.90%  (p=0.000 n=3+3)
MemmoveUnalignedSrc/6-4            77.7MB/s _ 0%    88.7MB/s _ 0%    +14.06%  (p=0.000 n=3+3)
MemmoveUnalignedSrc/7-4            82.9MB/s _ 1%    90.7MB/s _ 1%     +9.42%  (p=0.000 n=3+3)
MemmoveUnalignedSrc/8-4            74.6MB/s _ 0%   120.6MB/s _ 0%    +61.62%  (p=0.000 n=3+3)
MemmoveUnalignedSrc/9-4            78.7MB/s _ 1%   123.9MB/s _ 1%    +57.42%  (p=0.000 n=3+3)
MemmoveUnalignedSrc/10-4           82.1MB/s _ 0%   121.7MB/s _ 0%    +48.21%  (p=0.000 n=3+3)
MemmoveUnalignedSrc/11-4           83.7MB/s _ 5%   122.0MB/s _ 0%    +45.79%  (p=0.003 n=3+3)
MemmoveUnalignedSrc/12-4           88.6MB/s _ 0%   160.8MB/s _ 0%    +81.56%  (p=0.000 n=3+3)
MemmoveUnalignedSrc/13-4           91.0MB/s _ 0%   155.0MB/s _ 0%    +70.29%  (p=0.000 n=3+3)
MemmoveUnalignedSrc/14-4           92.0MB/s _ 2%   151.0MB/s _ 0%    +64.09%  (p=0.000 n=3+3)
MemmoveUnalignedSrc/15-4           12.6MB/s _ 0%   146.6MB/s _ 0%  +1063.32%  (p=0.000 n=3+3)
MemmoveUnalignedSrc/16-4           13.3MB/s _ 0%   188.8MB/s _ 2%  +1319.02%  (p=0.000 n=3+3)
MemmoveUnalignedSrc/32-4           9.44MB/s _ 0%  254.24MB/s _ 1%  +2594.21%  (p=0.000 n=3+3)
MemmoveUnalignedSrc/64-4           8.27MB/s _ 0%  302.33MB/s _ 2%  +3555.78%  (p=0.000 n=3+3)
MemmoveUnalignedSrc/128-4          7.73MB/s _ 3%  338.82MB/s _ 0%  +4281.29%  (p=0.000 n=3+3)
MemmoveUnalignedSrc/256-4          7.58MB/s _ 0%  362.19MB/s _ 0%  +4678.23%  (p=0.000 n=3+3)
MemmoveUnalignedSrc/512-4          7.44MB/s _ 1%  374.49MB/s _ 0%  +4933.51%  (p=0.000 n=3+3)
MemmoveUnalignedSrc/1024-4         7.30MB/s _ 2%  379.74MB/s _ 0%  +5099.54%  (p=0.000 n=3+3)
MemmoveUnalignedSrc/2048-4         7.34MB/s _ 2%  385.50MB/s _ 0%  +5154.38%  (p=0.000 n=3+3)
MemmoveUnalignedSrc/4096-4         7.35MB/s _ 1%  383.64MB/s _ 0%  +5119.59%  (p=0.000 n=3+3)
MemmoveUnalignedSrcOverlap/32-4    7.22MB/s _ 0%  254.94MB/s _ 0%  +3432.66%  (p=0.000 n=3+3)
MemmoveUnalignedSrcOverlap/64-4    7.29MB/s _ 1%  296.99MB/s _ 5%  +3973.89%  (p=0.001 n=3+3)
MemmoveUnalignedSrcOverlap/128-4   7.32MB/s _ 1%  336.73MB/s _ 1%  +4500.09%  (p=0.000 n=3+3)
MemmoveUnalignedSrcOverlap/256-4   7.30MB/s _ 1%  361.41MB/s _ 0%  +4850.82%  (p=0.000 n=3+3)
MemmoveUnalignedSrcOverlap/512-4   7.34MB/s _ 0%  374.92MB/s _ 0%  +5007.90%  (p=0.000 n=3+3)
MemmoveUnalignedSrcOverlap/1024-4  7.34MB/s _ 0%  380.15MB/s _ 0%  +5079.16%  (p=0.000 n=3+3)
MemmoveUnalignedSrcOverlap/2048-4  7.36MB/s _ 0%  383.78MB/s _ 0%  +5116.76%  (p=0.000 n=3+3)
MemmoveUnalignedSrcOverlap/4096-4  7.35MB/s _ 0%  386.32MB/s _ 0%  +5156.05%  (p=0.000 n=3+3)

Change-Id: Ibc13230af7b1e205ed95a6470e2cf64ff4251405
Reviewed-on: https://go-review.googlesource.com/c/go/+/426256
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Meng Zhuo <mzh@golangcn.org>
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Joedian Reid <joedian@golang.org>
Run-TryBot: Joel Sing <joel@sing.id.au>
2022-11-18 15:33:16 +00:00
Tobias Klauser
c13ce2985c io/fs: clean up test helper functions
Inline the only use of checkMarks which also allows to drop the
always-true report argument. This also ensures the correct line gets
reported in case of an error.

Also remove the unused markTree function and drop the unused testing.T
argument from makeTree.

Change-Id: I4033d3e5ecd929d08ce03c563aa99444e102d931
Reviewed-on: https://go-review.googlesource.com/c/go/+/451615
Reviewed-by: Joedian Reid <joedian@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Tobias Klauser <tobias.klauser@gmail.com>
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-11-18 15:21:18 +00:00
Paul E. Murphy
1ed636dc97 cmd/link/internal/ppc64: fix trampoline reuse distance calculation
If a compatible trampoline has been inserted by a previously laid
function in the same section, and is known to be sufficiently close,
it can be reused.

When testing if the trampoline can be reused, the addend of the direct
call should be ignored. It is already encoded in the trampoline. If the
addend is non-zero, and the target sufficiently far away, and just
beyond direct call reach, this may cause the trampoline to be
incorrectly reused.

This was observed on go1.17.13 and openshift-installer commit f3c53b382
building in release mode with the following error:

github.com/aliyun/alibaba-cloud-sdk-go/services/cms.(*Client).DescribeMonitoringAgentAccessKeyWithChan.func1: direct call too far: runtime.duffzero+1f0-tramp0-1 -2000078

Fixes #56775

Change-Id: I54af957302506d4e3cd5d3121542c83fe980e912
Reviewed-on: https://go-review.googlesource.com/c/go/+/451415
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Than McIntosh <thanm@google.com>
2022-11-18 14:31:23 +00:00
Tobias Klauser
349d398ea3 cmd/compile/internal/base, cmd/internal/bio: use syscall.Mmap on aix
Change-Id: Ic28612952eb9abf14425f0bb14043b10f6050d94
Reviewed-on: https://go-review.googlesource.com/c/go/+/450195
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
Reviewed-by: Joedian Reid <joedian@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Tobias Klauser <tobias.klauser@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2022-11-18 14:08:47 +00:00
Wayne Zuo
8893da7c72 cmd/compile: fix wrong optimization for eliding Not in Phi
The previous rule may move the phi value into a wrong block.
This CL make it only rewrite the phi value not the If block,
so that the phi value will stay in old block.

Fixes #56777

Change-Id: I9479a5c7f28529786968413d35b82a16181bb1f1
Reviewed-on: https://go-review.googlesource.com/c/go/+/451496
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Wayne Zuo <wdvxdr@golangcn.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
2022-11-18 13:26:33 +00:00
eric fang
205f636e0a cmd/internal/obj/arm64: tidy literal pool
This CL cleans up the literal pool implementation and inserts an UNDEF
instruction before the literal pool if the last instruction of the
function is not an unconditional jump instruction, RET or ERET
instruction.

Change-Id: Ifecb9e3372478362dde246c1bc9bc8d527a469d5
Reviewed-on: https://go-review.googlesource.com/c/go/+/424134
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Joedian Reid <joedian@golang.org>
Run-TryBot: Eric Fang <eric.fang@arm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-11-18 08:04:52 +00:00
Michael Knyszek
e4435cb844 runtime: add page tracer
This change adds a new GODEBUG flag called pagetrace that writes a
low-overhead trace of how pages of memory are managed by the Go runtime.

The page tracer is kept behind a GOEXPERIMENT flag due to a potential
security risk for setuid binaries.

Change-Id: I6f4a2447d02693c25214400846a5d2832ad6e5c0
Reviewed-on: https://go-review.googlesource.com/c/go/+/444157
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-11-18 03:45:30 +00:00
eric fang
0613418c98 cmd/internal/obj/arm64: mark branch instructions in optab
Currently, we judge whether we need to fix up the branch instruction
based on Optab.type_ field, but the type_ field in optab may change.
This CL marks the branch instruction in optab, and checks whether to
do fixing up according to the mark. Depending on the constant parameter
range of the branch instruction, there are two labels, BRANCH14BITS,
BRANCH19BITS. For the 26-bit branch, linker will handle it.

Besides this CL removes the unnecessary alignment of the DWORD
instruction. Because the ISA doesn't require it and no 64-bit load
assume it. The only effect is that there is some performance penalty
for loading from DWORDs if the 8-byte DWORD instruction crosses the
cache line, but this is very rare.

Change-Id: I993902b3fb5ad8e081dd6c441e86bcf581031835
Reviewed-on: https://go-review.googlesource.com/c/go/+/424135
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Eric Fang <eric.fang@arm.com>
Reviewed-by: Joedian Reid <joedian@golang.org>
2022-11-18 02:33:33 +00:00
Robert Findley
0789ca4951 go/types, types2: ensure signatures are instantiated if all type args
are provided

Improve the accuracy of recorded types and instances for function calls,
by instantiating their signature before checking arguments if all type
arguments are provided. This avoids a problem where fully instantiated
function signatures are are not recorded as such following an error
checking their arguments.

Fixes golang/go#51803

Change-Id: Iec4cbd219a2cd19bb1bcf2a5c4019f556e4304b1
Reviewed-on: https://go-review.googlesource.com/c/go/+/451436
Reviewed-by: Robert Griesemer <gri@google.com>
Run-TryBot: Robert Findley <rfindley@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-11-18 01:44:51 +00:00
Robert Griesemer
7f75b72904 go/types, types2: replace some Errorf calls with Error calls (cleanup)
Change-Id: I9b6759a82b8009b323132c78cb7d78c2c35652bd
Reviewed-on: https://go-review.googlesource.com/c/go/+/451815
Reviewed-by: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Findley <rfindley@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
Run-TryBot: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-11-18 00:36:15 +00:00
Robert Griesemer
a8f9d3f0af go/types, types2: replace (internal) writePackage with packagePrefix
This makes it easier to use the package string prefix in some cases
(cleanup).

Change-Id: I0ae74bf8770999110e7d6e49eac4e42962e78596
Reviewed-on: https://go-review.googlesource.com/c/go/+/451795
Auto-Submit: Robert Griesemer <gri@google.com>
Run-TryBot: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-11-18 00:16:52 +00:00
Filippo Valsorda
4571164537 crypto/ecdsa: improve benchmarks
While at it, drop P-224 benchmarks. Nobody cares about P-224.

Change-Id: I31db6fedde6026deff36de963690275dacf5fda1
Reviewed-on: https://go-review.googlesource.com/c/go/+/451196
Auto-Submit: Filippo Valsorda <filippo@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Roland Shoemaker <roland@golang.org>
Reviewed-by: Joedian Reid <joedian@golang.org>
Run-TryBot: Filippo Valsorda <filippo@golang.org>
2022-11-18 00:08:48 +00:00
Keith Randall
d6171c9be2 runtime: fix conflict between lfstack and checkptr
lfstack does very unsafe things. In particular, it will not
work with nodes that live on the heap. In normal use by the runtime,
that is the case (it is only used for gc work bufs). But the lfstack
test does use heap objects. It goes through some hoops to prevent
premature deallocation, but those hoops are not enough to convince
-d=checkptr that everything is ok.

Instead, allocate the test objects outside the heap, like the runtime
does for all of its lfstack usage. Remove the lifetime workaround
from the test.

Reported in https://groups.google.com/g/golang-nuts/c/psjrUV2ZKyI

Change-Id: If611105eab6c823a4d6c105938ce145ed731781d
Reviewed-on: https://go-review.googlesource.com/c/go/+/448899
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
2022-11-17 23:12:04 +00:00
Mateusz Poliwczak
3e5c2c1556 net: return errNoSuchHost when no entry found in /etc/hosts and order is hostLookupFiles
When /etc/nsswitch.conf lists: "hosts: files" then LookupHost returns two nils when no entry inside /etc/hosts is found.

Change-Id: I96d68a079dfe009655c84cf0e697ce19a5bb6698
GitHub-Last-Rev: 894f066bbc
GitHub-Pull-Request: golang/go#56747
Reviewed-on: https://go-review.googlesource.com/c/go/+/450875
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: Joedian Reid <joedian@golang.org>
Auto-Submit: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
2022-11-17 21:42:39 +00:00
Russ Cox
1f4394a0c9 runtime: work around Apple libc bugs to make exec stop hanging
For a while now, we've had intermittent reports about problems with
os/exec on macOS, but no clear way to reproduce them. Recent changes
in the os/exec package test seem to have aligned the stars just right,
at least on my two x86 and ARM MacBook Pro laptops, to make the
package test hang with roughly 50% probability. When it does hang, the
stacks I see in the hung process match the ones reported for the
Go-based hangs in #33565. (They do not match the ones reported in the
so-called C reproducer in that issue, but I think that reproducer is
actually reproducing a different race, between fork and exit.)

The stacks obtained from the hung child processes are in
libSystem_atfork_child, which is supposed to reinitialize various
parts of the C library in the new process.

One common stack dies in _notify_fork_child calling _notify_globals
(inlined) calling _os_alloc_once, because _os_alloc_once detects that
the once lock is held by the parent process and then calls
_os_once_gate_corruption_abort. The allocation is setting up the
globals for the notification subsystem. See the source code at [1].
To work around this, we can allocate the globals earlier in the Go
program's lifetime, before any execs are involved, by calling any
notify routine that is exported, calls _notify_globals, and doesn't do
anything too expensive otherwise. notify_is_valid_token(0) fits the bill.

The other common stack dies in xpc_atfork_child calling
_objc_msgSend_uncached which ends up in
WAITING_FOR_ANOTHER_THREAD_TO_FINISH_CALLING_+initialize. Of course,
whatever thread the child is waiting for is in the parent process and
is not going to finish anything in the child process. There is no
public source code for these routines, so it is unclear exactly what
the problem is. However, xpc_atfork_child turns out to be exported
(for use by libSystem_atfork_child, which is in a different library,
so xpc_atfork_child is unlikely to be unexported any time soon).
It also stands to reason that since xpc_atfork_child is called at the
start of any forked child process, it can't be too harmful to call at
the start of an ordinary Go process. And whatever caches it needs for
a non-deadlocking fast path during exec empirically do get initialized
by calling it at startup.

This CL introduces a function osinit_hack, called at osinit time,
which calls notify_is_valid_token(0) and xpc_atfork_child().
Doing so makes the os/exec test pass reliably on both my laptops -
I can run it successfully hundreds of times in a row when my previous
record was twice in a row.

Fixes #33565.
Fixes #56784.

[1] https://opensource.apple.com/source/Libnotify/Libnotify-241/notify_client.c.auto.html


Change-Id: I16a14a800893c40244678203532a3e8d6214b6bd
Reviewed-on: https://go-review.googlesource.com/c/go/+/451735
Run-TryBot: Russ Cox <rsc@golang.org>
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-11-17 21:15:36 +00:00
Cherry Mui
6e0e492e12 cmd/compile/internal/pgo: count only the last two frames as a call edge
Currently for every CPU profile sample, we apply its weight to all
call edges of the entire call stack. Frames higher up the stack
are unlikely to be repeated calls (e.g. runtime.main calling
main.main). So adding weights to call edges higher up the stack
may be not reflecting the actual call edge weights in the program.
This CL changes it to add weights to only the edge between the
last two frames.

Without a branch profile (e.g. LBR records) this is not perfect,
but seems more reasonable.

Change-Id: I0aee75cc608a152adad41c51120b661a6c542283
Reviewed-on: https://go-review.googlesource.com/c/go/+/450915
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2022-11-17 20:52:28 +00:00
Cherry Mui
3f1bcc58b3 cmd/compile: simplify PGO hot caller/callee computation
Currently, we use CDF to compute a weight threshold and then use
the weight threshold to determine whether a call site is hot. As
when we compute the CDF we already have a list of hot call sites
that make up the given percentage of the CDF, just use that list.

Also, when computing the CDF threshold, include the very last node
that makes it to go over the threshold. (I.e. if the CDF threshold
is 50% and one hot node takes 60% of weight, we should include that
node instead of excluding it. In practice it rarely matters,
probably only for testing and micro-benchmarks.)

Change-Id: I535ae9cd6b679609e247c3d0d9ee572c1a1187cc
Reviewed-on: https://go-review.googlesource.com/c/go/+/450737
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Cherry Mui <cherryyz@google.com>
2022-11-17 20:52:15 +00:00
Cuong Manh Le
81c9b1d65f cmd/compile: fix broken IR for iface -> eface
For implementing interface to empty interface conversion, the compiler
generate code like:

	var res *uint8
	res = itab
	if res != nil {
		res = res.type
	}

However, itab has type *uintptr, so the assignment is broken. The
problem is not shown up, until CL 450215, which call typecheck on this
broken assignment.

To fix this, just cast itab to *uint8 when doing the conversion.

Fixes #56768

Change-Id: Id42792d18e7f382578b40854d46eecd49673792c
Reviewed-on: https://go-review.googlesource.com/c/go/+/451256
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2022-11-17 19:55:28 +00:00