1
0
mirror of https://github.com/golang/go synced 2024-11-26 01:47:58 -07:00
Commit Graph

61586 Commits

Author SHA1 Message Date
Guoqi Chen
4b89120b12 cmd/internal/obj/loong64: switch Lookup function call to ABIInternal mode
CL 521790 has experimentally enabled RegABI support on Loong64, so it
is possible to switch the Lookup function call to ABIInternal mode.

Change-Id: I3ae053e20c0791efebe6b6bdc9a1550a11372bc2
Reviewed-on: https://go-review.googlesource.com/c/go/+/544435
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
Reviewed-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-11-11 00:08:32 +00:00
Guoqi Chen
4b0da6b13f cmd/compiler,internal/runtime/atomic: optimize And{64,32,8} and Or{64,32,8} on loong64
Use loong64's atomic operation instruction AMANDDB{V,W,W} (full barrier) to implement
And{64,32,8}, AMORDB{V,W,W} (full barrier) to implement Or{64,32,8}.

Intrinsify And{64,32,8} and Or{64,32,8}, And this CL alias all of the And/Or operations
into sync/atomic package.

goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A6000-HV @ 2500.00MHz
                |   bench.old    |   bench.new                           |
                |   sec/op       |   sec/op        vs base               |
And32              27.73n ± 0%      10.81n ± 0%   -61.02% (p=0.000 n=20)
And32Parallel      28.96n ± 0%      12.41n ± 0%   -57.15% (p=0.000 n=20)
And64              27.73n ± 0%      10.81n ± 0%   -61.02% (p=0.000 n=20)
And64Parallel      28.96n ± 0%      12.41n ± 0%   -57.15% (p=0.000 n=20)
Or32               27.62n ± 0%      10.81n ± 0%   -60.86% (p=0.000 n=20)
Or32Parallel       28.96n ± 0%      12.41n ± 0%   -57.15% (p=0.000 n=20)
Or64               27.62n ± 0%      10.81n ± 0%   -60.86% (p=0.000 n=20)
Or64Parallel       28.97n ± 0%      12.41n ± 0%   -57.16% (p=0.000 n=20)
And8               29.15n ± 0%      13.21n ± 0%   -54.68% (p=0.000 n=20)
And                27.71n ± 0%      12.82n ± 0%   -53.74% (p=0.000 n=20)
And8Parallel       28.99n ± 0%      14.46n ± 0%   -50.12% (p=0.000 n=20)
AndParallel        29.12n ± 0%      14.42n ± 0%   -50.48% (p=0.000 n=20)
Or8                28.31n ± 0%      12.81n ± 0%   -54.75% (p=0.000 n=20)
Or                 27.72n ± 0%      12.81n ± 0%   -53.79% (p=0.000 n=20)
Or8Parallel        29.03n ± 0%      14.62n ± 0%   -49.64% (p=0.000 n=20)
OrParallel         29.12n ± 0%      14.42n ± 0%   -50.49% (p=0.000 n=20)
geomean            28.47n           12.58n        -55.80%

goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A5000 @ 2500.00MHz
                |   bench.old    |   bench.new                          |
                |   sec/op       |   sec/op        vs base              |
And32              30.02n ± 0%      14.81n ± 0%   -50.67% (p=0.000 n=20)
And32Parallel      30.83n ± 0%      15.61n ± 0%   -49.37% (p=0.000 n=20)
And64              30.02n ± 0%      14.81n ± 0%   -50.67% (p=0.000 n=20)
And64Parallel      30.83n ± 0%      15.61n ± 0%   -49.37% (p=0.000 n=20)
And8               30.42n ± 0%      14.41n ± 0%   -52.63% (p=0.000 n=20)
And                30.02n ± 0%      13.61n ± 0%   -54.66% (p=0.000 n=20)
And8Parallel       31.23n ± 0%      15.21n ± 0%   -51.30% (p=0.000 n=20)
AndParallel        30.83n ± 0%      14.41n ± 0%   -53.26% (p=0.000 n=20)
Or32               30.02n ± 0%      14.81n ± 0%   -50.67% (p=0.000 n=20)
Or32Parallel       30.83n ± 0%      15.61n ± 0%   -49.37% (p=0.000 n=20)
Or64               30.02n ± 0%      14.82n ± 0%   -50.63% (p=0.000 n=20)
Or64Parallel       30.83n ± 0%      15.61n ± 0%   -49.37% (p=0.000 n=20)
Or8                30.02n ± 0%      14.01n ± 0%   -53.33% (p=0.000 n=20)
Or                 30.02n ± 0%      13.61n ± 0%   -54.66% (p=0.000 n=20)
Or8Parallel        30.83n ± 0%      14.81n ± 0%   -51.96% (p=0.000 n=20)
OrParallel         30.83n ± 0%      14.41n ± 0%   -53.26% (p=0.000 n=20)
geomean            30.47n           14.75n        -51.61%

Change-Id: If008ff6a08b51905076f8ddb6e92f8e214d3f7b3
Reviewed-on: https://go-review.googlesource.com/c/go/+/482756
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-11-11 00:08:08 +00:00
Guoqi Chen
72a92ab5b7 cmd/compiler,internal/runtime/atomic: optimize xchg{32,64} on loong64
Use Loong64's atomic operation instruction AMSWAPDB{W,V} (full barrier)
to implement atomic.Xchg{32,64}

goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A5000 @ 2500.00MHz
           |  old.bench    |  new.bench                          |
           |  sec/op       |  sec/op        vs base              |
Xchg          26.44n ± 0%     12.01n ± 0%   -54.58% (p=0.000 n=20)
Xchg-2        30.10n ± 0%     25.58n ± 0%   -15.02% (p=0.000 n=20)
Xchg-4        30.06n ± 0%     24.82n ± 0%   -17.43% (p=0.000 n=20)
Xchg64        26.44n ± 0%     12.02n ± 0%   -54.54% (p=0.000 n=20)
Xchg64-2      30.10n ± 0%     25.57n ± 0%   -15.05% (p=0.000 n=20)
Xchg64-4      30.05n ± 0%     24.80n ± 0%   -17.47% (p=0.000 n=20)
geomean       28.81n          19.68n        -31.69%

goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A6000 @ 2500.00MHz
           |  old.bench    |  new.bench                          |
           |  sec/op       |  sec/op        vs base              |
Xchg          25.62n ± 0%     12.41n ± 0%  -51.56% (p=0.000 n=20)
Xchg-2        35.01n ± 0%     20.59n ± 0%  -41.19% (p=0.000 n=20)
Xchg-4        34.63n ± 0%     19.59n ± 0%  -43.42% (p=0.000 n=20)
Xchg64        25.62n ± 0%     12.41n ± 0%  -51.56% (p=0.000 n=20)
Xchg64-2      35.01n ± 0%     20.59n ± 0%  -41.19% (p=0.000 n=20)
Xchg64-4      34.67n ± 0%     19.59n ± 0%  -43.50% (p=0.000 n=20)
geomean       31.44n          17.11n       -45.59%

Updates #59120.

Change-Id: Ied74fc20338b63799c6d6eeb122c31b42cff0f7e
Reviewed-on: https://go-review.googlesource.com/c/go/+/481578
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: WANG Xuerui <git@xen0n.name>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
2024-11-11 00:07:51 +00:00
Youlin Feng
5123f38e05 cmd/compile: update comment for initLimit in prove pass
For: #70156

Change-Id: Ie39a88130f27b4b210ddbcf396cc0ddd2713d58b
Reviewed-on: https://go-review.googlesource.com/c/go/+/624855
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com>
2024-11-08 23:59:51 +00:00
Tim King
e67c0f0c8f cmd/compile/internal/noder: replace recompile library error messages
Replaces 'recompile library' error messages with the more accurate
'recompile package' globally.

Change-Id: I7247964c76f1fcb94feda37c78bdfb8a1b1a6492
Reviewed-on: https://go-review.googlesource.com/c/go/+/626696
Reviewed-by: Alan Donovan <adonovan@google.com>
Commit-Queue: Tim King <taking@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-11-08 21:22:55 +00:00
Ian Lance Taylor
99253ea4f4 cmd/internal/goobj: regenerate builtinlist
CL 622042 added rand as a compiler builtin, but did not update builtinlist.

Also update the mkbuiltin comment to refer to the current file location,
and add a comment for runtime.rand that it is called from the compiler.

For #54766

Change-Id: I99d2c0bb0658da333775afe2ed0447265c845c82
Reviewed-on: https://go-review.googlesource.com/c/go/+/626755
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
2024-11-08 21:16:36 +00:00
Tim King
b0bbfb1e0f cmd/compile/internal/importer: drop support for indexed format
Drop support for the indexed format from the test-only Import
function.

Adds several TODOs for further tech debt reduction.

Change-Id: I45cc5ffce43082a145ccb918face067cdccc5ecd
Reviewed-on: https://go-review.googlesource.com/c/go/+/626695
Commit-Queue: Tim King <taking@google.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Robert Findley <rfindley@google.com>
2024-11-08 20:13:40 +00:00
Emmanuel T Odeke
64e7f66b26 encoding/json, text/template: use reflect.Value.Equal instead of ==
This change applies a fix for a reflect.Value incorrect comparison
using "==" or reflect.DeepEqual.
This change is a precursor to the change that'll bring in the
static analyzer "reflectvaluecompare", by ensuring that all tests
pass beforehand.

Updates #43993

Change-Id: I6c47eb0a1de6353ac7495cb8cb49b318b7ebba56
Reviewed-on: https://go-review.googlesource.com/c/go/+/626116
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Run-TryBot: Emmanuel Odeke <emmanuel@orijtech.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2024-11-08 16:09:21 +00:00
Xiaolin Zhao
2b33434287 cmd/asm: use single-instruction forms for all loong64 sign and zero extensions
8-bit and 16-bit sign extensions and 32-bit zero extensions were realized
with left and right shifts before this change. We now support assembling
EXTWB, EXTWH and BSTRPICKV, so all three can be done with a single insn
respectively.

This patch is a copy of CL 479496.
Co-authored-by: WANG Xuerui <git@xen0n.name>

Change-Id: Iee5741dd9ebb25746f51008f3f6c86704339d615
Reviewed-on: https://go-review.googlesource.com/c/go/+/626195
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-11-08 01:06:04 +00:00
Xiaolin Zhao
e6cc9d228a cmd/compile: implement FMA codegen for loong64
Benchmark results on Loongson 3A5000 and 3A6000:

goos: linux
goarch: loong64
pkg: math
cpu: Loongson-3A6000 @ 2500.00MHz
    |  bench.old   |              bench.new              |
    |    sec/op    |   sec/op     vs base                |
FMA   25.930n ± 0%   2.002n ± 0%  -92.28% (p=0.000 n=10)

goos: linux
goarch: loong64
pkg: math
cpu: Loongson-3A5000 @ 2500.00MHz
    |  bench.old   |              bench.new              |
    |    sec/op    |   sec/op     vs base                |
FMA   32.840n ± 0%   2.002n ± 0%  -93.90% (p=0.000 n=10)

Updates #59120

This patch is a copy of CL 483355.
Co-authored-by: WANG Xuerui <git@xen0n.name>

Change-Id: I88b89d23f00864f9173a182a47ee135afec7ed6e
Reviewed-on: https://go-review.googlesource.com/c/go/+/625335
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
2024-11-08 01:05:48 +00:00
Guoqi Chen
2751443e92 cmd/internal/obj/loong64: add {V,XV}PCNT.{B,H,W,D} instructions support
Go asm syntax:
          VPCNT{B,H,W,V}  VJ, VD
         XVPCNT{B,H,W,V}  XJ, XD

Equivalent platform assembler syntax:
          vpcnt.{b,w,h,d}  vd, vj
         xvpcnt.{b,w,h,d}  xd, xj

Change-Id: Icec4446b1925745bc3a0bc3f6397d862953b9098
Reviewed-on: https://go-review.googlesource.com/c/go/+/620736
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
2024-11-08 01:05:00 +00:00
Guoqi Chen
e534989d18 cmd/compile/internal: intrinsify publicationBarrier on loong64
The publication barrier is a StoreStore barrier, which is implemented
by "DBAR 0x1A" [1] on loong64.

goos: linux
goarch: loong64
pkg: runtime
cpu: Loongson-3A6000 @ 2500.00MHz
                     |   bench.old   |  bench.new                            |
                     |    sec/op     |   sec/op        vs base               |
Malloc8                 31.76n ± 0%     22.79n ± 0%   -28.24% (p=0.000 n=20)
Malloc8-2               25.46n ± 0%     18.33n ± 0%   -28.00% (p=0.000 n=20)
Malloc8-4               25.75n ± 0%     18.43n ± 0%   -28.41% (p=0.000 n=20)
Malloc16                62.97n ± 0%     42.41n ± 0%   -32.65% (p=0.000 n=20)
Malloc16-2              49.11n ± 0%     31.68n ± 0%   -35.50% (p=0.000 n=20)
Malloc16-4              49.64n ± 1%     31.95n ± 0%   -35.62% (p=0.000 n=20)
MallocTypeInfo8         58.57n ± 0%     46.51n ± 0%   -20.61% (p=0.000 n=20)
MallocTypeInfo8-2       51.43n ± 0%     38.01n ± 0%   -26.09% (p=0.000 n=20)
MallocTypeInfo8-4       51.65n ± 0%     38.15n ± 0%   -26.13% (p=0.000 n=20)
MallocTypeInfo16        68.07n ± 0%     51.62n ± 0%   -24.17% (p=0.000 n=20)
MallocTypeInfo16-2      54.73n ± 0%     41.13n ± 0%   -24.85% (p=0.000 n=20)
MallocTypeInfo16-4      55.05n ± 0%     41.28n ± 0%   -25.02% (p=0.000 n=20)
MallocLargeStruct       491.5n ± 0%     454.8n ± 0%    -7.47% (p=0.000 n=20)
MallocLargeStruct-2     351.8n ± 1%     323.8n ± 0%    -7.94% (p=0.000 n=20)
MallocLargeStruct-4     333.6n ± 0%     316.7n ± 0%    -5.10% (p=0.000 n=20)
geomean                 71.01n          53.78n        -24.26%

[1]: https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html

Change-Id: Ica0c89db6f2bebd55d9b3207a1c462a9454e9268
Reviewed-on: https://go-review.googlesource.com/c/go/+/577515
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: Carlos Amedee <carlos@golang.org>
2024-11-08 01:04:43 +00:00
Guoqi Chen
4f7af5d192 cmd/compiler,internal/runtime/atomic: optimize xadd{32,64} on loong64
Use Loong64's atomic operation instruction AMADDDB{W,V} (full barrier)
to implement atomic.Xadd{32,64}

goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A5000 @ 2500.00MHz
          |  bench.old    |  bench.new                            |
          |  sec/op       |  sec/op          vs base              |
Xadd         27.24n ± 0%     12.01n ± 0%    -55.91% (p=0.000 n=20)
Xadd-2       31.93n ± 0%     25.55n ± 0%    -19.98% (p=0.000 n=20)
Xadd-4       31.90n ± 0%     24.80n ± 0%    -22.26% (p=0.000 n=20)
Xadd64       27.23n ± 0%     12.01n ± 0%    -55.89% (p=0.000 n=20)
Xadd64-2     31.93n ± 0%     25.57n ± 0%    -19.90% (p=0.000 n=20)
Xadd64-4     31.89n ± 0%     24.80n ± 0%    -22.23% (p=0.000 n=20)
geomean      30.27n          19.67n         -35.01%

goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A6000 @ 2500.00MHz
          |  bench.old    |  bench.new                           |
          |  sec/op       |  sec/op         vs base              |
Xadd         26.02n ± 0%     12.41n ± 0%   -52.31% (p=0.000 n=20)
Xadd-2       37.36n ± 0%     20.60n ± 0%   -44.86% (p=0.000 n=20)
Xadd-4       37.22n ± 0%     19.59n ± 0%   -47.37% (p=0.000 n=20)
Xadd64       26.42n ± 0%     12.41n ± 0%   -53.03% (p=0.000 n=20)
Xadd64-2     37.77n ± 0%     20.60n ± 0%   -45.46% (p=0.000 n=20)
Xadd64-4     37.78n ± 0%     19.59n ± 0%   -48.15% (p=0.000 n=20)
geomean      33.30n          17.11n        -48.62%

Change-Id: I982539c2aa04680e9dd11b099ba8d5f215bf9b32
Reviewed-on: https://go-review.googlesource.com/c/go/+/481937
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
2024-11-08 01:04:28 +00:00
changwang ma
acad0c2e9a cmd/go/internal/lockedfile: fix function name in error message for test
Change-Id: I1477c6249196dba58908ff8cc881914bf602ddd8
Reviewed-on: https://go-review.googlesource.com/c/go/+/622615
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Matloob <matloob@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Sam Thanawalla <samthanawalla@google.com>
Auto-Submit: Sam Thanawalla <samthanawalla@google.com>
2024-11-07 19:57:37 +00:00
Cherry Mui
bccd686f9d runtime/cgo: use pthread_getattr_np on Android
It is defined in bionic libc since at least API level 3. Use it.

Updates #68285.

Change-Id: I215c2d61d5612e7c0298b2cb69875690f8fbea66
Reviewed-on: https://go-review.googlesource.com/c/go/+/626275
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2024-11-07 19:31:01 +00:00
Felix Geisendörfer
411ba0ae86 runtime/pprof: add label benchmark
Add several benchmarks for pprof labels to analyze the impact of
follow-up CLs.

Change-Id: Ifae39cfe83ec93858fce9e3af6c1be024ba76736
Reviewed-on: https://go-review.googlesource.com/c/go/+/574515
Auto-Submit: Ian Lance Taylor <iant@golang.org>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2024-11-07 19:24:54 +00:00
Russ Cox
4582f239c3 cmd/internal/objabi, cmd/link: introduce SymKind helper methods
These will be necessary when we start using the new FIPS symbols.
Split into a separate CL so that these refactoring changes can be
tested separate from any FIPS-specific changes.

Passes golang.org/x/tools/cmd/toolstash/buildall.

Change-Id: I73e5873fcb677f1f572f0668b4dc6f3951d822bc
Reviewed-on: https://go-review.googlesource.com/c/go/+/625996
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Russ Cox <rsc@golang.org>
2024-11-07 17:47:42 +00:00
Russ Cox
43f889b9e5 cmd/internal/objabi, cmd/link: add FIPS symbol kinds
Add FIPS symbol kinds that will be needed for FIPS support.
This is a separate CL to keep the re-generated changes in
the string methods separate from hand-written changes.

The separate symbol kinds will let us group the FIPS-related
code and data together, so that it can be checksummed at
startup, as required by FIPS.

It's also separate because it breaks buildall, by changing the
on-disk symbol kind enumeration. We want non-buildall
changes to be as simple as possible.

For #69536.

Change-Id: I2d5a238498929fff8b24736ee54330c17323bd86
Reviewed-on: https://go-review.googlesource.com/c/go/+/625995
Auto-Submit: Russ Cox <rsc@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-11-07 17:47:38 +00:00
benbaker76
2e97c30d8d debug/elf: add SHT_GNU_VERDEF section parsing
Fixes #63952

Change-Id: Icf93e57e62243d9c3306d4e1c5dadb3f62747710
GitHub-Last-Rev: 5c29527600
GitHub-Pull-Request: golang/go#69850
Reviewed-on: https://go-review.googlesource.com/c/go/+/619077
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-11-07 15:23:24 +00:00
Keith Randall
fc5e8f2f6b runtime/race: treat map concurrent access detection as a race detector hit
Sometimes the runtime realizes there is a race before the race detector does.
Maybe that's a bug in the race detector? But we should probably handle it.

Update #70164
(Fixes? I'm not sure.)

Change-Id: Ie7e8bf2b06701368e0551b4a1aa40f6746bbddd8
Reviewed-on: https://go-review.googlesource.com/c/go/+/626036
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-11-07 15:14:49 +00:00
Russ Cox
daa6c9310e cmd/link: remove dummy argument from ld.Errorf
As the comment notes, all calls to Errorf now pass nil,
so remove that argument entirely.

There is a TODO to remove uses of Errorf entirely, but
that seems wrong: sometimes there is no symbol on
which to report the error, and in that situation, Errorf is
appropriate. So clarify that in the docs.

Change-Id: I92b3b6e8e3f61ba8356ace8cd09573d0b55d7869
Reviewed-on: https://go-review.googlesource.com/c/go/+/625617
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-11-07 12:17:55 +00:00
Russ Cox
5b20eec8a0 cmd/internal/obj: replace obj.Addrel func with LSym.AddRel method
The old API was to do

	r := obj.AddRel(sym)
	r.Type = this
	r.Off = that
	etc

The new API is:

	sym.AddRel(ctxt, obj.Reloc{Type: this: Off: that, etc})

This new API is more idiomatic and avoids ever having relocations
that are only partially constructed. Most importantly, it sets up
for sym.AddRel being able to check relocation validity in the future.
(Passing ctxt is for use in validity checking.)

Passes golang.org/x/tools/cmd/toolstash/buildall.

Change-Id: I042ea76e61bb3bf6402f98ca11291a13f4799972
Reviewed-on: https://go-review.googlesource.com/c/go/+/625616
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-11-07 12:17:10 +00:00
Guoqi Chen
4ce8c0604e cmd/internal/obj/loong64: add {V,XV}SEQ.{B,H,W,D} instructions support
Go asm syntax:
         VSEQ{B,H,W,V}  VJ, VK, VD
        XVSEQ{B,H,W,V}  XJ, XK, XD

Equivalent platform assembler syntax:
         vseq.{b,w,h,d}  vd, vj, vk
        xvseq.{b,w,h,d}  xd, xj, xk

Change-Id: Ia87277b12c817ebc41a46f4c3d09f4b76995ff2f
Reviewed-on: https://go-review.googlesource.com/c/go/+/616076
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-11-07 02:20:26 +00:00
Guoqi Chen
751a817ccc cmd/internal/obj/loong64: add {V,XV}LD/{V,XV}LDX/{V,XV}ST/{V,XV}STX instructions support
This CL adding primitive asm support of Loong64 LSX [1] and LASX [2], by introducing new
sets of register V0-V31 (C_VREG), X0-X31 (C_XREG) and 8 new instructions.

On Loong64, VLD,XVLD,VST,XVST implement vector memory access operations using immediate
values offset. VLDX, XVLDX, VSTX, XVSTX implement vector memory access operations using
register offset.

Go asm syntax:
        VMOVQ           n(RJ), RV      (128bit vector load)
        XVMOVQ          n(RJ), RX      (256bit vector load)
        VMOVQ           RV, n(RJ)      (128bit vector store)
        XVMOVQ          RX, n(RJ)      (256bit vector store)

        VMOVQ           (RJ)(RK), RV   (128bit vector load)
        XVMOVQ          (RJ)(RK), RX   (256bit vector load)
        VMOVQ           RV, (RJ)(RK)   (128bit vector store)
        XVMOVQ          RX, (RJ)(RK)   (256bit vector store)

Equivalent platform assembler syntax:
         vld            vd, rj, si12
        xvld            xd, rj, si12
         vst            vd, rj, si12
        xvst            xd, rj, si12
         vldx           vd, rj, rk
        xvldx           xd, rj, rk
         vstx           vd, rj, rk
        xvstx           xd, rj, rk

[1]: LSX: Loongson SIMD Extension, 128bit
[2]: LASX: Loongson Advanced SIMD Extension, 256bit

Change-Id: Ibaf5ddfd29b77670c3c44cc32bead36b2c8b8003
Reviewed-on: https://go-review.googlesource.com/c/go/+/616075
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-11-07 02:20:14 +00:00
Guoqi Chen
ac345fb7e7 cmd/compiler,internal/runtime/atomic: optimize Store{64,32,8} on loong64
On Loong64, AMSWAPDB{W,V} instructions are supported by default, and AMSWAPDB{B,H} [1]
is a new instruction added by LA664(Loongson 3A6000) and later microarchitectures.
Therefore, AMSWAPDB{W,V} (full barrier) is used to implement AtomicStore{32,64}, and
the traditional MOVB or the new AMSWAPDBB is used to implement AtomicStore8 according
to the CPU feature.

The StoreRelease barrier on Loong64 is "dbar 0x12", but it is still necessary to
ensure consistency in the order of Store/Load [2].

LoweredAtomicStorezero{32,64} was removed because on loong64 the constant "0" uses
the R0 register, and there is no performance difference between the implementations
of LoweredAtomicStorezero{32,64} and LoweredAtomicStore{32,64}.

goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A5000-HV @ 2500.00MHz
                |  bench.old  |              bench.new              |
                |   sec/op    |   sec/op     vs base                |
AtomicStore64     19.61n ± 0%   13.61n ± 0%  -30.60% (p=0.000 n=20)
AtomicStore64-2   19.61n ± 0%   13.61n ± 0%  -30.57% (p=0.000 n=20)
AtomicStore64-4   19.62n ± 0%   13.61n ± 0%  -30.63% (p=0.000 n=20)
AtomicStore       19.61n ± 0%   13.61n ± 0%  -30.60% (p=0.000 n=20)
AtomicStore-2     19.62n ± 0%   13.61n ± 0%  -30.63% (p=0.000 n=20)
AtomicStore-4     19.62n ± 0%   13.62n ± 0%  -30.58% (p=0.000 n=20)
AtomicStore8      19.61n ± 0%   20.01n ± 0%   +2.04% (p=0.000 n=20)
AtomicStore8-2    19.62n ± 0%   20.02n ± 0%   +2.01% (p=0.000 n=20)
AtomicStore8-4    19.61n ± 0%   20.02n ± 0%   +2.09% (p=0.000 n=20)
geomean           19.61n        15.48n       -21.08%

goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A6000 @ 2500.00MHz
                |  bench.old  |              bench.new              |
                |   sec/op    |   sec/op     vs base                |
AtomicStore64     18.03n ± 0%   12.81n ± 0%  -28.93% (p=0.000 n=20)
AtomicStore64-2   18.02n ± 0%   12.81n ± 0%  -28.91% (p=0.000 n=20)
AtomicStore64-4   18.01n ± 0%   12.81n ± 0%  -28.87% (p=0.000 n=20)
AtomicStore       18.02n ± 0%   12.81n ± 0%  -28.91% (p=0.000 n=20)
AtomicStore-2     18.01n ± 0%   12.81n ± 0%  -28.87% (p=0.000 n=20)
AtomicStore-4     18.01n ± 0%   12.81n ± 0%  -28.87% (p=0.000 n=20)
AtomicStore8      18.01n ± 0%   12.81n ± 0%  -28.87% (p=0.000 n=20)
AtomicStore8-2    18.01n ± 0%   12.81n ± 0%  -28.87% (p=0.000 n=20)
AtomicStore8-4    18.01n ± 0%   12.81n ± 0%  -28.87% (p=0.000 n=20)
geomean           18.01n        12.81n       -28.89%

[1]: https://loongson.github.io/LoongArch-Documentation/LoongArch-ELF-ABI-EN.html
[2]: https://gcc.gnu.org/git/?p=gcc.git;a=blob_plain;f=gcc/config/loongarch/sync.md

Change-Id: I4ae5e8dd0e6f026129b6e503990a763ed40c6097
Reviewed-on: https://go-review.googlesource.com/c/go/+/581356
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
2024-11-07 02:19:55 +00:00
Nigel Tao
9088883cf4 image/jpeg: initialize dct_test constants at compile time
Doing so is slightly more accurate than calculating at run time (because
of float64 rounding errors): https://go.dev/play/p/hrOzHDLjd5K

Having these more accurate values isn't necessary for tests to pass, but
it's helpful if doing printf-debugging or stepping through the code.

Change-Id: I07a65678936e4db05b11f9d8d952b32b2acd51a8
Reviewed-on: https://go-review.googlesource.com/c/go/+/624716
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Nigel Tao <nigeltao@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-11-06 21:57:39 +00:00
Srinivas Pokala
eb29beb0ad cmd/objdump: add s390x plan9 disasm support
This CL provides vendor support for s390x disassembler plan9 syntax.

cd $GOROOT/src/cmd
go get golang.org/x/arch@master
go mod tidy
go mod vendor

For #15255

Change-Id: I20c87510a1aee2d1cf2df58feb535974c4c0e3ef
Reviewed-on: https://go-review.googlesource.com/c/go/+/623075
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Vishwanatha HD <vishwanatha.hd@ibm.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-11-06 21:43:43 +00:00
Damien Neil
23493579ea net/http: 308 redirects should use the previous hop's body
On a 301 redirect, the HTTP client changes the request to be
a GET with no body.

On a 308 redirect, the client leaves the request method and
body unchanged.

A 308 following a 301 should preserve the rewritten request
from the first redirect: GET with no body. We were preserving
the method, but sending the original body. Fix this.

Fixes #70180

Change-Id: Ie20027a6058a82bfdffc7197d07ac6c7f98099e2
Reviewed-on: https://go-review.googlesource.com/c/go/+/626055
Reviewed-by: Jonathan Amsterdam <jba@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-11-06 21:01:09 +00:00
Filippo Valsorda
2c7b5ba8ca crypto/internal/fips: fix Avo generators
They needed their package names updated after packages were moved to
crypto/internal/fips. Also, mitigated mmcloughlin/avo#450 which would
require setting GOARCH=amd64 at generation time.

Change-Id: Ib903ef113ebb5a24844204f231f2507cea03a67e
Reviewed-on: https://go-review.googlesource.com/c/go/+/626075
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Russ Cox <rsc@golang.org>
Auto-Submit: Filippo Valsorda <filippo@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2024-11-06 20:07:30 +00:00
Russ Cox
840ac5e037 context: listen on localhost in example
Listening on ":0" triggers a Mac firewall box while the test runs.

Change-Id: Ie6f8eb07eb76ea222f43bc40b1c30645294bc239
Reviewed-on: https://go-review.googlesource.com/c/go/+/625975
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
2024-11-06 18:10:22 +00:00
Russ Cox
72801623cb cmd/link/internal/ld: fix sort comparison
Strictly speaking, the sort comparison was inconsistent
(and therefore invalid) for the sort-by-name case, if you had

	a size 0
	b size 1
	c size 0
	zerobase

That would result in the inconsistent comparison ordering:

	a < b (by name)
	b < c (by name)
	c < zerobase (by zerobase rule)
	zerobase < b (by zerobase rule)

This can't happen today because we only disable size-based
sort in a segment that has no zerobase symbol, but it's
confusing to reason through that, so clean up the code anyway.

Passes golang.org/x/tools/cmd/toolstash/buildall.

Change-Id: I21e4159cdedd2053952ba960530d1b0f28c6fb24
Reviewed-on: https://go-review.googlesource.com/c/go/+/625615
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-11-06 17:34:44 +00:00
qmuntal
7fff741016 syscall: mark SyscallN as noescape
syscall.SyscallN is implemented by runtime.syscall_syscalln, which makes
sure that the variadic argument doesn't escape.

There is no need to worry about the lifetime of the elements of the
variadic argument, as the compiler will keep them live until the
function returns.

Fixes #70197.

Cq-Include-Trybots: luci.golang.try:gotip-windows-amd64-longtest,gotip-windows-amd64-race
Change-Id: I12991f0be12062eea68f2b103fa0a794c1b527eb
Reviewed-on: https://go-review.googlesource.com/c/go/+/625297
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-11-06 16:35:17 +00:00
Damien Neil
067d58b534 net/http: handle new HTTP/2 error for 1xx limit exceeded
CL 615295 changed the error message produced by the HTTP/2
implementation when a server sends more 1xx headers than expected.
Update a test that checks for this error.

For #65035

Change-Id: I57e22f6a880412e3a448e58693127540806d5ddb
Reviewed-on: https://go-review.googlesource.com/c/go/+/625195
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-11-06 16:12:22 +00:00
Xiaolin Zhao
d6fb0ab2c7 cmd/compile: wire up Bswap/ReverseBytes intrinsics for loong64
Micro-benchmark results on Loongson 3A5000 and 3A6000:

goos: linux
goarch: loong64
pkg: math/bits
cpu: Loongson-3A6000 @ 2500.00MHz
               |  bench.old   |              bench.new               |
               |    sec/op    |    sec/op     vs base                |
ReverseBytes     2.0020n ± 0%   0.4040n ± 0%  -79.82% (p=0.000 n=20)
ReverseBytes16   0.8866n ± 1%   0.8007n ± 0%   -9.69% (p=0.000 n=20)
ReverseBytes32   1.2195n ± 0%   0.8007n ± 0%  -34.34% (p=0.000 n=20)
ReverseBytes64   2.0705n ± 0%   0.8008n ± 0%  -61.32% (p=0.000 n=20)
geomean           1.455n        0.6749n       -53.62%

goos: linux
goarch: loong64
pkg: math/bits
cpu: Loongson-3A5000 @ 2500.00MHz
               |  bench.old   |              bench.new               |
               |    sec/op    |    sec/op     vs base                |
ReverseBytes     2.8040n ± 0%   0.5205n ± 0%  -81.44% (p=0.000 n=20)
ReverseBytes16   0.7066n ± 0%   0.8011n ± 0%  +13.37% (p=0.000 n=20)
ReverseBytes32   1.5500n ± 0%   0.8010n ± 0%  -48.32% (p=0.000 n=20)
ReverseBytes64   2.7665n ± 0%   0.8010n ± 0%  -71.05% (p=0.000 n=20)
geomean           1.707n        0.7192n       -57.87%

Updates #59120

This patch is a copy of CL 483357.
Co-authored-by: WANG Xuerui <git@xen0n.name>

Change-Id: If355354cd031533df91991fcc3392e5a6c314295
Reviewed-on: https://go-review.googlesource.com/c/go/+/624576
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
2024-11-06 03:12:50 +00:00
Xiaolin Zhao
d98c51809d cmd/compile: wire up math/bits.Len intrinsics for loong64
For the SubFromLen64 codegen test case to work as intended, we need
to fold c-(-(x-d)) into x+(c-d).

Still, some instances of LeadingZeros are not optimized into single
CLZ instructions right now (actually, the LeadingZeros micro-benchmarks
are currently still compiled with redundant adds/subs of 64, due to
interference of loop optimizations before lowering), but perf numbers
indicate it's not that bad after all.

Micro-benchmark results on Loongson 3A5000 and 3A6000:

goos: linux
goarch: loong64
pkg: math/bits
cpu: Loongson-3A5000 @ 2500.00MHz
               |  bench.old  |              bench.new              |
               |   sec/op    |   sec/op     vs base                |
LeadingZeros     3.660n ± 0%   1.348n ± 0%  -63.17% (p=0.000 n=20)
LeadingZeros8    1.777n ± 0%   1.767n ± 0%   -0.56% (p=0.000 n=20)
LeadingZeros16   2.816n ± 0%   1.770n ± 0%  -37.14% (p=0.000 n=20)
LeadingZeros32   5.293n ± 1%   1.683n ± 0%  -68.21% (p=0.000 n=20)
LeadingZeros64   3.622n ± 0%   1.349n ± 0%  -62.76% (p=0.000 n=20)
geomean          3.229n        1.571n       -51.35%

goos: linux
goarch: loong64
pkg: math/bits
cpu: Loongson-3A6000 @ 2500.00MHz
               |  bench.old   |              bench.new               |
               |    sec/op    |    sec/op     vs base                |
LeadingZeros      2.410n ± 0%    1.103n ± 1%  -54.23% (p=0.000 n=20)
LeadingZeros8     1.236n ± 0%    1.501n ± 0%  +21.44% (p=0.000 n=20)
LeadingZeros16    2.106n ± 0%    1.501n ± 0%  -28.73% (p=0.000 n=20)
LeadingZeros32    2.860n ± 0%    1.324n ± 0%  -53.72% (p=0.000 n=20)
LeadingZeros64   2.6135n ± 0%   0.9509n ± 0%  -63.62% (p=0.000 n=20)
geomean           2.159n         1.256n       -41.81%

Updates #59120

This patch is a copy of CL 483356.
Co-authored-by: WANG Xuerui <git@xen0n.name>

Change-Id: Iee81a17f7da06d77a427e73dfcc016f2b15ae556
Reviewed-on: https://go-review.googlesource.com/c/go/+/624575
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
2024-11-06 00:40:40 +00:00
Dmitri Shuralyov
671f2841cb all: update golang.org/x/sys [generated]
A part of the keeping Go's vendored dependencies and generated code
up to date.

For #36905.

[git-generate]
cd src
go get golang.org/x/sys@v0.26.1-0.20241105152852-e0753d469443
go mod tidy
go mod vendor
cd cmd
go get golang.org/x/sys@v0.26.1-0.20241105152852-e0753d469443
go mod tidy
go mod vendor
go generate syscall internal/syscall/...

Change-Id: Ia84505f8934399f7c4518c6218892b81d30e5c17
Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest
Reviewed-on: https://go-review.googlesource.com/c/go/+/623821
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
TryBot-Bypass: Dmitri Shuralyov <dmitshur@golang.org>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
2024-11-05 23:51:47 +00:00
Damien Neil
bfc8f28068 net/http: add Protocols field to Server and Transport
Support configuring which HTTP version(s) a server or client use
via an explicit set of protocols. The Protocols field takes
precedence over TLSNextProto and ForceAttemptHTTP2.

Fixes #67814

Change-Id: I09ece88f78ad4d98ca1f213157b5f62ae11e063f
Reviewed-on: https://go-review.googlesource.com/c/go/+/607496
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Jonathan Amsterdam <jba@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2024-11-05 22:14:59 +00:00
Sam Thanawalla
635c2dce04 cmd/go: add built in git mode for GOAUTH
This CL adds support for git as a valid GOAUTH command.
Improves on implementation in cmd/auth/gitauth/gitauth.go
This follows the proposed design in
https://golang.org/issues/26232#issuecomment-461525141

For #26232
Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest,gotip-windows-amd64-longtest
Change-Id: I07810d9dc895d59e5db4bfa50cd42cb909208f93
Reviewed-on: https://go-review.googlesource.com/c/go/+/605275
Reviewed-by: Damien Neil <dneil@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Matloob <matloob@golang.org>
Reviewed-by: Alan Donovan <adonovan@google.com>
2024-11-05 20:08:39 +00:00
Oleksandr Redko
3b94c357f8 io: simplify tests by removing redundant statements
Change-Id: I4bcaa6b42571626c88e3374c328bbfe993476242
Reviewed-on: https://go-review.googlesource.com/c/go/+/625295
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Commit-Queue: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-11-05 19:52:23 +00:00
Russ Cox
1e740c7669 cmd/compile: fix an internal crash in embed
Observed in the telemetry data. Was causing truncated error outputs.

Change-Id: I9f0a86e1e6caa855f97a3d6e51328c4c9685c937
Reviewed-on: https://go-review.googlesource.com/c/go/+/623535
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Robert Griesemer <gri@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Russ Cox <rsc@golang.org>
2024-11-05 18:54:20 +00:00
Ian Lance Taylor
d92e8fe25c io/fs: clarify that "." may only be used for root
For #70155

Change-Id: I648791c484e19bb12c6e4f84e2dc42eaaa4db546
Reviewed-on: https://go-review.googlesource.com/c/go/+/624595
Reviewed-by: Damien Neil <dneil@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2024-11-05 18:06:01 +00:00
Markus
dacf253afa net/internal/cgotest: don't try to use cgo with netgo build tag
When using bazel with hermetic_cc_toolchain resolv.h is not available.

Change-Id: I2aed72e6c14535cb1400b30d285bf05aa2498fde
GitHub-Last-Rev: 818c72323d
GitHub-Pull-Request: golang/go#70141
Reviewed-on: https://go-review.googlesource.com/c/go/+/623816
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Mateusz Poliwczak <mpoliwczak34@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: David Chase <drchase@google.com>
2024-11-05 17:10:13 +00:00
Kristóf Havasi
293a205f3c internal/platform: fix 'reportsr' typo in comment
Gets rendered at https://pkg.go.dev/internal/platform#Broken

Change-Id: I128d9f02f113b1b326bc7d6a0e48fe0c944546dc
Reviewed-on: https://go-review.googlesource.com/c/go/+/624915
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-11-05 17:09:38 +00:00
cuishuang
08d2403576 time: add examples for Since, Until, Abs and fix some comments
Change-Id: I33b61629dfabffa15065a14fccdb418bab11350d
Reviewed-on: https://go-review.googlesource.com/c/go/+/623915
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: David Chase <drchase@google.com>
2024-11-05 17:09:35 +00:00
Youlin Feng
cb163ff60b cmd/compile: init limit for newly created value in prove pass
Fixes: #70156

Change-Id: I2e5dc2a39a8e54ec5f18c5f9d1644208cffb2e9a
Reviewed-on: https://go-review.googlesource.com/c/go/+/624695
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
2024-11-05 16:55:14 +00:00
qiulaidongfeng
4f092a9f34 cmd/go: fix typo in ExtraEnvVarsCostly
For #69994

Change-Id: I7db39074f6a055efb29c1cbd0db4c286864a5da6
Reviewed-on: https://go-review.googlesource.com/c/go/+/621996
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: bcd a <letmetellsomediff@gmail.com>
Reviewed-by: Sam Thanawalla <samthanawalla@google.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Auto-Submit: Sam Thanawalla <samthanawalla@google.com>
2024-11-05 15:59:15 +00:00
Youlin Feng
0140aae6d0 cmd/compile: optimize Ctz64 on 386
Compared with the version generated by dec64.rules based on Ctz32,
the number of assembly instructions is reduced by half.

SwissMap uses TrailingZeros64 to find the first match in its control
group and may benefit from this CL on 386 architectures.

goos: linux
goarch: 386
cpu: 13th Gen Intel(R) Core(TM) i7-13700H
                   │   old.txt    │               new.txt                │
                   │    sec/op    │    sec/op     vs base                │
TrailingZeros64-20   0.8828n ± 1%   0.6299n ± 1%  -28.65% (p=0.000 n=20)

Change-Id: Iba08a3f4e13efd3349715dfb7fcd5fd470286cd3
Reviewed-on: https://go-review.googlesource.com/c/go/+/624376
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
2024-11-05 15:30:57 +00:00
Dmitri Shuralyov
bea9b91f0f time: accept "+01" in TestLoadFixed on OpenBSD
This stops the test from failing with a known failure mode, and
creates time to look into what the next steps should be, if any.

For #69840.

Change-Id: I060903d256ed65c5dfcd70ae76eb361cab63186f
Cq-Include-Trybots: luci.golang.try:gotip-openbsd-amd64
Reviewed-on: https://go-review.googlesource.com/c/go/+/625197
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Eric Grosse <grosse@gmail.com>
2024-11-05 01:16:03 +00:00
Xiaolin Zhao
5f88755f43 cmd/compile: add loong64-specific inlining for runtime.memmove
goos: linux
goarch: loong64
pkg: runtime
cpu: Loongson-3A6000 @ 2500.00MHz
                                 |   bench.old   |               bench.new                |
                                 |    sec/op     |    sec/op     vs base                  |
Memmove/0                          0.8004n ±  0%   0.4002n ± 0%  -50.00% (p=0.000 n=20)
Memmove/1                           2.494n ±  0%    2.136n ± 0%  -14.35% (p=0.000 n=20)
Memmove/2                           2.802n ±  0%    2.512n ± 0%  -10.35% (p=0.000 n=20)
Memmove/3                           2.802n ±  0%    2.497n ± 0%  -10.92% (p=0.000 n=20)
Memmove/4                           3.202n ±  0%    2.808n ± 0%  -12.30% (p=0.000 n=20)
Memmove/5                           2.821n ±  0%    2.658n ± 0%   -5.76% (p=0.000 n=20)
Memmove/6                           2.819n ±  0%    2.657n ± 0%   -5.73% (p=0.000 n=20)
Memmove/7                           2.820n ±  0%    2.654n ± 0%   -5.87% (p=0.000 n=20)
Memmove/8                           3.202n ±  0%    2.814n ± 0%  -12.12% (p=0.000 n=20)
Memmove/9                           3.202n ±  0%    3.009n ± 0%   -6.03% (p=0.000 n=20)
Memmove/10                          3.202n ±  0%    3.009n ± 0%   -6.03% (p=0.000 n=20)
Memmove/11                          3.202n ±  0%    3.009n ± 0%   -6.03% (p=0.000 n=20)
Memmove/12                          3.202n ±  0%    3.010n ± 0%   -6.01% (p=0.000 n=20)
Memmove/13                          3.202n ±  0%    3.009n ± 0%   -6.03% (p=0.000 n=20)
Memmove/14                          3.202n ±  0%    3.009n ± 0%   -6.03% (p=0.000 n=20)
Memmove/15                          3.202n ±  0%    3.010n ± 0%   -6.01% (p=0.000 n=20)
Memmove/16                          3.202n ±  0%    3.009n ± 0%   -6.03% (p=0.000 n=20)
Memmove/32                          3.602n ±  0%    3.603n ± 0%   +0.03% (p=0.000 n=20)
Memmove/64                          4.202n ±  0%    4.204n ± 0%   +0.05% (p=0.000 n=20)
Memmove/128                         8.005n ±  0%    8.007n ± 0%   +0.02% (p=0.000 n=20)
Memmove/256                         11.21n ±  0%    10.81n ± 0%   -3.57% (p=0.000 n=20)
Memmove/512                         17.65n ±  0%    17.96n ± 0%   +1.73% (p=0.000 n=20)
Memmove/1024                        30.48n ±  0%    30.46n ± 0%   -0.07% (p=0.000 n=20)
Memmove/2048                        56.43n ±  0%    56.30n ± 0%   -0.24% (p=0.000 n=20)
Memmove/4096                        107.7n ±  0%    107.6n ± 0%   -0.09% (p=0.000 n=20)
MemmoveOverlap/32                   4.002n ±  0%    4.003n ± 0%   +0.02% (p=0.002 n=20)
MemmoveOverlap/64                   4.603n ±  0%    4.603n ± 0%        ~ (p=0.286 n=20)
MemmoveOverlap/128                  8.704n ±  0%    8.699n ± 0%        ~ (p=0.180 n=20)
MemmoveOverlap/256                  12.01n ±  0%    11.76n ± 0%   -2.08% (p=0.000 n=20)
MemmoveOverlap/512                  18.42n ±  0%    18.36n ± 0%   -0.33% (p=0.000 n=20)
MemmoveOverlap/1024                 31.23n ±  0%    31.16n ± 0%   -0.21% (p=0.000 n=20)
MemmoveOverlap/2048                 57.42n ±  0%    56.82n ± 0%   -1.04% (p=0.000 n=20)
MemmoveOverlap/4096                 108.5n ±  0%    108.0n ± 0%   -0.46% (p=0.000 n=20)
MemmoveUnalignedDst/0               2.804n ±  0%    2.447n ± 0%  -12.70% (p=0.000 n=20)
MemmoveUnalignedDst/1               2.802n ±  0%    2.491n ± 0%  -11.12% (p=0.000 n=20)
MemmoveUnalignedDst/2               3.202n ±  0%    2.808n ± 0%  -12.29% (p=0.000 n=20)
MemmoveUnalignedDst/3               3.202n ±  0%    2.814n ± 0%  -12.12% (p=0.000 n=20)
MemmoveUnalignedDst/4               3.602n ±  0%    3.202n ± 0%  -11.10% (p=0.000 n=20)
MemmoveUnalignedDst/5               3.202n ±  0%    3.203n ± 0%   +0.03% (p=0.014 n=20)
MemmoveUnalignedDst/6               3.202n ±  0%    3.202n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedDst/7               3.202n ±  0%    3.202n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedDst/8               3.602n ±  0%    3.202n ± 0%  -11.10% (p=0.000 n=20)
MemmoveUnalignedDst/9               3.602n ±  0%    3.602n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedDst/10              3.602n ±  0%    3.602n ± 0%        ~ (p=0.091 n=20)
MemmoveUnalignedDst/11              3.602n ±  0%    3.602n ± 0%        ~ (p=0.613 n=20)
MemmoveUnalignedDst/12              3.602n ±  0%    3.602n ± 0%        ~ (p=0.165 n=20)
MemmoveUnalignedDst/13              3.602n ±  0%    3.602n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedDst/14              3.602n ±  0%    3.602n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedDst/15              3.602n ±  0%    3.602n ± 0%    0.00% (p=0.027 n=20)
MemmoveUnalignedDst/16              3.602n ±  0%    3.602n ± 0%        ~ (p=0.661 n=20)
MemmoveUnalignedDst/32              4.002n ±  0%    4.002n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedDst/64              6.804n ±  0%    6.804n ± 0%        ~ (p=0.204 n=20)
MemmoveUnalignedDst/128             12.61n ±  0%    12.61n ± 0%        ~ (p=1.000 n=20) ¹
MemmoveUnalignedDst/256             16.33n ±  2%    16.32n ± 2%        ~ (p=0.839 n=20)
MemmoveUnalignedDst/512             25.61n ±  0%    24.71n ± 0%   -3.51% (p=0.000 n=20)
MemmoveUnalignedDst/1024            42.81n ±  0%    42.82n ± 0%        ~ (p=0.973 n=20)
MemmoveUnalignedDst/2048            74.86n ±  0%    76.03n ± 0%   +1.56% (p=0.000 n=20)
MemmoveUnalignedDst/4096            152.0n ± 11%    152.0n ± 0%    0.00% (p=0.013 n=20)
MemmoveUnalignedDstOverlap/32       5.319n ±  0%    5.558n ± 1%   +4.50% (p=0.000 n=20)
MemmoveUnalignedDstOverlap/64       8.006n ±  0%    8.025n ± 0%   +0.24% (p=0.000 n=20)
MemmoveUnalignedDstOverlap/128      9.631n ±  0%    9.601n ± 0%   -0.31% (p=0.000 n=20)
MemmoveUnalignedDstOverlap/256      13.79n ±  2%    13.58n ± 1%        ~ (p=0.234 n=20)
MemmoveUnalignedDstOverlap/512      21.38n ±  0%    21.30n ± 0%   -0.37% (p=0.000 n=20)
MemmoveUnalignedDstOverlap/1024     41.71n ±  0%    41.70n ± 0%        ~ (p=0.887 n=20)
MemmoveUnalignedDstOverlap/2048     81.63n ±  0%    81.61n ± 0%        ~ (p=0.481 n=20)
MemmoveUnalignedDstOverlap/4096     162.6n ±  0%    162.6n ± 0%        ~ (p=0.171 n=20)
MemmoveUnalignedSrc/0               2.808n ±  0%    2.482n ± 0%  -11.61% (p=0.000 n=20)
MemmoveUnalignedSrc/1               2.804n ±  0%    2.577n ± 0%   -8.08% (p=0.000 n=20)
MemmoveUnalignedSrc/2               3.202n ±  0%    2.806n ± 0%  -12.37% (p=0.000 n=20)
MemmoveUnalignedSrc/3               3.202n ±  0%    2.808n ± 0%  -12.30% (p=0.000 n=20)
MemmoveUnalignedSrc/4               3.602n ±  0%    3.202n ± 0%  -11.10% (p=0.000 n=20)
MemmoveUnalignedSrc/5               3.202n ±  0%    3.202n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedSrc/6               3.202n ±  0%    3.202n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedSrc/7               3.202n ±  0%    3.202n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedSrc/8               3.602n ±  0%    3.202n ± 0%  -11.10% (p=0.000 n=20)
MemmoveUnalignedSrc/9               3.602n ±  0%    3.602n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedSrc/10              3.602n ±  0%    3.602n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedSrc/11              3.602n ±  0%    3.602n ± 0%        ~ (p=0.746 n=20)
MemmoveUnalignedSrc/12              3.602n ±  0%    3.602n ± 0%        ~ (p=0.407 n=20)
MemmoveUnalignedSrc/13              3.603n ±  0%    3.602n ± 0%   -0.03% (p=0.001 n=20)
MemmoveUnalignedSrc/14              3.603n ±  0%    3.602n ± 0%   -0.01% (p=0.013 n=20)
MemmoveUnalignedSrc/15              3.602n ±  0%    3.602n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedSrc/16              3.602n ±  0%    3.602n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedSrc/32              4.002n ±  0%    4.002n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedSrc/64              4.803n ±  0%    4.803n ± 0%    0.00% (p=0.008 n=20)
MemmoveUnalignedSrc/128             8.405n ±  0%    8.405n ± 0%    0.00% (p=0.003 n=20)
MemmoveUnalignedSrc/256             12.04n ±  3%    12.20n ± 2%        ~ (p=0.151 n=20)
MemmoveUnalignedSrc/512             19.11n ±  0%    19.10n ± 3%        ~ (p=0.621 n=20)
MemmoveUnalignedSrc/1024            35.62n ±  0%    35.62n ± 0%        ~ (p=0.407 n=20)
MemmoveUnalignedSrc/2048            68.04n ±  0%    68.35n ± 0%   +0.46% (p=0.000 n=20)
MemmoveUnalignedSrc/4096            133.2n ±  1%    133.3n ± 0%        ~ (p=0.131 n=20)
MemmoveUnalignedSrcDst/f_16_0       4.202n ±  0%    4.202n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedSrcDst/b_16_0       4.202n ±  0%    4.202n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedSrcDst/f_16_1       4.202n ±  0%    4.202n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedSrcDst/b_16_1       4.202n ±  0%    4.202n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedSrcDst/f_16_4       4.202n ±  0%    4.202n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedSrcDst/b_16_4       4.202n ±  0%    4.202n ± 0%        ~ (p=0.661 n=20)
MemmoveUnalignedSrcDst/f_16_7       4.202n ±  0%    4.202n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedSrcDst/b_16_7       4.203n ±  0%    4.202n ± 0%   -0.02% (p=0.008 n=20)
MemmoveUnalignedSrcDst/f_64_0       6.103n ±  0%    6.100n ± 0%        ~ (p=0.595 n=20)
MemmoveUnalignedSrcDst/b_64_0       6.103n ±  0%    6.102n ± 0%        ~ (p=0.973 n=20)
MemmoveUnalignedSrcDst/f_64_1       7.419n ±  0%    7.226n ± 0%   -2.59% (p=0.000 n=20)
MemmoveUnalignedSrcDst/b_64_1       6.745n ±  0%    6.941n ± 0%   +2.89% (p=0.000 n=20)
MemmoveUnalignedSrcDst/f_64_4       7.420n ±  0%    7.223n ± 0%   -2.65% (p=0.000 n=20)
MemmoveUnalignedSrcDst/b_64_4       6.753n ±  0%    6.941n ± 0%   +2.79% (p=0.000 n=20)
MemmoveUnalignedSrcDst/f_64_7       7.423n ±  0%    7.204n ± 0%   -2.96% (p=0.000 n=20)
MemmoveUnalignedSrcDst/b_64_7       6.750n ±  0%    6.941n ± 0%   +2.83% (p=0.000 n=20)
MemmoveUnalignedSrcDst/f_256_0      12.96n ±  0%    12.99n ± 0%   +0.27% (p=0.000 n=20)
MemmoveUnalignedSrcDst/b_256_0      12.91n ±  0%    12.94n ± 0%   +0.23% (p=0.000 n=20)
MemmoveUnalignedSrcDst/f_256_1      17.21n ±  0%    17.21n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedSrcDst/b_256_1      17.61n ±  0%    17.61n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedSrcDst/f_256_4      16.21n ±  0%    16.21n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedSrcDst/b_256_4      16.41n ±  0%    16.41n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedSrcDst/f_256_7      14.12n ±  0%    14.10n ± 0%        ~ (p=0.307 n=20)
MemmoveUnalignedSrcDst/b_256_7      14.81n ±  0%    14.81n ± 0%        ~ (p=1.000 n=20) ¹
MemmoveUnalignedSrcDst/f_4096_0     109.3n ±  0%    109.4n ± 0%   +0.09% (p=0.004 n=20)
MemmoveUnalignedSrcDst/b_4096_0     109.6n ±  0%    109.6n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedSrcDst/f_4096_1     113.5n ±  0%    113.5n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedSrcDst/b_4096_1     113.7n ±  0%    113.7n ± 0%        ~ (p=1.000 n=20) ¹
MemmoveUnalignedSrcDst/f_4096_4     112.3n ±  0%    112.3n ± 0%        ~ (p=0.763 n=20)
MemmoveUnalignedSrcDst/b_4096_4     112.6n ±  0%    112.9n ± 1%   +0.31% (p=0.032 n=20)
MemmoveUnalignedSrcDst/f_4096_7     110.6n ±  0%    110.6n ± 0%        ~ (p=1.000 n=20) ¹
MemmoveUnalignedSrcDst/b_4096_7     111.1n ±  0%    111.1n ± 0%        ~ (p=1.000 n=20) ¹
MemmoveUnalignedSrcDst/f_65536_0    4.801µ ±  0%    4.818µ ± 0%   +0.34% (p=0.000 n=20)
MemmoveUnalignedSrcDst/b_65536_0    5.027µ ±  0%    5.036µ ± 0%   +0.19% (p=0.007 n=20)
MemmoveUnalignedSrcDst/f_65536_1    4.815µ ±  0%    4.729µ ± 0%   -1.78% (p=0.000 n=20)
MemmoveUnalignedSrcDst/b_65536_1    4.659µ ±  0%    4.737µ ± 1%   +1.69% (p=0.000 n=20)
MemmoveUnalignedSrcDst/f_65536_4    4.807µ ±  0%    4.721µ ± 0%   -1.78% (p=0.000 n=20)
MemmoveUnalignedSrcDst/b_65536_4    4.659µ ±  0%    4.601µ ± 0%   -1.23% (p=0.000 n=20)
MemmoveUnalignedSrcDst/f_65536_7    4.868µ ±  0%    4.759µ ± 0%   -2.23% (p=0.000 n=20)
MemmoveUnalignedSrcDst/b_65536_7    4.665µ ±  0%    4.709µ ± 0%   +0.93% (p=0.000 n=20)
MemmoveUnalignedSrcOverlap/32       6.804n ±  0%    6.810n ± 0%   +0.09% (p=0.000 n=20)
MemmoveUnalignedSrcOverlap/64       10.41n ±  0%    10.42n ± 0%   +0.10% (p=0.000 n=20)
MemmoveUnalignedSrcOverlap/128      11.59n ±  0%    11.58n ± 0%        ~ (p=0.414 n=20)
MemmoveUnalignedSrcOverlap/256      14.22n ±  0%    14.29n ± 0%   +0.46% (p=0.000 n=20)
MemmoveUnalignedSrcOverlap/512      23.11n ±  0%    23.04n ± 0%   -0.28% (p=0.001 n=20)
MemmoveUnalignedSrcOverlap/1024     41.44n ±  0%    41.47n ± 0%        ~ (p=0.693 n=20)
MemmoveUnalignedSrcOverlap/2048     81.25n ±  0%    81.25n ± 0%        ~ (p=0.405 n=20)
MemmoveUnalignedSrcOverlap/4096     166.1n ±  0%    166.1n ± 0%        ~ (p=0.451 n=20)
geomean                             13.02n          12.69n        -2.51%
¹ all samples are equal

Change-Id: I712adc7670f6ae360714ec5a770d00d76c8700ed
Reviewed-on: https://go-review.googlesource.com/c/go/+/618815
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
2024-11-05 00:44:11 +00:00
Xiaolin Zhao
47a48ebf34 hash/crc32: optimize the loong64 crc32 implementation
Make use of the newly added LA64 CRC32 instructions to accelerate
computation of CRC32 with IEEE and Castagnoli polynomials.

Benchmarks:
goos: linux
goarch: loong64
pkg: hash/crc32
cpu: Loongson-3A6000 @ 2500.00MHz
                                        |  bench.old   |              bench.new              |
                                        |    sec/op    |   sec/op     vs base                |
CRC32/poly=IEEE/size=15/align=0            63.35n ± 0%   15.80n ± 0%  -75.06% (p=0.000 n=20)
CRC32/poly=IEEE/size=15/align=1            63.35n ± 0%   16.42n ± 0%  -74.08% (p=0.000 n=20)
CRC32/poly=IEEE/size=40/align=0            65.40n ± 0%   19.22n ± 0%  -70.61% (p=0.000 n=20)
CRC32/poly=IEEE/size=40/align=1            65.40n ± 0%   19.23n ± 0%  -70.60% (p=0.000 n=20)
CRC32/poly=IEEE/size=512/align=0          407.30n ± 0%   66.86n ± 0%  -83.58% (p=0.000 n=20)
CRC32/poly=IEEE/size=512/align=1          407.30n ± 0%   66.86n ± 0%  -83.58% (p=0.000 n=20)
CRC32/poly=IEEE/size=1kB/align=0           778.2n ± 0%   118.1n ± 0%  -84.82% (p=0.000 n=20)
CRC32/poly=IEEE/size=1kB/align=1           778.2n ± 0%   118.1n ± 0%  -84.82% (p=0.000 n=20)
CRC32/poly=IEEE/size=4kB/align=0          3004.0n ± 0%   425.6n ± 0%  -85.83% (p=0.000 n=20)
CRC32/poly=IEEE/size=4kB/align=1          3004.0n ± 0%   425.6n ± 0%  -85.83% (p=0.000 n=20)
CRC32/poly=IEEE/size=32kB/align=0         23.775µ ± 0%   3.305µ ± 0%  -86.10% (p=0.000 n=20)
CRC32/poly=IEEE/size=32kB/align=1         23.774µ ± 0%   3.305µ ± 0%  -86.10% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=15/align=0      63.58n ± 0%   15.28n ± 0%  -75.97% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=15/align=1      63.58n ± 0%   16.95n ± 0%  -73.34% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=40/align=0      65.29n ± 0%   17.04n ± 0%  -73.90% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=40/align=1      65.29n ± 0%   19.05n ± 0%  -70.83% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=512/align=0    407.20n ± 0%   55.06n ± 0%  -86.48% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=512/align=1    407.20n ± 0%   56.44n ± 0%  -86.14% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=1kB/align=0    778.10n ± 0%   95.08n ± 0%  -87.78% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=1kB/align=1    778.10n ± 0%   97.72n ± 0%  -87.44% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=4kB/align=0    3004.0n ± 0%   338.5n ± 0%  -88.73% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=4kB/align=1    3004.0n ± 0%   341.1n ± 0%  -88.64% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=32kB/align=0   23.775µ ± 0%   2.623µ ± 0%  -88.97% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=32kB/align=1   23.775µ ± 0%   2.896µ ± 0%  -87.82% (p=0.000 n=20)
CRC32/poly=Koopman/size=15/align=0         63.11n ± 0%   63.11n ± 0%        ~ (p=0.737 n=20)
CRC32/poly=Koopman/size=15/align=1         63.11n ± 0%   63.11n ± 0%        ~ (p=1.000 n=20)
CRC32/poly=Koopman/size=40/align=0         153.2n ± 0%   153.2n ± 0%        ~ (p=1.000 n=20)
CRC32/poly=Koopman/size=40/align=1         153.2n ± 0%   153.2n ± 0%        ~ (p=0.737 n=20)
CRC32/poly=Koopman/size=512/align=0        1.854µ ± 0%   1.854µ ± 0%        ~ (p=1.000 n=20)
CRC32/poly=Koopman/size=512/align=1        1.854µ ± 0%   1.854µ ± 0%        ~ (p=0.737 n=20)
CRC32/poly=Koopman/size=1kB/align=0        3.699µ ± 0%   3.699µ ± 0%        ~ (p=1.000 n=20)
CRC32/poly=Koopman/size=1kB/align=1        3.699µ ± 0%   3.699µ ± 0%        ~ (p=1.000 n=20)
CRC32/poly=Koopman/size=4kB/align=0        14.77µ ± 0%   14.77µ ± 0%        ~ (p=0.495 n=20)
CRC32/poly=Koopman/size=4kB/align=1        14.77µ ± 0%   14.77µ ± 0%        ~ (p=0.704 n=20)
CRC32/poly=Koopman/size=32kB/align=0       118.1µ ± 0%   118.1µ ± 0%        ~ (p=0.057 n=20)
CRC32/poly=Koopman/size=32kB/align=1       118.1µ ± 0%   118.1µ ± 0%        ~ (p=0.493 n=20)
geomean                                    1.001µ        306.8n       -69.35%

goos: linux
goarch: loong64
pkg: hash/crc32
cpu: Loongson-3A5000 @ 2500.00MHz
                                        |  bench.old  |              bench.new              |
                                        |   sec/op    |   sec/op     vs base                |
CRC32/poly=IEEE/size=15/align=0           75.70n ± 1%   47.04n ± 1%  -37.86% (p=0.000 n=20)
CRC32/poly=IEEE/size=15/align=1           75.70n ± 1%   46.64n ± 1%  -38.39% (p=0.000 n=20)
CRC32/poly=IEEE/size=40/align=0           89.26n ± 0%   65.49n ± 0%  -26.63% (p=0.000 n=20)
CRC32/poly=IEEE/size=40/align=1           89.09n ± 0%   72.55n ± 1%  -18.56% (p=0.000 n=20)
CRC32/poly=IEEE/size=512/align=0          621.0n ± 0%   513.5n ± 0%  -17.31% (p=0.000 n=20)
CRC32/poly=IEEE/size=512/align=1          621.0n ± 0%   521.9n ± 0%  -15.96% (p=0.000 n=20)
CRC32/poly=IEEE/size=1kB/align=0          1.204µ ± 0%   1.001µ ± 0%  -16.86% (p=0.000 n=20)
CRC32/poly=IEEE/size=1kB/align=1          1.205µ ± 0%   1.009µ ± 0%  -16.27% (p=0.000 n=20)
CRC32/poly=IEEE/size=4kB/align=0          4.665µ ± 0%   3.923µ ± 0%  -15.91% (p=0.000 n=20)
CRC32/poly=IEEE/size=4kB/align=1          4.665µ ± 0%   3.931µ ± 0%  -15.73% (p=0.000 n=20)
CRC32/poly=IEEE/size=32kB/align=0         36.97µ ± 0%   31.20µ ± 0%  -15.60% (p=0.000 n=20)
CRC32/poly=IEEE/size=32kB/align=1         36.96µ ± 0%   31.21µ ± 0%  -15.57% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=15/align=0     75.72n ± 1%   48.07n ± 1%  -36.52% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=15/align=1     75.70n ± 1%   46.99n ± 2%  -37.93% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=40/align=0     87.91n ± 0%   64.89n ± 0%  -26.19% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=40/align=1     87.91n ± 0%   72.12n ± 1%  -17.97% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=512/align=0    619.8n ± 0%   514.3n ± 0%  -17.02% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=512/align=1    619.8n ± 0%   521.7n ± 0%  -15.83% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=1kB/align=0    1.202µ ± 0%   1.001µ ± 0%  -16.72% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=1kB/align=1    1.202µ ± 0%   1.009µ ± 0%  -16.06% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=4kB/align=0    4.663µ ± 0%   3.924µ ± 0%  -15.85% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=4kB/align=1    4.663µ ± 0%   3.931µ ± 0%  -15.70% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=32kB/align=0   36.96µ ± 0%   31.20µ ± 0%  -15.60% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=32kB/align=1   36.96µ ± 0%   31.21µ ± 0%  -15.57% (p=0.000 n=20)
CRC32/poly=Koopman/size=15/align=0        74.91n ± 1%   74.95n ± 1%        ~ (p=0.963 n=20)
CRC32/poly=Koopman/size=15/align=1        74.91n ± 1%   75.02n ± 1%        ~ (p=0.909 n=20)
CRC32/poly=Koopman/size=40/align=0        165.0n ± 0%   165.0n ± 0%        ~ (p=0.865 n=20)
CRC32/poly=Koopman/size=40/align=1        165.1n ± 0%   165.0n ± 0%        ~ (p=0.342 n=20)
CRC32/poly=Koopman/size=512/align=0       1.867µ ± 0%   1.867µ ± 0%        ~ (p=0.320 n=20)
CRC32/poly=Koopman/size=512/align=1       1.867µ ± 0%   1.867µ ± 0%        ~ (p=0.782 n=20)
CRC32/poly=Koopman/size=1kB/align=0       3.712µ ± 0%   3.712µ ± 0%        ~ (p=0.859 n=20)
CRC32/poly=Koopman/size=1kB/align=1       3.712µ ± 0%   3.713µ ± 0%        ~ (p=0.175 n=20)
CRC32/poly=Koopman/size=4kB/align=0       14.79µ ± 0%   14.79µ ± 0%        ~ (p=0.826 n=20)
CRC32/poly=Koopman/size=4kB/align=1       14.79µ ± 0%   14.79µ ± 0%        ~ (p=0.169 n=20)
CRC32/poly=Koopman/size=32kB/align=0      118.1µ ± 0%   118.1µ ± 0%        ~ (p=0.941 n=20)
CRC32/poly=Koopman/size=32kB/align=1      118.1µ ± 0%   118.1µ ± 0%        ~ (p=0.473 n=20)
geomean                                   1.299µ        1.109µ       -14.68%

Performance of poly=Koopman is not affected.

This patch is a copy of CL 478596.
Co-authored-by: WANG Xuerui <git@xen0n.name>

Change-Id: I345192cdf693f21fe1015a8b8361ca68ac780c9e
Reviewed-on: https://go-review.googlesource.com/c/go/+/624355
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Carlos Amedee <carlos@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
2024-11-05 00:43:58 +00:00