mirror of
https://github.com/golang/go
synced 2024-11-13 14:00:27 -07:00
6382893890
58561 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
Joel Sing
|
4e896d179d |
runtime: remove map stack version handling for openbsd
OpenBSD 6.3 is more than five years old and has not been supported for the last four years (only 7.3 and 7.4 are currently supported). As such, remove special handling of MAP_STACK for 6.3 and earlier. Change-Id: I1086c910bbcade7fb3938bb1226813212794b587 Reviewed-on: https://go-review.googlesource.com/c/go/+/538458 Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Bryan Mills <bcmills@google.com> Reviewed-by: Aaron Bieber <aaron@bolddaemon.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Joel Sing <joel@sing.id.au> |
||
Robert Griesemer
|
6a7ef36466 |
test: run range-over-integer tests without need for -goexperiment
Move the range-over-function tests into range4.go. Change-Id: Idccf30a0c7d7e8d2a17fb1c5561cf21e00506135 Reviewed-on: https://go-review.googlesource.com/c/go/+/539095 Reviewed-by: Cherry Mui <cherryyz@google.com> Auto-Submit: Robert Griesemer <gri@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@google.com> Run-TryBot: Robert Griesemer <gri@google.com> |
||
Robert Griesemer
|
11677d983e |
go/types, types2: enable range over int w/o need for goexperiment
For #61405. Change-Id: I047ec31bc36b1707799ffef25506070613477d1f Reviewed-on: https://go-review.googlesource.com/c/go/+/538718 Reviewed-by: Cherry Mui <cherryyz@google.com> Auto-Submit: Robert Griesemer <gri@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Robert Griesemer <gri@google.com> Reviewed-by: Robert Griesemer <gri@google.com> |
||
Robert Griesemer
|
e5ef484691 |
spec: document range over integer expression
This CL is partly based on CL 510535. For #61405. Change-Id: Ic94f6726f9eb34313f11bec7b651921d7e5c18d4 Reviewed-on: https://go-review.googlesource.com/c/go/+/538859 Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Bypass: Robert Griesemer <gri@google.com> Auto-Submit: Robert Griesemer <gri@google.com> Reviewed-by: Robert Griesemer <gri@google.com> |
||
Joe Tsai
|
08b2f1f761 |
os: fix PathError.Op for dirFS.Open
This appears to be a copy-paste error from CL 455362. The operation name used to be "open" but seems to have been accidentally changed to "stat". This CL reverts back to "open". Change-Id: I3fc5168095e2d9eee3efa3cc091b10bcf4e3ecde Reviewed-on: https://go-review.googlesource.com/c/go/+/539056 Run-TryBot: Joseph Tsai <joetsai@digital-static.net> Reviewed-by: Cherry Mui <cherryyz@google.com> Auto-Submit: Joseph Tsai <joetsai@digital-static.net> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Damien Neil <dneil@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
||
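The commit message above does not show a call site. The following is a minimal sketch of how a caller can inspect the `Op` field of the error returned by `os.DirFS(...).Open`, assuming that an invalid name such as `../outside` is one way to reach the error path in question. The helper name `pathErrorOp` is hypothetical, and the value printed depends on the toolchain in use.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// pathErrorOp opens name in an os.DirFS rooted at dir. When the call fails
// with an *fs.PathError, it returns that error's Op field and true.
func pathErrorOp(dir, name string) (op string, ok bool) {
	f, err := os.DirFS(dir).Open(name)
	if err == nil {
		f.Close()
		return "", false
	}
	var pe *fs.PathError
	if errors.As(err, &pe) {
		return pe.Op, true
	}
	return "", false
}

func main() {
	// "../outside" is not a valid io/fs path, so Open is expected to fail.
	op, ok := pathErrorOp(".", "../outside")
	fmt.Println(op, ok)
}
```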
qiulaidongfeng
|
23711f8ef7 |
internal/bytealg: optimize indexbyte in amd64
goos: windows
goarch: amd64
pkg: bytes
cpu: AMD Ryzen 7 7840HS w/ Radeon 780M Graphics
│ old.txt │ new.txt │
│ sec/op │ sec/op vs base │
IndexByte/10-16 2.613n ± 1% 2.558n ± 1% -2.09% (p=0.014 n=10)
IndexByte/32-16 3.034n ± 1% 3.010n ± 2% ~ (p=0.305 n=10)
IndexByte/4K-16 57.20n ± 2% 39.58n ± 2% -30.81% (p=0.000 n=10)
IndexByte/4M-16 34.48µ ± 1% 33.83µ ± 2% -1.87% (p=0.023 n=10)
IndexByte/64M-16 1.493m ± 2% 1.450m ± 2% -2.89% (p=0.000 n=10)
IndexBytePortable/10-16 3.172n ± 4% 3.163n ± 2% ~ (p=0.684 n=10)
IndexBytePortable/32-16 8.465n ± 2% 8.375n ± 3% ~ (p=0.631 n=10)
IndexBytePortable/4K-16 852.0n ± 1% 846.6n ± 3% ~ (p=0.971 n=10)
IndexBytePortable/4M-16 868.2µ ± 2% 856.6µ ± 2% ~ (p=0.393 n=10)
IndexBytePortable/64M-16 13.81m ± 2% 13.88m ± 3% ~ (p=0.684 n=10)
geomean 1.204µ 1.148µ -4.63%
│ old.txt │ new.txt │
│ B/s │ B/s vs base │
IndexByte/10-16 3.565Gi ± 1% 3.641Gi ± 1% +2.15% (p=0.015 n=10)
IndexByte/32-16 9.821Gi ± 1% 9.899Gi ± 2% ~ (p=0.315 n=10)
IndexByte/4K-16 66.70Gi ± 2% 96.39Gi ± 2% +44.52% (p=0.000 n=10)
IndexByte/4M-16 113.3Gi ± 1% 115.5Gi ± 2% +1.91% (p=0.023 n=10)
IndexByte/64M-16 41.85Gi ± 2% 43.10Gi ± 2% +2.98% (p=0.000 n=10)
IndexBytePortable/10-16 2.936Gi ± 4% 2.945Gi ± 2% ~ (p=0.684 n=10)
IndexBytePortable/32-16 3.521Gi ± 2% 3.559Gi ± 3% ~ (p=0.631 n=10)
IndexBytePortable/4K-16 4.477Gi ± 1% 4.506Gi ± 3% ~ (p=0.971 n=10)
IndexBytePortable/4M-16 4.499Gi ± 2% 4.560Gi ± 2% ~ (p=0.393 n=10)
IndexBytePortable/64M-16 4.525Gi ± 2% 4.504Gi ± 3% ~ (p=0.684 n=10)
geomean 10.04Gi 10.53Gi +4.86%
For #63678
Change-Id: I0571c2b540a816d57bd6ed8bb1df4191c7992d92
GitHub-Last-Rev:
|
||
dchaofei
|
ea3010d994 |
crypto/x509: optimize the performance of checkSignature
The loop should terminate as soon as `algo` has been found.
Fixes #52955
Change-Id: Ib3865c4616a0c1af9b72daea45f5a1750f84562f
GitHub-Last-Rev:
|
||
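The early-exit idea in the commit above can be sketched generically. This is not the crypto/x509 code: the `algoDetail` type, the table contents, and `findAlgo` are hypothetical stand-ins used only to show a lookup loop that stops at the first match.

```go
package main

import "fmt"

// algoDetail is a hypothetical stand-in for one row of a signature
// algorithm table; it is not the crypto/x509 type.
type algoDetail struct {
	id   int
	name string
}

// findAlgo scans table for id and stops at the first match. It also reports
// how many entries it inspected, to make the early exit visible.
func findAlgo(table []algoDetail, id int) (name string, inspected int) {
	for _, d := range table {
		inspected++
		if d.id == id {
			name = d.name
			break // nothing after the match can change the result
		}
	}
	return name, inspected
}

func main() {
	table := []algoDetail{{1, "alpha"}, {2, "beta"}, {3, "gamma"}}
	name, inspected := findAlgo(table, 2)
	fmt.Println(name, inspected) // beta 2
}
```

Without the `break`, the loop would still return the right name but would visit every remaining entry first.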
Jes Cok
|
a05a25cb19 |
bytes,internal/bytealg: add func bytealg.LastIndexRabinKarp
Also rename 'substr' to 'sep' in IndexRabinKarp for consistency.
Change-Id: Icc2ad1116aecaf002c8264daa2fa608306c9a88a
GitHub-Last-Rev:
|
||
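For readers unfamiliar with the technique named in the commit above, here is a minimal sketch of a Rabin-Karp search for the last occurrence of `sep` in `s`. It is an illustration written for this log, not the `internal/bytealg` source: the function names, the multiplier, and the string-only signature are assumptions of the sketch.

```go
package main

import "fmt"

// primeRK is the multiplier of the rolling hash used in this sketch.
const primeRK = 16777619

// hashRev hashes sep from its last byte to its first. It also returns
// primeRK raised to len(sep), the factor needed to drop a byte from the window.
func hashRev(sep string) (hash, pow uint32) {
	for i := len(sep) - 1; i >= 0; i-- {
		hash = hash*primeRK + uint32(sep[i])
	}
	pow = 1
	for i := 0; i < len(sep); i++ {
		pow *= primeRK
	}
	return hash, pow
}

// lastIndexRabinKarp returns the index of the last occurrence of sep in s,
// or -1 if there is none. The window slides from the end of s to the start.
func lastIndexRabinKarp(s, sep string) int {
	n := len(sep)
	last := len(s) - n
	if last < 0 {
		return -1
	}
	want, pow := hashRev(sep)
	var h uint32
	for i := len(s) - 1; i >= last; i-- {
		h = h*primeRK + uint32(s[i])
	}
	if h == want && s[last:] == sep {
		return last
	}
	for i := last - 1; i >= 0; i-- {
		// Bring s[i] into the window and drop s[i+n] from it.
		h = h*primeRK + uint32(s[i]) - pow*uint32(s[i+n])
		if h == want && s[i:i+n] == sep {
			return i
		}
	}
	return -1
}

func main() {
	fmt.Println(lastIndexRabinKarp("go gopher gopher", "gopher")) // 10
	fmt.Println(lastIndexRabinKarp("abc", "x"))                   // -1
}
```

The substring comparison after each hash match guards against hash collisions, so the hash only serves to skip windows that cannot match.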
Bryan C. Mills
|
0330aad038 |
os: report IO_REPARSE_TAG_DEDUP files as regular in Stat and Lstat
Prior to CL 460595, Lstat reported most reparse points as regular files. However, reparse points can in general implement unusual behaviors (consider IO_REPARSE_TAG_AF_UNIX or IO_REPARSE_TAG_LX_CHR), and Windows allows arbitrary user-defined reparse points, so in general we must not assume that an unrecognized reparse tag represents a regular file; in CL 460595, we began marking them as irregular. As it turns out, the Data Deduplication service on Windows Server runs an Optimization job that turns regular files into reparse files with the tag IO_REPARSE_TAG_DEDUP. Those files still behave more-or-less like regular files, in that they have well-defined sizes and support random-access reads and writes, so most programs can treat them as regular files without difficulty. However, they are still reparse files: as a result, on servers with the Data Deduplication service enabled, files could arbitrarily change from “regular” to “irregular” without explicit user intervention. Since dedup files are converted in the background and otherwise behave like regular files, this change adds a special case to report DEDUP reparse points as regular. Fixes #63429. No test because to my knowledge we don't have any Windows builders that have the deduplication service enabled, nor do we have a way to reliably guarantee the existence of an IO_REPARSE_TAG_DEDUP file. (In theory we could add a builder with the service enabled on a specific volume, write a test that encodes knowledge of that volume, and use the GO_BUILDER_NAME environment variable to run that test only on the specially-configured builders. However, I don't currently have the bandwidth to reconfigure the builders in this way, and given the simplicity of the change I think it is unlikely to regress accidentally.) 
Change-Id: I649e7ef0b67e3939a980339ce7ec6a20b31b23a1 Cq-Include-Trybots: luci.golang.try:gotip-windows-amd64-longtest Reviewed-on: https://go-review.googlesource.com/c/go/+/537915 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Alex Brainman <alex.brainman@gmail.com> Reviewed-by: Quim Muntal <quimmuntal@gmail.com> Auto-Submit: Bryan Mills <bcmills@google.com> |
||
Robert Griesemer
|
b7a695bd68 |
cmd/compile/internal/syntax: better error messages for incorrect type parameter list
When parsing a declaration of the form type a [b[c]]d where a, b, c, d stand for identifiers, b[c] is parsed as a type constraint (because an array length must be constant and an index expression b[c] is never constant, even if b is a constant string and c a constant index - this is crucial for disambiguation of the various possibilities). As a result, the error message referred to a missing type parameter name and not an invalid array declaration. Recognize this special case and report both possibilities (because we can't be sure without type information) with the new error: "missing type parameter name or invalid array length". Also, change the previous error message "type parameter must be named" to "missing type parameter name", which is more fitting as the error refers to an absent type parameter (rather than a type parameter that's somehow invisibly present but unnamed). Fixes #60812. Change-Id: Iaad3b3a9aeff9dfe2184779f3d799f16c7500b34 Reviewed-on: https://go-review.googlesource.com/c/go/+/538856 TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Robert Griesemer <gri@google.com> Reviewed-by: Robert Griesemer <gri@google.com> Auto-Submit: Robert Griesemer <gri@google.com> Reviewed-by: Robert Findley <rfindley@google.com> |
||
Robert Griesemer
|
34a5830c26 |
cmd/compile/internal/syntax: fix/update various comments
Change-Id: I30b448c8fcdbad94afcd7ff0dfc5cfebb485bdd7 Reviewed-on: https://go-review.googlesource.com/c/go/+/538855 Auto-Submit: Robert Griesemer <gri@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Robert Griesemer <gri@google.com> Reviewed-by: Robert Griesemer <gri@google.com> Reviewed-by: Robert Findley <rfindley@google.com> |
||
Joel Sing
|
0aa2197279 |
os/signal: use syscall.Wait4 directly in tests
Rather than using syscall.Syscall6 with SYS_WAIT4, use syscall.Wait4 directly. Updates #59667 Change-Id: I50fea3b7d10003dbc632aafd5e170a9fe96d6f42 Reviewed-on: https://go-review.googlesource.com/c/go/+/538459 Run-TryBot: Joel Sing <joel@sing.id.au> Reviewed-by: Bryan Mills <bcmills@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> |
||
Joel Sing
|
1a58fd0fda |
syscall: regen zsyscall for openbsd/riscv64
This removes the unused writelen function, which was cleaned up for other platforms in CL#529035. Change-Id: I1999dc81276763bdc73d8590c16729447c4e8538 Reviewed-on: https://go-review.googlesource.com/c/go/+/538119 Reviewed-by: Bryan Mills <bcmills@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com> Run-TryBot: Joel Sing <joel@sing.id.au> |
||
Joel Sing
|
6ecadb4d87 |
syscall: regenerate zsyscall for dragonfly/freebsd/netbsd
The sysctl declaration was moved in CL 141639, however the files were presumably not regenerated. There is no functional change, however regenerating avoids unrelated noise in future diffs. Change-Id: Ifb840b5853f3f1c3c88a3f94df21b6f6d3c635d4 Reviewed-on: https://go-review.googlesource.com/c/go/+/538118 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Bryan Mills <bcmills@google.com> Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com> Run-TryBot: Joel Sing <joel@sing.id.au> Reviewed-by: Cherry Mui <cherryyz@google.com> |
||
apocelipes
|
e73e25b624 |
internal/cpu: add comments to copied functions
This matches the comments on other copied functions,
such as stringsTrimSuffix in "os/executable_procfs.go".
Change-Id: I9c9fbd75b009a5ae0e869cf1fddc77c0e08d9a67
GitHub-Last-Rev:
|
||
Cherry Mui
|
d2f3a68bf0 |
runtime: use testenv.Command in TestG0StackOverflow
For debugging timeouts. Change-Id: I08dc86ec0264196e5fd54066655e94a9d062ed80 Reviewed-on: https://go-review.googlesource.com/c/go/+/538697 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Bryan Mills <bcmills@google.com> |
||
Keith Randall
|
b11defeaed |
runtime: make select fairness test less picky
Allow up to 10 standard deviations from the mean, instead of ~5 that the current test allows. 10 standard deviations allows up to a 4500/5500 split. Fixes #52465 Change-Id: Icb21c1d31fafbcf4723b75435ba5e98863e812c4 Reviewed-on: https://go-review.googlesource.com/c/go/+/538815 Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Bryan Mills <bcmills@google.com> Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
||
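The 4500/5500 figure in the commit above follows from the standard deviation of a fair two-way split. A minimal sketch of that arithmetic, assuming 10000 trials (a count inferred here from the quoted split, not stated in the message) and using the hypothetical helper `fairnessBounds`:

```go
package main

import (
	"fmt"
	"math"
)

// fairnessBounds returns the lowest and highest count still accepted for one
// of two equally likely outcomes after `trials` draws, allowing k standard
// deviations around the mean of a binomial(trials, 1/2) distribution.
func fairnessBounds(trials int, k float64) (lo, hi float64) {
	mean := float64(trials) / 2
	sd := math.Sqrt(float64(trials) * 0.5 * 0.5)
	return mean - k*sd, mean + k*sd
}

func main() {
	// Assumed trial count: 10000, so the standard deviation is 50.
	lo, hi := fairnessBounds(10000, 10)
	fmt.Println(lo, hi) // 4500 5500
}
```

With the same assumption, the earlier tolerance of about 5 standard deviations corresponds to roughly a 4750/5250 split.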
Keith Randall
|
962ccbef91 |
cmd/compile: ensure pointer arithmetic happens after the nil check
Have nil checks return a pointer that is known non-nil. Users of that pointer can use the result, ensuring that they are ordered after the nil check itself. The order dependence goes away after scheduling, when we've fixed an order. At that point we move uses back to the original pointer so it doesn't change regalloc any. This prevents pointer arithmetic on nil from being spilled to the stack and then observed by a stack scan. Fixes #63657 Change-Id: I1a5fa4f2e6d9000d672792b4f90dfc1b7b67f6ea Reviewed-on: https://go-review.googlesource.com/c/go/+/537775 Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> |
||
Keith Randall
|
43b57b8516 |
cmd/compile: handle constant pointer offsets in dead store elimination
Update #63657 Update #45573 Change-Id: I163c6038c13d974dc0ca9f02144472bc05331826 Reviewed-on: https://go-review.googlesource.com/c/go/+/538595 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Keith Randall <khr@google.com> |
||
Keith Randall
|
66b8107a26 |
runtime: on arm32, detect whether we have sync instructions
Make the choice of using these instructions dynamic (triggered by cpu feature detection) rather than static (triggered by GOARM setting). If GOARM>=7, we know we have them. For GOARM=5/6, dynamically dispatch based on auxv information. Update #17082 Update #61588 Change-Id: I8a50481d942f62cf36348998a99225d0d242f8af Reviewed-on: https://go-review.googlesource.com/c/go/+/525637 TryBot-Result: Gopher Robot <gobot@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Run-TryBot: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> |
||
Mateusz Poliwczak
|
dd84bb6824 |
crypto/x509: add new OID type and use it in Certificate
Fixes #60665
Change-Id: I814b7d4b26b964f74443584fb2048b3e27e3b675
GitHub-Last-Rev:
|
||
Jes Cok
|
68e52bc03c |
bytes,internal/bytealg: eliminate IndexRabinKarpBytes using generics
This is a follow-up to CL 538175.
Change-Id: Iec2523b36a16d7e157c17858c89fcd43c2470d58
GitHub-Last-Rev:
|
||
Jes Cok
|
cbc403af1d |
cmd/compile/internal/ssa: adjust default to the end in *Block.AuxIntString
Change-Id: Id48cade7811e2dfbf78d3171fe202ad272534e37
GitHub-Last-Rev:
|
||
Cuong Manh Le
|
3dea7c3f69 |
hash/maphash: weaken avalanche test a bit more
CL 495415 weakened the avalanche test, making the allowed range 43% to 57%. Since then, we have only seen a failure with 58% on the linux-386-longtest builder, so let's give the test a bit more wiggle room: 40% to 59%. Fixes #60170 Change-Id: I9528ebc8601975b733c3d9fd464ce41429654273 Reviewed-on: https://go-review.googlesource.com/c/go/+/538655 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Bryan Mills <bcmills@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com> |
||
cui fliter
|
289b823ac9 |
internal/bytealg: optimize Count/CountString in arm64
For #63678
goos: darwin
goarch: arm64
pkg: strings
│ count_old.txt │ count_new.txt │
│ sec/op │ sec/op vs base │
CountHard1-8 368.7µ ± 11% 332.0µ ± 1% -9.95% (p=0.002 n=10)
CountHard2-8 348.8µ ± 5% 333.1µ ± 1% -4.51% (p=0.000 n=10)
CountHard3-8 402.7µ ± 25% 359.5µ ± 1% -10.75% (p=0.000 n=10)
CountTorture-8 10.536µ ± 23% 9.913µ ± 0% -5.91% (p=0.000 n=10)
CountTortureOverlapping-8 74.86µ ± 9% 67.56µ ± 1% -9.75% (p=0.000 n=10)
CountByte/10-8 6.905n ± 3% 6.690n ± 1% -3.11% (p=0.001 n=10)
CountByte/32-8 3.247n ± 13% 3.207n ± 2% -1.23% (p=0.030 n=10)
CountByte/4096-8 83.72n ± 1% 82.58n ± 1% -1.36% (p=0.007 n=10)
CountByte/4194304-8 85.17µ ± 5% 84.02µ ± 8% ~ (p=0.075 n=10)
CountByte/67108864-8 1.497m ± 8% 1.397m ± 2% -6.69% (p=0.000 n=10)
geomean 9.977µ 9.426µ -5.53%
│ count_old.txt │ count_new.txt │
│ B/s │ B/s vs base │
CountByte/10-8 1.349Gi ± 3% 1.392Gi ± 1% +3.20% (p=0.002 n=10)
CountByte/32-8 9.180Gi ± 11% 9.294Gi ± 2% +1.24% (p=0.029 n=10)
CountByte/4096-8 45.57Gi ± 1% 46.20Gi ± 1% +1.38% (p=0.007 n=10)
CountByte/4194304-8 45.86Gi ± 5% 46.49Gi ± 7% ~ (p=0.075 n=10)
CountByte/67108864-8 41.75Gi ± 8% 44.74Gi ± 2% +7.16% (p=0.000 n=10)
geomean 16.10Gi 16.55Gi +2.85%
Change-Id: Ifc2173ba3a926b0fa9598372d4404b8645929d45 Reviewed-on: https://go-review.googlesource.com/c/go/+/538116 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Bryan Mills <bcmills@google.com> Run-TryBot: shuang cui <imcusg@gmail.com> Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
||
Joel Sing
|
e293c4b509 |
runtime: allocate crash stack via stackalloc
On some platforms (notably OpenBSD), stacks must be specifically allocated and marked as being stack memory. Allocate the crash stack using stackalloc, which ensures these requirements are met, rather than using a global Go variable. Fixes #63794 Change-Id: I6513575797dd69ff0a36f3bfd4e5fc3bd95cbf50 Reviewed-on: https://go-review.googlesource.com/c/go/+/538457 Run-TryBot: Joel Sing <joel@sing.id.au> Reviewed-by: Bryan Mills <bcmills@google.com> Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com> Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> |
||
Robert Griesemer
|
b7a66be69c |
cmd/compile/internal/syntax: set up dummy name and type if func name is missing
We do the same elsewhere (e.g. in parser.name when a name is missing). This ensures functions have a (dummy) name and a non-nil type. Avoids a crash in the type-checker (verified manually). A test was added here (rather than the type checker) because type- checker tests are shared between types2 and go/types and error recovery in this case is different. Fixes #63835. Change-Id: I1460fc88d23d80b8d8c181c774d6b0a56ca06317 Reviewed-on: https://go-review.googlesource.com/c/go/+/538059 Reviewed-by: Matthew Dempsky <mdempsky@google.com> TryBot-Bypass: Robert Griesemer <gri@google.com> Reviewed-by: Robert Griesemer <gri@google.com> Run-TryBot: Robert Griesemer <gri@google.com> Auto-Submit: Robert Griesemer <gri@google.com> |
||
Robert Griesemer
|
25a59decd5 |
go/types, types2: more concise error if conversion fails due to integer overflow
This change brings the error message for this case back in line with the pre-Go1.18 error message. Fixes #63563. Change-Id: I3c6587d420907b34ee8a5f295ecb231e9f008380 Reviewed-on: https://go-review.googlesource.com/c/go/+/538058 Auto-Submit: Robert Griesemer <gri@google.com> Reviewed-by: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Robert Griesemer <gri@google.com> Run-TryBot: Robert Griesemer <gri@google.com> TryBot-Bypass: Robert Griesemer <gri@google.com> Reviewed-by: Emmanuel Odeke <emmanuel@orijtech.com> |
||
Joel Sing
|
b6a3c0273e |
cmd/dist,internal/platform: enable openbsd/ppc64 port
Updates #56001 Change-Id: I16440114ecf661e9fc17d304ab3b16bc97ef82f8 Reviewed-on: https://go-review.googlesource.com/c/go/+/517935 Run-TryBot: Joel Sing <joel@sing.id.au> Reviewed-by: Paul Murphy <murp@ibm.com> Reviewed-by: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Carlos Amedee <carlos@golang.org> |
||
Jes Cok
|
f215a0be4d |
cmd/compile/internal/ssa: add missing space in comment
Change-Id: I54c3e8e0d61ceb6533284098dc32944f9f14459e
GitHub-Last-Rev:
|
||
qiulaidongfeng
|
9c2ab20d48 |
internal/fmtsort: makeChans pin pointer
Complete TODO.
For #49431
Change-Id: I1399205e430ebd83182c3e0c4becf1fde32d433e
GitHub-Last-Rev:
|
||
Quan Tong
|
214ce28503 |
cmd/go/internal/help: update the documentation to match the design and implementation
The existing documentation implies that build constraints are ignored after a block comment, but in fact they are not. Fixes #63502 Change-Id: I0597934b7a7eeab8908bf06e1312169b3702bf05 Reviewed-on: https://go-review.googlesource.com/c/go/+/535635 Reviewed-by: Michael Matloob <matloob@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Bryan Mills <bcmills@google.com> Reviewed-by: Mark Pictor <mark.pictor@contrastsecurity.com> Auto-Submit: Bryan Mills <bcmills@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> |
||
Allen Li
|
1e95fc7ffe |
log/slog: Reorder doc comment for level constants
pkgsite and go doc print the doc comment *after* the code, resulting in:
const (
LevelDebug Level = -4
...
)
Many paragraphs...
Names for common levels.
The "Names for common levels." feels out of place and confusing at the bottom.
This is also consistent with the recommendation for the first sentence in doc comments to be the "summary".
Change-Id: I656e85e27d2a4b23eaba5f2c1f4f811a88848c83
GitHub-Last-Rev:
|
||
Russ Cox
|
8abde68f19 |
math/rand/v2: delete Mitchell/Reeds source
These slowdowns are because we are now using PCG instead of the Mitchell/Reeds LFSR for the benchmarks. PCG is in fact a bit slower (but generates statistically far better random numbers).
goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
│ 01ff938549.amd64 │ afa459a2f0.amd64 │
│ sec/op │ sec/op vs base │
PCG_DXSM-32 1.490n ± 0% 1.488n ± 2% ~ (p=0.408 n=20)
SourceUint64-32 1.352n ± 1% 1.450n ± 3% +7.21% (p=0.000 n=20)
GlobalInt64-32 2.083n ± 0% 2.067n ± 2% ~ (p=0.223 n=20)
GlobalInt64Parallel-32 0.1035n ± 1% 0.1044n ± 2% ~ (p=0.010 n=20)
GlobalUint64-32 2.038n ± 1% 2.085n ± 0% +2.28% (p=0.000 n=20)
GlobalUint64Parallel-32 0.1006n ± 1% 0.1008n ± 1% ~ (p=0.733 n=20)
Int64-32 1.687n ± 2% 1.779n ± 1% +5.48% (p=0.000 n=20)
Uint64-32 1.674n ± 2% 1.854n ± 2% +10.69% (p=0.000 n=20)
GlobalIntN1000-32 3.135n ± 1% 3.140n ± 3% ~ (p=0.794 n=20)
IntN1000-32 2.478n ± 1% 2.496n ± 1% +0.73% (p=0.006 n=20)
Int64N1000-32 2.455n ± 1% 2.510n ± 2% +2.22% (p=0.000 n=20)
Int64N1e8-32 2.467n ± 2% 2.471n ± 2% ~ (p=0.050 n=20)
Int64N1e9-32 2.454n ± 1% 2.488n ± 2% +1.39% (p=0.000 n=20)
Int64N2e9-32 2.482n ± 1% 2.478n ± 2% ~ (p=0.066 n=20)
Int64N1e18-32 3.349n ± 2% 3.088n ± 1% -7.81% (p=0.000 n=20)
Int64N2e18-32 3.537n ± 1% 3.493n ± 1% -1.24% (p=0.002 n=20)
Int64N4e18-32 4.917n ± 0% 5.060n ± 2% +2.91% (p=0.000 n=20)
Int32N1000-32 2.386n ± 1% 2.620n ± 1% +9.76% (p=0.000 n=20)
Int32N1e8-32 2.366n ± 1% 2.652n ± 0% +12.11% (p=0.000 n=20)
Int32N1e9-32 2.355n ± 2% 2.644n ± 1% +12.32% (p=0.000 n=20)
Int32N2e9-32 2.371n ± 1% 2.619n ± 2% +10.48% (p=0.000 n=20)
Float32-32 2.245n ± 2% 2.261n ± 1% ~ (p=0.625 n=20)
Float64-32 2.235n ± 1% 2.241n ± 2% ~ (p=0.393 n=20)
ExpFloat64-32 3.813n ± 3% 3.716n ± 1% -2.53% (p=0.000 n=20)
NormFloat64-32 3.652n ± 2% 3.718n ± 1% +1.79% (p=0.006 n=20)
Perm3-32 33.12n ± 3% 34.11n ± 2% ~ (p=0.021 n=20)
Perm30-32 205.1n ± 1% 200.6n ± 0% -2.17% (p=0.000 n=20)
Perm30ViaShuffle-32 110.8n ± 1% 109.7n ± 1% -0.99% (p=0.002 n=20)
ShuffleOverhead-32 113.0n ± 1% 107.2n ± 1% -5.09% (p=0.000 n=20)
Concurrent-32 2.100n ± 0% 2.108n ± 6% ~ (p=0.103 n=20)
goos: darwin
goarch: arm64
pkg: math/rand/v2
│ 01ff938549.arm64 │ afa459a2f0.arm64 │
│ sec/op │ sec/op vs base │
PCG_DXSM-8 2.531n ± 0% 2.531n ± 0% ~ (p=0.763 n=20)
SourceUint64-8 2.258n ± 1% 2.531n ± 0% +12.09% (p=0.000 n=20)
GlobalInt64-8 2.167n ± 0% 2.177n ± 1% ~ (p=0.213 n=20)
GlobalInt64Parallel-8 0.4310n ± 0% 0.4319n ± 0% ~ (p=0.027 n=20)
GlobalUint64-8 2.182n ± 1% 2.185n ± 1% ~ (p=0.683 n=20)
GlobalUint64Parallel-8 0.4297n ± 0% 0.4295n ± 1% ~ (p=0.941 n=20)
Int64-8 2.472n ± 1% 4.104n ± 0% +66.00% (p=0.000 n=20)
Uint64-8 2.449n ± 1% 4.080n ± 0% +66.60% (p=0.000 n=20)
GlobalIntN1000-8 2.814n ± 2% 2.814n ± 1% ~ (p=0.972 n=20)
IntN1000-8 2.998n ± 2% 4.140n ± 0% +38.09% (p=0.000 n=20)
Int64N1000-8 2.949n ± 2% 4.139n ± 0% +40.35% (p=0.000 n=20)
Int64N1e8-8 2.953n ± 2% 4.140n ± 0% +40.22% (p=0.000 n=20)
Int64N1e9-8 2.950n ± 0% 4.139n ± 0% +40.32% (p=0.000 n=20)
Int64N2e9-8 2.946n ± 2% 4.140n ± 0% +40.53% (p=0.000 n=20)
Int64N1e18-8 3.779n ± 1% 5.273n ± 0% +39.52% (p=0.000 n=20)
Int64N2e18-8 4.370n ± 1% 6.059n ± 0% +38.65% (p=0.000 n=20)
Int64N4e18-8 6.544n ± 1% 8.803n ± 0% +34.52% (p=0.000 n=20)
Int32N1000-8 2.950n ± 0% 4.131n ± 0% +40.06% (p=0.000 n=20)
Int32N1e8-8 2.950n ± 2% 4.131n ± 0% +40.03% (p=0.000 n=20)
Int32N1e9-8 2.951n ± 2% 4.131n ± 0% +39.99% (p=0.000 n=20)
Int32N2e9-8 2.950n ± 2% 4.131n ± 0% +40.03% (p=0.000 n=20)
Float32-8 3.441n ± 0% 4.110n ± 0% +19.44% (p=0.000 n=20)
Float64-8 3.442n ± 0% 4.104n ± 0% +19.24% (p=0.000 n=20)
ExpFloat64-8 4.481n ± 0% 5.338n ± 0% +19.11% (p=0.000 n=20)
NormFloat64-8 4.725n ± 0% 5.731n ± 0% +21.28% (p=0.000 n=20)
Perm3-8 26.55n ± 0% 26.62n ± 0% +0.28% (p=0.000 n=20)
Perm30-8 181.9n ± 0% 194.6n ± 2% +6.98% (p=0.000 n=20)
Perm30ViaShuffle-8 142.9n ± 0% 156.4n ± 0% +9.45% (p=0.000 n=20)
ShuffleOverhead-8 120.8n ± 2% 125.8n ± 0% +4.10% (p=0.000 n=20)
Concurrent-8 2.421n ± 6% 2.654n ± 6% +9.67% (p=0.002 n=20)
goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
│ 01ff938549.386 │ afa459a2f0.386 │
│ sec/op │ sec/op vs base │
PCG_DXSM-32 7.613n ± 1% 7.793n ± 2% +2.38% (p=0.000 n=20)
SourceUint64-32 2.069n ± 0% 7.680n ± 1% +271.19% (p=0.000 n=20)
GlobalInt64-32 3.456n ± 1% 3.474n ± 3% ~ (p=0.654 n=20)
GlobalInt64Parallel-32 0.3252n ± 0% 0.3253n ± 0% ~ (p=0.952 n=20)
GlobalUint64-32 3.573n ± 1% 3.433n ± 2% -3.92% (p=0.000 n=20)
GlobalUint64Parallel-32 0.3159n ± 0% 0.3156n ± 0% ~ (p=0.223 n=20)
Int64-32 2.562n ± 2% 7.707n ± 1% +200.74% (p=0.000 n=20)
Uint64-32 2.592n ± 0% 7.714n ± 1% +197.65% (p=0.000 n=20)
GlobalIntN1000-32 6.266n ± 2% 6.236n ± 1% ~ (p=0.039 n=20)
IntN1000-32 4.724n ± 2% 10.410n ± 1% +120.39% (p=0.000 n=20)
Int64N1000-32 5.490n ± 2% 10.975n ± 2% +99.89% (p=0.000 n=20)
Int64N1e8-32 5.513n ± 2% 10.980n ± 1% +99.15% (p=0.000 n=20)
Int64N1e9-32 5.476n ± 1% 10.950n ± 0% +99.96% (p=0.000 n=20)
Int64N2e9-32 5.501n ± 2% 11.110n ± 1% +101.96% (p=0.000 n=20)
Int64N1e18-32 9.043n ± 2% 15.180n ± 2% +67.86% (p=0.000 n=20)
Int64N2e18-32 9.601n ± 2% 15.610n ± 1% +62.60% (p=0.000 n=20)
Int64N4e18-32 12.00n ± 1% 19.23n ± 2% +60.14% (p=0.000 n=20)
Int32N1000-32 4.829n ± 2% 10.345n ± 1% +114.25% (p=0.000 n=20)
Int32N1e8-32 4.825n ± 2% 10.330n ± 1% +114.09% (p=0.000 n=20)
Int32N1e9-32 4.830n ± 2% 10.350n ± 1% +114.26% (p=0.000 n=20)
Int32N2e9-32 4.750n ± 2% 10.345n ± 1% +117.81% (p=0.000 n=20)
Float32-32 10.89n ± 4% 13.57n ± 1% +24.61% (p=0.000 n=20)
Float64-32 19.60n ± 4% 22.95n ± 4% +17.12% (p=0.000 n=20)
ExpFloat64-32 12.96n ± 3% 15.23n ± 2% +17.47% (p=0.000 n=20)
NormFloat64-32 7.516n ± 1% 13.780n ± 1% +83.34% (p=0.000 n=20)
Perm3-32 36.78n ± 2% 46.62n ± 2% +26.72% (p=0.000 n=20)
Perm30-32 238.9n ± 2% 400.7n ± 1% +67.73% (p=0.000 n=20)
Perm30ViaShuffle-32 189.7n ± 2% 350.5n ± 1% +84.79% (p=0.000 n=20)
ShuffleOverhead-32 159.8n ± 1% 326.0n ± 2% +104.01% (p=0.000 n=20)
Concurrent-32 3.286n ± 1% 3.290n ± 0% ~ (p=0.743 n=20)
On the other hand, compared to the original "update benchmarks" CL, the cleanups we've made more than compensate for PCG being a bit slower than LFSR, at least on 64-bit x86. ARM64 (Apple M1) is a bit slower: perhaps the 64x64→128 multiply is slower there for some reason. 386 is noticeably slower, but it's also a non-SSA backend.
goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
│ 220860f76f.amd64 │ afa459a2f0.amd64 │
│ sec/op │ sec/op vs base │
SourceUint64-32 1.555n ± 1% 1.450n ± 3% -6.78% (p=0.000 n=20)
GlobalInt64-32 2.071n ± 1% 2.067n ± 2% ~ (p=0.673 n=20)
GlobalInt63Parallel-32 0.1023n ± 1%
GlobalInt64Parallel-32 0.1044n ± 2%
GlobalUint64-32 5.193n ± 1% 2.085n ± 0% -59.86% (p=0.000 n=20)
GlobalUint64Parallel-32 0.2341n ± 0% 0.1008n ± 1% -56.93% (p=0.000 n=20)
Int64-32 2.056n ± 2% 1.779n ± 1% -13.47% (p=0.000 n=20)
Uint64-32 2.077n ± 2% 1.854n ± 2% -10.74% (p=0.000 n=20)
GlobalIntN1000-32 4.077n ± 2% 3.140n ± 3% -22.98% (p=0.000 n=20)
IntN1000-32 3.476n ± 2% 2.496n ± 1% -28.19% (p=0.000 n=20)
Int64N1000-32 3.059n ± 1% 2.510n ± 2% -17.96% (p=0.000 n=20)
Int64N1e8-32 2.942n ± 1% 2.471n ± 2% -15.98% (p=0.000 n=20)
Int64N1e9-32 2.932n ± 1% 2.488n ± 2% -15.14% (p=0.000 n=20)
Int64N2e9-32 2.925n ± 1% 2.478n ± 2% -15.30% (p=0.000 n=20)
Int64N1e18-32 3.116n ± 1% 3.088n ± 1% ~ (p=0.013 n=20)
Int64N2e18-32 4.067n ± 1% 3.493n ± 1% -14.11% (p=0.000 n=20)
Int64N4e18-32 4.054n ± 1% 5.060n ± 2% +24.80% (p=0.000 n=20)
Int32N1000-32 2.951n ± 1% 2.620n ± 1% -11.22% (p=0.000 n=20)
Int32N1e8-32 3.102n ± 1% 2.652n ± 0% -14.50% (p=0.000 n=20)
Int32N1e9-32 3.535n ± 1% 2.644n ± 1% -25.20% (p=0.000 n=20)
Int32N2e9-32 3.514n ± 1% 2.619n ± 2% -25.47% (p=0.000 n=20)
Float32-32 2.760n ± 1% 2.261n ± 1% -18.06% (p=0.000 n=20)
Float64-32 2.284n ± 1% 2.241n ± 2% ~ (p=0.016 n=20)
ExpFloat64-32 3.757n ± 1% 3.716n ± 1% ~ (p=0.034 n=20)
NormFloat64-32 3.837n ± 1% 3.718n ± 1% -3.09% (p=0.000 n=20)
Perm3-32 35.23n ± 2% 34.11n ± 2% -3.19% (p=0.000 n=20)
Perm30-32 208.8n ± 1% 200.6n ± 0% -3.93% (p=0.000 n=20)
Perm30ViaShuffle-32 111.7n ± 1% 109.7n ± 1% -1.84% (p=0.000 n=20)
ShuffleOverhead-32 101.1n ± 1% 107.2n ± 1% +6.03% (p=0.000 n=20)
Concurrent-32 2.108n ± 7% 2.108n ± 6% ~ (p=0.644 n=20)
PCG_DXSM-32 1.488n ± 2%
goos: darwin
goarch: arm64
pkg: math/rand/v2
cpu: Apple M1
│ 220860f76f.arm64 │ afa459a2f0.arm64 │
│ sec/op │ sec/op vs base │
SourceUint64-8 2.316n ± 1% 2.531n ± 0% +9.33% (p=0.000 n=20)
GlobalInt64-8 2.183n ± 1% 2.177n ± 1% ~ (p=0.533 n=20)
GlobalInt63Parallel-8 0.4331n ± 0%
GlobalInt64Parallel-8 0.4319n ± 0%
GlobalUint64-8 4.377n ± 2% 2.185n ± 1% -50.07% (p=0.000 n=20)
GlobalUint64Parallel-8 0.9237n ± 0% 0.4295n ± 1% -53.50% (p=0.000 n=20)
Int64-8 2.538n ± 1% 4.104n ± 0% +61.68% (p=0.000 n=20)
Uint64-8 2.604n ± 1% 4.080n ± 0% +56.68% (p=0.000 n=20)
GlobalIntN1000-8 3.857n ± 2% 2.814n ± 1% -27.04% (p=0.000 n=20)
IntN1000-8 3.822n ± 2% 4.140n ± 0% +8.32% (p=0.000 n=20)
Int64N1000-8 3.318n ± 0% 4.139n ± 0% +24.74% (p=0.000 n=20)
Int64N1e8-8 3.349n ± 1% 4.140n ± 0% +23.64% (p=0.000 n=20)
Int64N1e9-8 3.317n ± 2% 4.139n ± 0% +24.80% (p=0.000 n=20)
Int64N2e9-8 3.317n ± 2% 4.140n ± 0% +24.81% (p=0.000 n=20)
Int64N1e18-8 3.542n ± 1% 5.273n ± 0% +48.85% (p=0.000 n=20)
Int64N2e18-8 5.087n ± 0% 6.059n ± 0% +19.12% (p=0.000 n=20)
Int64N4e18-8 5.084n ± 0% 8.803n ± 0% +73.16% (p=0.000 n=20)
Int32N1000-8 3.208n ± 2% 4.131n ± 0% +28.79% (p=0.000 n=20)
Int32N1e8-8 3.610n ± 1% 4.131n ± 0% +14.43% (p=0.000 n=20)
Int32N1e9-8 4.235n ± 0% 4.131n ± 0% -2.44% (p=0.000 n=20)
Int32N2e9-8 4.229n ± 1% 4.131n ± 0% -2.33% (p=0.000 n=20)
Float32-8 3.468n ± 0% 4.110n ± 0% +18.50% (p=0.000 n=20)
Float64-8 3.447n ± 0% 4.104n ± 0% +19.05% (p=0.000 n=20)
ExpFloat64-8 4.567n ± 0% 5.338n ± 0% +16.86% (p=0.000 n=20)
NormFloat64-8 4.821n ± 0% 5.731n ± 0% +18.89% (p=0.000 n=20)
Perm3-8 28.89n ± 0% 26.62n ± 0% -7.84% (p=0.000 n=20)
Perm30-8 175.7n ± 0% 194.6n ± 2% +10.76% (p=0.000 n=20)
Perm30ViaShuffle-8 153.5n ± 0% 156.4n ± 0% +1.86% (p=0.000 n=20)
ShuffleOverhead-8 119.8n ± 1% 125.8n ± 0% +4.97% (p=0.000 n=20)
Concurrent-8 2.433n ± 3% 2.654n ± 6% +9.13% (p=0.001 n=20)
PCG_DXSM-8 2.531n ± 0%
goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
│ 220860f76f.386 │ afa459a2f0.386 │
│ sec/op │ sec/op vs base │
SourceUint64-32 2.370n ± 1% 7.680n ± 1% +224.05% (p=0.000 n=20)
GlobalInt64-32 3.569n ± 1% 3.474n ± 3% -2.66% (p=0.001 n=20)
GlobalInt63Parallel-32 0.3221n ± 1%
GlobalInt64Parallel-32 0.3253n ± 0%
GlobalUint64-32 8.797n ± 10% 3.433n ± 2% -60.98% (p=0.000 n=20)
GlobalUint64Parallel-32 0.6351n ± 0% 0.3156n ± 0% -50.31% (p=0.000 n=20)
Int64-32 2.612n ± 2% 7.707n ± 1% +195.04% (p=0.000 n=20)
Uint64-32 3.350n ± 1% 7.714n ± 1% +130.25% (p=0.000 n=20)
GlobalIntN1000-32 5.892n ± 1% 6.236n ± 1% +5.82% (p=0.000 n=20)
IntN1000-32 4.546n ± 1% 10.410n ± 1% +128.97% (p=0.000 n=20)
Int64N1000-32 14.59n ± 1% 10.97n ± 2% -24.75% (p=0.000 n=20)
Int64N1e8-32 14.76n ± 2% 10.98n ± 1% -25.58% (p=0.000 n=20)
Int64N1e9-32 16.57n ± 1% 10.95n ± 0% -33.90% (p=0.000 n=20)
Int64N2e9-32 14.54n ± 1% 11.11n ± 1% -23.62% (p=0.000 n=20)
Int64N1e18-32 16.14n ± 1% 15.18n ± 2% -5.95% (p=0.000 n=20)
Int64N2e18-32 18.10n ± 1% 15.61n ± 1% -13.73% (p=0.000 n=20)
Int64N4e18-32 18.65n ± 1% 19.23n ± 2% +3.08% (p=0.000 n=20)
Int32N1000-32 3.560n ± 1% 10.345n ± 1% +190.55% (p=0.000 n=20)
Int32N1e8-32 3.770n ± 2% 10.330n ± 1% +174.01% (p=0.000 n=20)
Int32N1e9-32 4.098n ± 0% 10.350n ± 1% +152.53% (p=0.000 n=20)
Int32N2e9-32 4.179n ± 1% 10.345n ± 1% +147.52% (p=0.000 n=20)
Float32-32 21.18n ± 4% 13.57n ± 1% -35.93% (p=0.000 n=20)
Float64-32 20.60n ± 2% 22.95n ± 4% +11.41% (p=0.000 n=20)
ExpFloat64-32 13.07n ± 0% 15.23n ± 2% +16.48% (p=0.000 n=20)
NormFloat64-32 7.738n ± 2% 13.780n ± 1% +78.08% (p=0.000 n=20)
Perm3-32 36.73n ± 1% 46.62n ± 2% +26.91% (p=0.000 n=20)
Perm30-32 211.9n ± 1% 400.7n ± 1% +89.05% (p=0.000 n=20)
Perm30ViaShuffle-32 165.2n ± 1% 350.5n ± 1% +112.20% (p=0.000 n=20)
ShuffleOverhead-32 133.9n ± 1% 326.0n ± 2% +143.37% (p=0.000 n=20)
Concurrent-32 3.287n ± 2% 3.290n ± 0% ~ (p=0.365 n=20)
PCG_DXSM-32 7.793n ± 2%
For #61716. Change-Id: I4e9c0525b5f84a2ac46f23da9e365495e2d05777 Reviewed-on: https://go-review.googlesource.com/c/go/+/502506 Reviewed-by: Rob Pike <r@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Auto-Submit: Russ Cox <rsc@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
||
Russ Cox
|
8631fcbf31 |
math/rand/v2: add PCG-DXSM
For the original math/rand, we ported Plan 9's random number generator, which was a refinement by Ken Thompson of an algorithm by Don Mitchell and Jim Reeds, which Mitchell in turn recalls as having been derived from an algorithm by Marsaglia. At its core, it is an additive lagged Fibonacci generator (ALFG).

Whatever the details of the history, this generator is nowhere near the current state of the art for simple, pseudo-random generators.

This CL adds an implementation of Melissa O'Neill's PCG, specifically the variant PCG-DXSM, which she defined after writing the PCG paper and which is now the default in Numpy. The update is slightly slower (a few multiplies and adds, instead of a few adds), but the state is dramatically smaller (2 words instead of 607). The statistical output properties are better too.

A followup CL will delete the old generator.

PCG is the only change here, so no benchmarks should be affected. Including them anyway as further evidence for caution.

goos: linux goarch: amd64 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ 8993506f2f.amd64 │ 01ff938549.amd64 │ │ sec/op │ sec/op vs base │ SourceUint64-32 1.325n ± 1% 1.352n ± 1% +2.00% (p=0.000 n=20) GlobalInt64-32 2.240n ± 1% 2.083n ± 0% -7.03% (p=0.000 n=20) GlobalInt64Parallel-32 0.1041n ± 1% 0.1035n ± 1% ~ (p=0.064 n=20) GlobalUint64-32 2.072n ± 3% 2.038n ± 1% ~ (p=0.089 n=20) GlobalUint64Parallel-32 0.1008n ± 1% 0.1006n ± 1% ~ (p=0.804 n=20) Int64-32 1.716n ± 1% 1.687n ± 2% ~ (p=0.045 n=20) Uint64-32 1.665n ± 1% 1.674n ± 2% ~ (p=0.878 n=20) GlobalIntN1000-32 3.335n ± 1% 3.135n ± 1% -6.00% (p=0.000 n=20) IntN1000-32 2.484n ± 1% 2.478n ± 1% ~ (p=0.085 n=20) Int64N1000-32 2.502n ± 2% 2.455n ± 1% -1.88% (p=0.002 n=20) Int64N1e8-32 2.484n ± 2% 2.467n ± 2% ~ (p=0.048 n=20) Int64N1e9-32 2.502n ± 0% 2.454n ± 1% -1.92% (p=0.000 n=20) Int64N2e9-32 2.502n ± 0% 2.482n ± 1% -0.76% (p=0.000 n=20) Int64N1e18-32 3.201n ± 1% 3.349n ± 2% +4.62% (p=0.000 n=20) Int64N2e18-32 3.504n ± 1% 3.537n ± 1% ~ 
(p=0.185 n=20) Int64N4e18-32 4.873n ± 1% 4.917n ± 0% +0.90% (p=0.000 n=20) Int32N1000-32 2.639n ± 1% 2.386n ± 1% -9.57% (p=0.000 n=20) Int32N1e8-32 2.686n ± 2% 2.366n ± 1% -11.91% (p=0.000 n=20) Int32N1e9-32 2.636n ± 1% 2.355n ± 2% -10.70% (p=0.000 n=20) Int32N2e9-32 2.660n ± 1% 2.371n ± 1% -10.88% (p=0.000 n=20) Float32-32 2.261n ± 1% 2.245n ± 2% ~ (p=0.752 n=20) Float64-32 2.280n ± 1% 2.235n ± 1% -1.97% (p=0.007 n=20) ExpFloat64-32 3.891n ± 1% 3.813n ± 3% ~ (p=0.087 n=20) NormFloat64-32 3.711n ± 1% 3.652n ± 2% ~ (p=0.021 n=20) Perm3-32 32.60n ± 2% 33.12n ± 3% ~ (p=0.107 n=20) Perm30-32 204.2n ± 0% 205.1n ± 1% ~ (p=0.358 n=20) Perm30ViaShuffle-32 121.7n ± 2% 110.8n ± 1% -8.96% (p=0.000 n=20) ShuffleOverhead-32 106.2n ± 2% 113.0n ± 1% +6.36% (p=0.000 n=20) Concurrent-32 2.190n ± 5% 2.100n ± 0% -4.13% (p=0.001 n=20) PCG_DXSM-32 1.490n ± 0% goos: darwin goarch: arm64 pkg: math/rand/v2 cpu: Apple M1 │ 8993506f2f.arm64 │ 01ff938549.arm64 │ │ sec/op │ sec/op vs base │ SourceUint64-8 2.271n ± 0% 2.258n ± 1% ~ (p=0.167 n=20) GlobalInt64-8 2.161n ± 1% 2.167n ± 0% ~ (p=0.693 n=20) GlobalInt64Parallel-8 0.4303n ± 0% 0.4310n ± 0% ~ (p=0.051 n=20) GlobalUint64-8 2.164n ± 1% 2.182n ± 1% ~ (p=0.042 n=20) GlobalUint64Parallel-8 0.4287n ± 0% 0.4297n ± 0% ~ (p=0.082 n=20) Int64-8 2.478n ± 1% 2.472n ± 1% ~ (p=0.151 n=20) Uint64-8 2.460n ± 1% 2.449n ± 1% ~ (p=0.013 n=20) GlobalIntN1000-8 2.814n ± 2% 2.814n ± 2% ~ (p=0.821 n=20) IntN1000-8 3.003n ± 2% 2.998n ± 2% ~ (p=0.024 n=20) Int64N1000-8 2.954n ± 0% 2.949n ± 2% ~ (p=0.192 n=20) Int64N1e8-8 2.956n ± 0% 2.953n ± 2% ~ (p=0.109 n=20) Int64N1e9-8 3.325n ± 0% 2.950n ± 0% -11.26% (p=0.000 n=20) Int64N2e9-8 2.956n ± 2% 2.946n ± 2% ~ (p=0.027 n=20) Int64N1e18-8 3.780n ± 1% 3.779n ± 1% ~ (p=0.815 n=20) Int64N2e18-8 4.385n ± 0% 4.370n ± 1% ~ (p=0.402 n=20) Int64N4e18-8 6.527n ± 0% 6.544n ± 1% ~ (p=0.140 n=20) Int32N1000-8 2.964n ± 1% 2.950n ± 0% -0.47% (p=0.002 n=20) Int32N1e8-8 2.964n ± 1% 2.950n ± 2% ~ (p=0.013 n=20) Int32N1e9-8 2.963n ± 
2% 2.951n ± 2% ~ (p=0.062 n=20) Int32N2e9-8 2.961n ± 2% 2.950n ± 2% -0.37% (p=0.002 n=20) Float32-8 3.442n ± 0% 3.441n ± 0% ~ (p=0.211 n=20) Float64-8 3.442n ± 0% 3.442n ± 0% ~ (p=0.067 n=20) ExpFloat64-8 4.472n ± 0% 4.481n ± 0% +0.20% (p=0.000 n=20) NormFloat64-8 4.734n ± 0% 4.725n ± 0% -0.19% (p=0.003 n=20) Perm3-8 26.55n ± 0% 26.55n ± 0% ~ (p=0.833 n=20) Perm30-8 181.9n ± 0% 181.9n ± 0% -0.03% (p=0.004 n=20) Perm30ViaShuffle-8 143.1n ± 0% 142.9n ± 0% ~ (p=0.204 n=20) ShuffleOverhead-8 120.6n ± 1% 120.8n ± 2% ~ (p=0.102 n=20) Concurrent-8 2.357n ± 2% 2.421n ± 6% ~ (p=0.016 n=20) PCG_DXSM-8 2.531n ± 0% goos: linux goarch: 386 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ 8993506f2f.386 │ 01ff938549.386 │ │ sec/op │ sec/op vs base │ SourceUint64-32 2.102n ± 2% 2.069n ± 0% ~ (p=0.021 n=20) GlobalInt64-32 3.542n ± 2% 3.456n ± 1% -2.44% (p=0.001 n=20) GlobalInt64Parallel-32 0.3202n ± 0% 0.3252n ± 0% +1.56% (p=0.000 n=20) GlobalUint64-32 3.507n ± 1% 3.573n ± 1% +1.87% (p=0.000 n=20) GlobalUint64Parallel-32 0.3170n ± 1% 0.3159n ± 0% ~ (p=0.167 n=20) Int64-32 2.516n ± 1% 2.562n ± 2% ~ (p=0.016 n=20) Uint64-32 2.544n ± 1% 2.592n ± 0% +1.85% (p=0.000 n=20) GlobalIntN1000-32 6.237n ± 1% 6.266n ± 2% ~ (p=0.268 n=20) IntN1000-32 4.670n ± 2% 4.724n ± 2% ~ (p=0.644 n=20) Int64N1000-32 5.412n ± 1% 5.490n ± 2% ~ (p=0.159 n=20) Int64N1e8-32 5.414n ± 2% 5.513n ± 2% ~ (p=0.129 n=20) Int64N1e9-32 5.473n ± 1% 5.476n ± 1% ~ (p=0.723 n=20) Int64N2e9-32 5.487n ± 1% 5.501n ± 2% ~ (p=0.481 n=20) Int64N1e18-32 8.901n ± 2% 9.043n ± 2% ~ (p=0.330 n=20) Int64N2e18-32 9.521n ± 1% 9.601n ± 2% ~ (p=0.703 n=20) Int64N4e18-32 11.92n ± 1% 12.00n ± 1% ~ (p=0.489 n=20) Int32N1000-32 4.785n ± 1% 4.829n ± 2% ~ (p=0.402 n=20) Int32N1e8-32 4.748n ± 1% 4.825n ± 2% ~ (p=0.218 n=20) Int32N1e9-32 4.810n ± 1% 4.830n ± 2% ~ (p=0.794 n=20) Int32N2e9-32 4.812n ± 1% 4.750n ± 2% ~ (p=0.057 n=20) Float32-32 10.48n ± 4% 10.89n ± 4% ~ (p=0.162 n=20) Float64-32 19.79n ± 3% 19.60n ± 4% ~ (p=0.668 n=20) 
ExpFloat64-32 12.91n ± 3% 12.96n ± 3% ~ (p=1.000 n=20) NormFloat64-32 7.462n ± 1% 7.516n ± 1% ~ (p=0.051 n=20) Perm3-32 35.98n ± 2% 36.78n ± 2% ~ (p=0.033 n=20) Perm30-32 241.5n ± 1% 238.9n ± 2% ~ (p=0.126 n=20) Perm30ViaShuffle-32 187.3n ± 2% 189.7n ± 2% ~ (p=0.387 n=20) ShuffleOverhead-32 160.2n ± 1% 159.8n ± 1% ~ (p=0.256 n=20) Concurrent-32 3.308n ± 3% 3.286n ± 1% ~ (p=0.038 n=20) PCG_DXSM-32 7.613n ± 1% For #61716. Change-Id: Icb274ca1f782504d658305a40159b4ae6a2f3f1d Reviewed-on: https://go-review.googlesource.com/c/go/+/502505 Auto-Submit: Russ Cox <rsc@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Rob Pike <r@golang.org> |
||
Russ Cox
|
f2e2637227 |
math/rand/v2: simplify Perm
The compiler says Perm is being inlined into BenchmarkPerm, and yet BenchmarkPerm30ViaShuffle, which you'd think is the same code, still runs significantly faster. The benchmarks are mystifying but this is clearly still a step in the right direction, since BenchmarkPerm30ViaShuffle is still the fastest and we avoid having two copies of that logic. goos: linux goarch: amd64 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ e1bbe739fb.amd64 │ 8993506f2f.amd64 │ │ sec/op │ sec/op vs base │ SourceUint64-32 1.316n ± 2% 1.325n ± 1% ~ (p=0.208 n=20) GlobalInt64-32 2.048n ± 1% 2.240n ± 1% +9.38% (p=0.000 n=20) GlobalInt64Parallel-32 0.1037n ± 1% 0.1041n ± 1% ~ (p=0.774 n=20) GlobalUint64-32 2.039n ± 2% 2.072n ± 3% ~ (p=0.115 n=20) GlobalUint64Parallel-32 0.1013n ± 1% 0.1008n ± 1% ~ (p=0.417 n=20) Int64-32 1.692n ± 2% 1.716n ± 1% ~ (p=0.122 n=20) Uint64-32 1.643n ± 2% 1.665n ± 1% ~ (p=0.062 n=20) GlobalIntN1000-32 3.287n ± 1% 3.335n ± 1% ~ (p=0.147 n=20) IntN1000-32 2.678n ± 2% 2.484n ± 1% -7.24% (p=0.000 n=20) Int64N1000-32 2.684n ± 2% 2.502n ± 2% -6.80% (p=0.000 n=20) Int64N1e8-32 2.663n ± 2% 2.484n ± 2% -6.76% (p=0.000 n=20) Int64N1e9-32 2.633n ± 1% 2.502n ± 0% -4.98% (p=0.000 n=20) Int64N2e9-32 2.657n ± 1% 2.502n ± 0% -5.87% (p=0.000 n=20) Int64N1e18-32 3.125n ± 2% 3.201n ± 1% +2.43% (p=0.000 n=20) Int64N2e18-32 3.476n ± 1% 3.504n ± 1% +0.83% (p=0.009 n=20) Int64N4e18-32 4.795n ± 1% 4.873n ± 1% ~ (p=0.106 n=20) Int32N1000-32 2.485n ± 2% 2.639n ± 1% +6.20% (p=0.000 n=20) Int32N1e8-32 2.457n ± 1% 2.686n ± 2% +9.34% (p=0.000 n=20) Int32N1e9-32 2.452n ± 1% 2.636n ± 1% +7.52% (p=0.000 n=20) Int32N2e9-32 2.453n ± 1% 2.660n ± 1% +8.44% (p=0.000 n=20) Float32-32 2.254n ± 1% 2.261n ± 1% ~ (p=0.888 n=20) Float64-32 2.262n ± 1% 2.280n ± 1% ~ (p=0.040 n=20) ExpFloat64-32 3.777n ± 2% 3.891n ± 1% +3.03% (p=0.000 n=20) NormFloat64-32 3.606n ± 1% 3.711n ± 1% +2.91% (p=0.000 n=20) Perm3-32 33.12n ± 2% 32.60n ± 2% ~ (p=0.045 n=20) Perm30-32 176.1n ± 1% 204.2n ± 0% +15.96% 
(p=0.000 n=20) Perm30ViaShuffle-32 109.3n ± 1% 121.7n ± 2% +11.30% (p=0.000 n=20) ShuffleOverhead-32 112.5n ± 1% 106.2n ± 2% -5.56% (p=0.000 n=20) Concurrent-32 2.099n ± 0% 2.190n ± 5% +4.36% (p=0.001 n=20) goos: darwin goarch: arm64 pkg: math/rand/v2 cpu: Apple M1 │ e1bbe739fb.arm64 │ 8993506f2f.arm64 │ │ sec/op │ sec/op vs base │ SourceUint64-8 2.290n ± 1% 2.271n ± 0% ~ (p=0.015 n=20) GlobalInt64-8 2.180n ± 1% 2.161n ± 1% ~ (p=0.180 n=20) GlobalInt64Parallel-8 0.4294n ± 0% 0.4303n ± 0% +0.19% (p=0.001 n=20) GlobalUint64-8 2.170n ± 1% 2.164n ± 1% ~ (p=0.673 n=20) GlobalUint64Parallel-8 0.4283n ± 0% 0.4287n ± 0% ~ (p=0.128 n=20) Int64-8 2.481n ± 1% 2.478n ± 1% ~ (p=0.867 n=20) Uint64-8 2.464n ± 1% 2.460n ± 1% ~ (p=0.763 n=20) GlobalIntN1000-8 2.814n ± 0% 2.814n ± 2% ~ (p=0.969 n=20) IntN1000-8 2.934n ± 2% 3.003n ± 2% +2.35% (p=0.000 n=20) Int64N1000-8 2.957n ± 1% 2.954n ± 0% ~ (p=0.285 n=20) Int64N1e8-8 2.935n ± 2% 2.956n ± 0% +0.73% (p=0.002 n=20) Int64N1e9-8 2.935n ± 2% 3.325n ± 0% +13.29% (p=0.000 n=20) Int64N2e9-8 2.933n ± 4% 2.956n ± 2% ~ (p=0.163 n=20) Int64N1e18-8 3.781n ± 1% 3.780n ± 1% ~ (p=0.805 n=20) Int64N2e18-8 4.362n ± 0% 4.385n ± 0% ~ (p=0.077 n=20) Int64N4e18-8 6.576n ± 1% 6.527n ± 0% ~ (p=0.024 n=20) Int32N1000-8 2.942n ± 2% 2.964n ± 1% ~ (p=0.073 n=20) Int32N1e8-8 2.941n ± 1% 2.964n ± 1% ~ (p=0.058 n=20) Int32N1e9-8 2.938n ± 2% 2.963n ± 2% +0.87% (p=0.003 n=20) Int32N2e9-8 2.982n ± 2% 2.961n ± 2% ~ (p=0.056 n=20) Float32-8 3.441n ± 0% 3.442n ± 0% ~ (p=0.030 n=20) Float64-8 3.441n ± 0% 3.442n ± 0% +0.03% (p=0.001 n=20) ExpFloat64-8 4.472n ± 0% 4.472n ± 0% ~ (p=0.877 n=20) NormFloat64-8 4.716n ± 0% 4.734n ± 0% +0.38% (p=0.000 n=20) Perm3-8 26.66n ± 0% 26.55n ± 0% -0.39% (p=0.000 n=20) Perm30-8 143.3n ± 0% 181.9n ± 0% +26.97% (p=0.000 n=20) Perm30ViaShuffle-8 142.9n ± 0% 143.1n ± 0% ~ (p=0.669 n=20) ShuffleOverhead-8 121.1n ± 1% 120.6n ± 1% -0.41% (p=0.004 n=20) Concurrent-8 2.379n ± 2% 2.357n ± 2% ~ (p=0.337 n=20) goos: linux goarch: 386 pkg: 
math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ e1bbe739fb.386 │ 8993506f2f.386 │ │ sec/op │ sec/op vs base │ SourceUint64-32 2.087n ± 1% 2.102n ± 2% ~ (p=0.507 n=20) GlobalInt64-32 3.538n ± 2% 3.542n ± 2% ~ (p=0.425 n=20) GlobalInt64Parallel-32 0.3207n ± 1% 0.3202n ± 0% ~ (p=0.963 n=20) GlobalUint64-32 3.543n ± 1% 3.507n ± 1% ~ (p=0.034 n=20) GlobalUint64Parallel-32 0.3170n ± 0% 0.3170n ± 1% ~ (p=0.920 n=20) Int64-32 2.548n ± 1% 2.516n ± 1% ~ (p=0.139 n=20) Uint64-32 2.565n ± 2% 2.544n ± 1% ~ (p=0.394 n=20) GlobalIntN1000-32 6.300n ± 1% 6.237n ± 1% ~ (p=0.029 n=20) IntN1000-32 4.750n ± 0% 4.670n ± 2% ~ (p=0.034 n=20) Int64N1000-32 5.515n ± 2% 5.412n ± 1% -1.86% (p=0.009 n=20) Int64N1e8-32 5.527n ± 0% 5.414n ± 2% -2.05% (p=0.002 n=20) Int64N1e9-32 5.531n ± 2% 5.473n ± 1% ~ (p=0.047 n=20) Int64N2e9-32 5.514n ± 2% 5.487n ± 1% ~ (p=0.298 n=20) Int64N1e18-32 9.059n ± 1% 8.901n ± 2% ~ (p=0.037 n=20) Int64N2e18-32 9.594n ± 1% 9.521n ± 1% ~ (p=0.051 n=20) Int64N4e18-32 12.05n ± 2% 11.92n ± 1% ~ (p=0.357 n=20) Int32N1000-32 4.840n ± 2% 4.785n ± 1% ~ (p=0.189 n=20) Int32N1e8-32 4.832n ± 2% 4.748n ± 1% ~ (p=0.042 n=20) Int32N1e9-32 4.815n ± 2% 4.810n ± 1% ~ (p=0.878 n=20) Int32N2e9-32 4.813n ± 1% 4.812n ± 1% ~ (p=0.542 n=20) Float32-32 10.90n ± 2% 10.48n ± 4% -3.85% (p=0.007 n=20) Float64-32 20.32n ± 4% 19.79n ± 3% ~ (p=0.553 n=20) ExpFloat64-32 12.95n ± 3% 12.91n ± 3% ~ (p=0.909 n=20) NormFloat64-32 7.570n ± 1% 7.462n ± 1% -1.44% (p=0.004 n=20) Perm3-32 37.80n ± 2% 35.98n ± 2% -4.79% (p=0.000 n=20) Perm30-32 214.0n ± 1% 241.5n ± 1% +12.85% (p=0.000 n=20) Perm30ViaShuffle-32 188.7n ± 2% 187.3n ± 2% ~ (p=0.029 n=20) ShuffleOverhead-32 160.8n ± 1% 160.2n ± 1% ~ (p=0.180 n=20) Concurrent-32 3.288n ± 0% 3.308n ± 3% ~ (p=0.037 n=20) For #61716. 
Change-Id: I342b611456c3569520d3c91c849d29eba325d87e Reviewed-on: https://go-review.googlesource.com/c/go/+/502504 Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Auto-Submit: Russ Cox <rsc@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Rob Pike <r@golang.org> |
||
Branden Brown
|
488e2a56b9 |
math/rand/v2: remove bias in ExpFloat64 and NormFloat64
The original implementation of the ziggurat algorithm was designed for 32-bit random integer inputs. This necessitated reusing some low-order bits for the slice selection and the random coordinate, which introduces statistical bias. The result is that PractRand consistently fails the math/rand normal and exponential sequences (transformed to uniform) within 2 GB of variates.

This change adjusts the ziggurat procedures to use 63-bit random inputs, so that there is no need to reuse bits between the slice and coordinate. This is sufficient for the normal sequence to survive to 256 GB of PractRand testing.

An alternative technique is to recalculate the ziggurats to use 1024 rather than 128 or 256 slices to make full use of 64-bit inputs. This improves the survival of the normal sequence to far beyond 256 GB and additionally provides a 6% performance improvement due to the improved rejection procedure efficiency. However, doing so increases the total size of the ziggurat tables from 4.5 kB to 48 kB.
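The bit-reuse problem described in this message can be illustrated with a small sketch. It is not the code from this CL: the helper names splitReused and splitDisjoint are hypothetical, a 128-slice table (7 index bits) is assumed, and the wider input is modeled as a plain uint64 word. It only shows why drawing the slice index and the coordinate from overlapping bits ties the two together, and how a wider input lets them come from disjoint bit fields.

```go
package main

import "fmt"

// splitReused models a 32-bit input: the slice index is the low 7 bits of
// the same word that is also used, in full, as the coordinate.
// Hypothetical helper for illustration only.
func splitReused(u uint32) (slice, coord uint32) {
	return u & 0x7F, u
}

// splitDisjoint models a wider input: the coordinate is the low 32 bits and
// the slice index comes from bits 32..38, so no bit is used twice.
// Hypothetical helper for illustration only.
func splitDisjoint(u uint64) (slice, coord uint32) {
	return uint32(u>>32) & 0x7F, uint32(u)
}

func main() {
	u := uint64(5)<<32 | 0xA3

	s1, c1 := splitReused(uint32(u))
	// With reused bits, the coordinate's low 7 bits always equal the slice
	// index, so the two values are not independent.
	fmt.Println(s1 == c1&0x7F)

	s2, c2 := splitDisjoint(u)
	// With disjoint fields, the slice index says nothing about the
	// coordinate's low bits.
	fmt.Println(s2, c2&0x7F)
}
```

In the reused case, knowing the slice fixes seven bits of the coordinate, which is the kind of dependence a statistical test suite can eventually detect.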
goos: linux goarch: amd64 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ 2703446c2e.amd64 │ e1bbe739fb.amd64 │ │ sec/op │ sec/op vs base │ SourceUint64-32 1.337n ± 1% 1.316n ± 2% ~ (p=0.024 n=20) GlobalInt64-32 2.225n ± 2% 2.048n ± 1% -7.93% (p=0.000 n=20) GlobalInt64Parallel-32 0.1043n ± 2% 0.1037n ± 1% ~ (p=0.587 n=20) GlobalUint64-32 2.058n ± 1% 2.039n ± 2% ~ (p=0.030 n=20) GlobalUint64Parallel-32 0.1009n ± 1% 0.1013n ± 1% ~ (p=0.984 n=20) Int64-32 1.719n ± 2% 1.692n ± 2% ~ (p=0.085 n=20) Uint64-32 1.669n ± 1% 1.643n ± 2% ~ (p=0.049 n=20) GlobalIntN1000-32 3.321n ± 2% 3.287n ± 1% ~ (p=0.298 n=20) IntN1000-32 2.479n ± 1% 2.678n ± 2% +8.01% (p=0.000 n=20) Int64N1000-32 2.477n ± 1% 2.684n ± 2% +8.38% (p=0.000 n=20) Int64N1e8-32 2.490n ± 1% 2.663n ± 2% +6.99% (p=0.000 n=20) Int64N1e9-32 2.458n ± 1% 2.633n ± 1% +7.12% (p=0.000 n=20) Int64N2e9-32 2.486n ± 2% 2.657n ± 1% +6.90% (p=0.000 n=20) Int64N1e18-32 3.215n ± 2% 3.125n ± 2% -2.78% (p=0.000 n=20) Int64N2e18-32 3.588n ± 2% 3.476n ± 1% -3.15% (p=0.000 n=20) Int64N4e18-32 4.938n ± 2% 4.795n ± 1% -2.91% (p=0.000 n=20) Int32N1000-32 2.673n ± 2% 2.485n ± 2% -7.02% (p=0.000 n=20) Int32N1e8-32 2.631n ± 2% 2.457n ± 1% -6.63% (p=0.000 n=20) Int32N1e9-32 2.628n ± 2% 2.452n ± 1% -6.70% (p=0.000 n=20) Int32N2e9-32 2.684n ± 2% 2.453n ± 1% -8.61% (p=0.000 n=20) Float32-32 2.240n ± 2% 2.254n ± 1% ~ (p=0.878 n=20) Float64-32 2.253n ± 1% 2.262n ± 1% ~ (p=0.963 n=20) ExpFloat64-32 3.677n ± 1% 3.777n ± 2% +2.71% (p=0.004 n=20) NormFloat64-32 3.761n ± 1% 3.606n ± 1% -4.15% (p=0.000 n=20) Perm3-32 33.55n ± 2% 33.12n ± 2% ~ (p=0.402 n=20) Perm30-32 173.2n ± 1% 176.1n ± 1% +1.67% (p=0.000 n=20) Perm30ViaShuffle-32 115.9n ± 1% 109.3n ± 1% -5.69% (p=0.000 n=20) ShuffleOverhead-32 101.9n ± 1% 112.5n ± 1% +10.35% (p=0.000 n=20) Concurrent-32 2.107n ± 6% 2.099n ± 0% ~ (p=0.051 n=20) goos: darwin goarch: arm64 pkg: math/rand/v2 cpu: Apple M1 │ 2703446c2e.arm64 │ e1bbe739fb.arm64 │ │ sec/op │ sec/op vs base │ SourceUint64-8 2.275n 
± 0% 2.290n ± 1% ~ (p=0.044 n=20) GlobalInt64-8 2.154n ± 1% 2.180n ± 1% ~ (p=0.068 n=20) GlobalInt64Parallel-8 0.4298n ± 0% 0.4294n ± 0% ~ (p=0.079 n=20) GlobalUint64-8 2.160n ± 1% 2.170n ± 1% ~ (p=0.129 n=20) GlobalUint64Parallel-8 0.4286n ± 0% 0.4283n ± 0% ~ (p=0.350 n=20) Int64-8 2.491n ± 1% 2.481n ± 1% ~ (p=0.330 n=20) Uint64-8 2.458n ± 0% 2.464n ± 1% ~ (p=0.351 n=20) GlobalIntN1000-8 2.814n ± 2% 2.814n ± 0% ~ (p=0.325 n=20) IntN1000-8 2.933n ± 0% 2.934n ± 2% ~ (p=0.079 n=20) Int64N1000-8 2.962n ± 1% 2.957n ± 1% ~ (p=0.259 n=20) Int64N1e8-8 2.960n ± 1% 2.935n ± 2% ~ (p=0.276 n=20) Int64N1e9-8 2.935n ± 2% 2.935n ± 2% ~ (p=0.984 n=20) Int64N2e9-8 2.934n ± 0% 2.933n ± 4% ~ (p=0.463 n=20) Int64N1e18-8 3.777n ± 1% 3.781n ± 1% ~ (p=0.516 n=20) Int64N2e18-8 4.359n ± 1% 4.362n ± 0% ~ (p=0.256 n=20) Int64N4e18-8 6.536n ± 1% 6.576n ± 1% ~ (p=0.224 n=20) Int32N1000-8 2.937n ± 0% 2.942n ± 2% ~ (p=0.312 n=20) Int32N1e8-8 2.937n ± 1% 2.941n ± 1% ~ (p=0.463 n=20) Int32N1e9-8 2.936n ± 0% 2.938n ± 2% ~ (p=0.044 n=20) Int32N2e9-8 2.938n ± 2% 2.982n ± 2% ~ (p=0.174 n=20) Float32-8 3.441n ± 0% 3.441n ± 0% ~ (p=0.064 n=20) Float64-8 3.441n ± 0% 3.441n ± 0% ~ (p=0.826 n=20) ExpFloat64-8 4.486n ± 0% 4.472n ± 0% -0.31% (p=0.000 n=20) NormFloat64-8 4.721n ± 0% 4.716n ± 0% ~ (p=0.051 n=20) Perm3-8 26.65n ± 0% 26.66n ± 0% ~ (p=0.080 n=20) Perm30-8 143.2n ± 0% 143.3n ± 0% +0.10% (p=0.000 n=20) Perm30ViaShuffle-8 143.0n ± 0% 142.9n ± 0% ~ (p=0.642 n=20) ShuffleOverhead-8 120.6n ± 1% 121.1n ± 1% +0.41% (p=0.010 n=20) Concurrent-8 2.399n ± 5% 2.379n ± 2% ~ (p=0.365 n=20) goos: linux goarch: 386 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ 2703446c2e.386 │ e1bbe739fb.386 │ │ sec/op │ sec/op vs base │ SourceUint64-32 2.072n ± 2% 2.087n ± 1% ~ (p=0.440 n=20) GlobalInt64-32 3.546n ± 27% 3.538n ± 2% ~ (p=0.101 n=20) GlobalInt64Parallel-32 0.3211n ± 0% 0.3207n ± 1% ~ (p=0.753 n=20) GlobalUint64-32 3.522n ± 2% 3.543n ± 1% ~ (p=0.071 n=20) GlobalUint64Parallel-32 0.3172n ± 0% 0.3170n 
± 0% ~ (p=0.507 n=20) Int64-32 2.520n ± 2% 2.548n ± 1% ~ (p=0.267 n=20) Uint64-32 2.581n ± 1% 2.565n ± 2% ~ (p=0.143 n=20) GlobalIntN1000-32 6.171n ± 1% 6.300n ± 1% ~ (p=0.037 n=20) IntN1000-32 4.752n ± 2% 4.750n ± 0% ~ (p=0.984 n=20) Int64N1000-32 5.429n ± 1% 5.515n ± 2% ~ (p=0.292 n=20) Int64N1e8-32 5.469n ± 2% 5.527n ± 0% ~ (p=0.013 n=20) Int64N1e9-32 5.489n ± 2% 5.531n ± 2% ~ (p=0.256 n=20) Int64N2e9-32 5.492n ± 2% 5.514n ± 2% ~ (p=0.606 n=20) Int64N1e18-32 8.927n ± 1% 9.059n ± 1% ~ (p=0.229 n=20) Int64N2e18-32 9.622n ± 1% 9.594n ± 1% ~ (p=0.703 n=20) Int64N4e18-32 12.03n ± 1% 12.05n ± 2% ~ (p=0.733 n=20) Int32N1000-32 4.817n ± 1% 4.840n ± 2% ~ (p=0.941 n=20) Int32N1e8-32 4.801n ± 1% 4.832n ± 2% ~ (p=0.228 n=20) Int32N1e9-32 4.798n ± 1% 4.815n ± 2% ~ (p=0.560 n=20) Int32N2e9-32 4.840n ± 1% 4.813n ± 1% ~ (p=0.015 n=20) Float32-32 10.51n ± 4% 10.90n ± 2% +3.71% (p=0.007 n=20) Float64-32 20.33n ± 3% 20.32n ± 4% ~ (p=0.566 n=20) ExpFloat64-32 12.59n ± 2% 12.95n ± 3% +2.86% (p=0.002 n=20) NormFloat64-32 7.350n ± 2% 7.570n ± 1% +2.99% (p=0.007 n=20) Perm3-32 39.29n ± 2% 37.80n ± 2% -3.79% (p=0.000 n=20) Perm30-32 219.1n ± 2% 214.0n ± 1% -2.33% (p=0.002 n=20) Perm30ViaShuffle-32 189.8n ± 2% 188.7n ± 2% ~ (p=0.147 n=20) ShuffleOverhead-32 158.9n ± 2% 160.8n ± 1% ~ (p=0.176 n=20) Concurrent-32 3.306n ± 3% 3.288n ± 0% -0.54% (p=0.005 n=20) For #61716. Change-Id: I4c5fe710b310dc075ae21c97d1805bcc20db5050 Reviewed-on: https://go-review.googlesource.com/c/go/+/516275 Auto-Submit: Russ Cox <rsc@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Rob Pike <r@golang.org> |
||
Russ Cox
|
ecda959b99 |
math/rand/v2: optimize Float32, Float64
We realized too late after Go 1 that float64(r.Uint64())/(1<<64) is not a correct implementation: it occasionally rounds to 1. The correct implementation is float64(r.Uint64()&(1<<53-1))/(1<<53) but we couldn't change the implementation for compatibility, so we changed it to retry only in the "round to 1" cases. The change to v2 lets us update the algorithm to the simpler, faster one.

Note that this implementation cannot generate 2⁻⁵⁴, nor 2⁻¹⁰⁰, nor any of the other numbers between 0 and 2⁻⁵³. A slower algorithm could shift some of the probability of generating these two boundary values over to the values in between, but that would be much slower and not necessarily be better. In particular, the current implementation has the property that there are uniform gaps between the possible returned floats, which might help stability. Also, the result is often scaled and shifted, like Float64()*X+Y. Multiplying by X>1 would open new gaps, and adding most Y would erase all the distinctions that were introduced.

The only changes to benchmarks should be in Float32 and Float64. The other changes remain a cautionary tale.
goos: linux goarch: amd64 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ 4d84a369d1.amd64 │ 2703446c2e.amd64 │ │ sec/op │ sec/op vs base │ SourceUint64-32 1.348n ± 2% 1.337n ± 1% ~ (p=0.662 n=20) GlobalInt64-32 2.082n ± 2% 2.225n ± 2% +6.87% (p=0.000 n=20) GlobalInt64Parallel-32 0.1036n ± 1% 0.1043n ± 2% ~ (p=0.171 n=20) GlobalUint64-32 2.077n ± 2% 2.058n ± 1% ~ (p=0.560 n=20) GlobalUint64Parallel-32 0.1012n ± 1% 0.1009n ± 1% ~ (p=0.995 n=20) Int64-32 1.750n ± 0% 1.719n ± 2% -1.74% (p=0.000 n=20) Uint64-32 1.707n ± 2% 1.669n ± 1% -2.20% (p=0.000 n=20) GlobalIntN1000-32 3.192n ± 1% 3.321n ± 2% +4.04% (p=0.000 n=20) IntN1000-32 2.462n ± 2% 2.479n ± 1% ~ (p=0.417 n=20) Int64N1000-32 2.470n ± 1% 2.477n ± 1% ~ (p=0.664 n=20) Int64N1e8-32 2.503n ± 2% 2.490n ± 1% ~ (p=0.245 n=20) Int64N1e9-32 2.487n ± 1% 2.458n ± 1% ~ (p=0.032 n=20) Int64N2e9-32 2.487n ± 1% 2.486n ± 2% ~ (p=0.507 n=20) Int64N1e18-32 3.006n ± 2% 3.215n ± 2% +6.94% (p=0.000 n=20) Int64N2e18-32 3.368n ± 1% 3.588n ± 2% +6.55% (p=0.000 n=20) Int64N4e18-32 4.763n ± 1% 4.938n ± 2% +3.69% (p=0.000 n=20) Int32N1000-32 2.403n ± 1% 2.673n ± 2% +11.19% (p=0.000 n=20) Int32N1e8-32 2.405n ± 1% 2.631n ± 2% +9.42% (p=0.000 n=20) Int32N1e9-32 2.402n ± 2% 2.628n ± 2% +9.41% (p=0.000 n=20) Int32N2e9-32 2.384n ± 1% 2.684n ± 2% +12.56% (p=0.000 n=20) Float32-32 2.641n ± 2% 2.240n ± 2% -15.18% (p=0.000 n=20) Float64-32 2.483n ± 1% 2.253n ± 1% -9.26% (p=0.000 n=20) ExpFloat64-32 3.486n ± 2% 3.677n ± 1% +5.49% (p=0.000 n=20) NormFloat64-32 3.648n ± 1% 3.761n ± 1% +3.11% (p=0.000 n=20) Perm3-32 33.04n ± 1% 33.55n ± 2% ~ (p=0.180 n=20) Perm30-32 171.9n ± 1% 173.2n ± 1% ~ (p=0.050 n=20) Perm30ViaShuffle-32 100.3n ± 1% 115.9n ± 1% +15.55% (p=0.000 n=20) ShuffleOverhead-32 102.5n ± 1% 101.9n ± 1% ~ (p=0.266 n=20) Concurrent-32 2.101n ± 0% 2.107n ± 6% ~ (p=0.212 n=20) goos: darwin goarch: arm64 pkg: math/rand/v2 cpu: Apple M1 │ 4d84a369d1.arm64 │ 2703446c2e.arm64 │ │ sec/op │ sec/op vs base │ SourceUint64-8 2.261n ± 1% 
2.275n ± 0% ~ (p=0.082 n=20) GlobalInt64-8 2.160n ± 1% 2.154n ± 1% ~ (p=0.490 n=20) GlobalInt64Parallel-8 0.4299n ± 0% 0.4298n ± 0% ~ (p=0.663 n=20) GlobalUint64-8 2.169n ± 1% 2.160n ± 1% ~ (p=0.292 n=20) GlobalUint64Parallel-8 0.4293n ± 1% 0.4286n ± 0% ~ (p=0.155 n=20) Int64-8 2.473n ± 1% 2.491n ± 1% ~ (p=0.317 n=20) Uint64-8 2.453n ± 1% 2.458n ± 0% ~ (p=0.941 n=20) GlobalIntN1000-8 2.814n ± 2% 2.814n ± 2% ~ (p=0.972 n=20) IntN1000-8 2.933n ± 2% 2.933n ± 0% ~ (p=0.287 n=20) Int64N1000-8 2.934n ± 2% 2.962n ± 1% ~ (p=0.062 n=20) Int64N1e8-8 2.935n ± 2% 2.960n ± 1% ~ (p=0.183 n=20) Int64N1e9-8 2.934n ± 2% 2.935n ± 2% ~ (p=0.367 n=20) Int64N2e9-8 2.935n ± 2% 2.934n ± 0% ~ (p=0.455 n=20) Int64N1e18-8 3.778n ± 1% 3.777n ± 1% ~ (p=0.995 n=20) Int64N2e18-8 4.359n ± 1% 4.359n ± 1% ~ (p=0.122 n=20) Int64N4e18-8 6.546n ± 1% 6.536n ± 1% ~ (p=0.920 n=20) Int32N1000-8 2.940n ± 2% 2.937n ± 0% ~ (p=0.149 n=20) Int32N1e8-8 2.937n ± 2% 2.937n ± 1% ~ (p=0.620 n=20) Int32N1e9-8 2.938n ± 0% 2.936n ± 0% ~ (p=0.046 n=20) Int32N2e9-8 2.938n ± 2% 2.938n ± 2% ~ (p=0.455 n=20) Float32-8 3.486n ± 0% 3.441n ± 0% -1.28% (p=0.000 n=20) Float64-8 3.480n ± 0% 3.441n ± 0% -1.13% (p=0.000 n=20) ExpFloat64-8 4.533n ± 0% 4.486n ± 0% -1.03% (p=0.000 n=20) NormFloat64-8 4.764n ± 0% 4.721n ± 0% -0.90% (p=0.000 n=20) Perm3-8 26.66n ± 0% 26.65n ± 0% ~ (p=0.019 n=20) Perm30-8 143.4n ± 0% 143.2n ± 0% -0.17% (p=0.000 n=20) Perm30ViaShuffle-8 142.9n ± 0% 143.0n ± 0% ~ (p=0.522 n=20) ShuffleOverhead-8 120.7n ± 0% 120.6n ± 1% ~ (p=0.488 n=20) Concurrent-8 2.360n ± 2% 2.399n ± 5% ~ (p=0.062 n=20) goos: linux goarch: 386 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ 4d84a369d1.386 │ 2703446c2e.386 │ │ sec/op │ sec/op vs base │ SourceUint64-32 2.101n ± 2% 2.072n ± 2% ~ (p=0.273 n=20) GlobalInt64-32 3.518n ± 2% 3.546n ± 27% +0.78% (p=0.007 n=20) GlobalInt64Parallel-32 0.3206n ± 0% 0.3211n ± 0% ~ (p=0.386 n=20) GlobalUint64-32 3.538n ± 1% 3.522n ± 2% ~ (p=0.331 n=20) GlobalUint64Parallel-32 0.3231n ± 
0% 0.3172n ± 0% -1.84% (p=0.000 n=20) Int64-32 2.554n ± 2% 2.520n ± 2% ~ (p=0.465 n=20) Uint64-32 2.575n ± 2% 2.581n ± 1% ~ (p=0.213 n=20) GlobalIntN1000-32 6.292n ± 1% 6.171n ± 1% ~ (p=0.015 n=20) IntN1000-32 4.735n ± 1% 4.752n ± 2% ~ (p=0.635 n=20) Int64N1000-32 5.489n ± 2% 5.429n ± 1% ~ (p=0.324 n=20) Int64N1e8-32 5.528n ± 2% 5.469n ± 2% ~ (p=0.013 n=20) Int64N1e9-32 5.438n ± 2% 5.489n ± 2% ~ (p=0.984 n=20) Int64N2e9-32 5.474n ± 1% 5.492n ± 2% ~ (p=0.616 n=20) Int64N1e18-32 9.053n ± 1% 8.927n ± 1% ~ (p=0.037 n=20) Int64N2e18-32 9.685n ± 2% 9.622n ± 1% ~ (p=0.449 n=20) Int64N4e18-32 12.18n ± 1% 12.03n ± 1% ~ (p=0.013 n=20) Int32N1000-32 4.862n ± 1% 4.817n ± 1% -0.94% (p=0.002 n=20) Int32N1e8-32 4.758n ± 2% 4.801n ± 1% ~ (p=0.597 n=20) Int32N1e9-32 4.772n ± 1% 4.798n ± 1% ~ (p=0.774 n=20) Int32N2e9-32 4.847n ± 0% 4.840n ± 1% ~ (p=0.867 n=20) Float32-32 22.18n ± 4% 10.51n ± 4% -52.61% (p=0.000 n=20) Float64-32 21.21n ± 3% 20.33n ± 3% -4.17% (p=0.000 n=20) ExpFloat64-32 12.39n ± 2% 12.59n ± 2% ~ (p=0.139 n=20) NormFloat64-32 7.422n ± 1% 7.350n ± 2% ~ (p=0.208 n=20) Perm3-32 38.00n ± 2% 39.29n ± 2% +3.38% (p=0.000 n=20) Perm30-32 212.7n ± 1% 219.1n ± 2% +3.03% (p=0.001 n=20) Perm30ViaShuffle-32 187.5n ± 2% 189.8n ± 2% ~ (p=0.457 n=20) ShuffleOverhead-32 159.7n ± 1% 158.9n ± 2% ~ (p=0.920 n=20) Concurrent-32 3.470n ± 0% 3.306n ± 3% -4.71% (p=0.000 n=20) For #61716. Change-Id: I1933f1f9efd7e6e832d83e7fa5d84398f67d41f5 Reviewed-on: https://go-review.googlesource.com/c/go/+/502503 Auto-Submit: Russ Cox <rsc@golang.org> Reviewed-by: Rob Pike <r@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
||
Russ Cox
|
c266587846 |
math/rand/v2: add, optimize N, UintN, Uint32N, Uint64N
Now that we can break the value stream, we can take advantage of better algorithms that have been suggested since the original code was written.

Also optimizes IntN, Int32N, Int64N, Perm (indirectly).

All the N variants (IntN, Int32N, Int64N, UintN, N, etc) now return the same values given a Source and parameter n, so that for example uint(r.IntN(10)) and r.UintN(10) and r.N(uint(10)) are completely interchangeable.

Int64N4e18 gets slower but that is a near worst case for the algorithm and is extremely unlikely in practice.

32-bit Int32N variants got slower too, by 15-30%, in exchange for speeding up everything on 64-bit systems and consistency across the N functions.

Also rename previously missed benchmark GlobalInt63Parallel to GlobalInt64Parallel.

goos: linux goarch: amd64 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ 11ad9fdddc.amd64 │ 4d84a369d1.amd64 │ │ sec/op │ sec/op vs base │ SourceUint64-32 1.335n ± 1% 1.348n ± 2% ~ (p=0.335 n=20) GlobalInt64-32 2.046n ± 1% 2.082n ± 2% ~ (p=0.310 n=20) GlobalInt63Parallel-32 0.1037n ± 1% GlobalInt64Parallel-32 0.1036n ± 1% GlobalUint64-32 2.075n ± 0% 2.077n ± 2% ~ (p=0.228 n=20) GlobalUint64Parallel-32 0.1013n ± 1% 0.1012n ± 1% ~ (p=0.878 n=20) Int64-32 1.726n ± 2% 1.750n ± 0% +1.39% (p=0.000 n=20) Uint64-32 1.673n ± 1% 1.707n ± 2% +2.03% (p=0.002 n=20) GlobalIntN1000-32 3.895n ± 2% 3.192n ± 1% -18.05% (p=0.000 n=20) IntN1000-32 3.403n ± 1% 2.462n ± 2% -27.65% (p=0.000 n=20) Int64N1000-32 3.053n ± 2% 2.470n ± 1% -19.11% (p=0.000 n=20) Int64N1e8-32 2.718n ± 1% 2.503n ± 2% -7.91% (p=0.000 n=20) Int64N1e9-32 2.712n ± 1% 2.487n ± 1% -8.31% (p=0.000 n=20) Int64N2e9-32 2.690n ± 1% 2.487n ± 1% -7.57% (p=0.000 n=20) Int64N1e18-32 3.084n ± 2% 3.006n ± 2% -2.53% (p=0.000 n=20) Int64N2e18-32 4.026n ± 1% 3.368n ± 1% -16.33% (p=0.000 n=20) Int64N4e18-32 4.049n ± 2% 4.763n ± 1% +17.62% (p=0.000 n=20) Int32N1000-32 2.730n ± 0% 2.403n ± 1% -11.94% (p=0.000 n=20) Int32N1e8-32 2.916n ± 2% 2.405n ± 1% -17.53% (p=0.000 n=20) 
Int32N1e9-32             3.375n ± 1%   2.402n ± 2%   -28.83% (p=0.000 n=20)
Int32N2e9-32             3.292n ± 1%   2.384n ± 1%   -27.58% (p=0.000 n=20)
Float32-32               2.673n ± 1%   2.641n ± 2%   ~ (p=0.147 n=20)
Float64-32               2.485n ± 1%   2.483n ± 1%   ~ (p=0.804 n=20)
ExpFloat64-32            3.577n ± 2%   3.486n ± 2%   -2.57% (p=0.000 n=20)
NormFloat64-32           3.797n ± 2%   3.648n ± 1%   -3.92% (p=0.000 n=20)
Perm3-32                 35.79n ± 2%   33.04n ± 1%   -7.68% (p=0.000 n=20)
Perm30-32                205.1n ± 1%   171.9n ± 1%   -16.14% (p=0.000 n=20)
Perm30ViaShuffle-32      111.2n ± 2%   100.3n ± 1%   -9.76% (p=0.000 n=20)
ShuffleOverhead-32       100.5n ± 2%   102.5n ± 1%   +1.99% (p=0.007 n=20)
Concurrent-32            2.188n ± 5%   2.101n ± 0%   ~ (p=0.013 n=20)

goos: darwin
goarch: arm64
pkg: math/rand/v2
cpu: Apple M1
                         │ 11ad9fdddc.arm64 │ 4d84a369d1.arm64 │
                         │ sec/op │ sec/op vs base │
SourceUint64-8           2.272n ± 1%   2.261n ± 1%   ~ (p=0.172 n=20)
GlobalInt64-8            2.155n ± 1%   2.160n ± 1%   ~ (p=0.482 n=20)
GlobalInt63Parallel-8    0.4352n ± 0%
GlobalInt64Parallel-8                  0.4299n ± 0%
GlobalUint64-8           2.173n ± 1%   2.169n ± 1%   ~ (p=0.262 n=20)
GlobalUint64Parallel-8   0.4340n ± 0%  0.4293n ± 1%  -1.08% (p=0.000 n=20)
Int64-8                  2.544n ± 1%   2.473n ± 1%   -2.83% (p=0.000 n=20)
Uint64-8                 2.552n ± 1%   2.453n ± 1%   -3.90% (p=0.000 n=20)
GlobalIntN1000-8         3.856n ± 0%   2.814n ± 2%   -27.02% (p=0.000 n=20)
IntN1000-8               3.820n ± 0%   2.933n ± 2%   -23.22% (p=0.000 n=20)
Int64N1000-8             3.219n ± 2%   2.934n ± 2%   -8.85% (p=0.000 n=20)
Int64N1e8-8              3.221n ± 2%   2.935n ± 2%   -8.91% (p=0.000 n=20)
Int64N1e9-8              3.276n ± 2%   2.934n ± 2%   -10.44% (p=0.000 n=20)
Int64N2e9-8              3.217n ± 0%   2.935n ± 2%   -8.78% (p=0.000 n=20)
Int64N1e18-8             3.502n ± 2%   3.778n ± 1%   +7.91% (p=0.000 n=20)
Int64N2e18-8             4.968n ± 1%   4.359n ± 1%   -12.26% (p=0.000 n=20)
Int64N4e18-8             4.963n ± 0%   6.546n ± 1%   +31.92% (p=0.000 n=20)
Int32N1000-8             3.189n ± 1%   2.940n ± 2%   -7.81% (p=0.000 n=20)
Int32N1e8-8              3.514n ± 1%   2.937n ± 2%   -16.41% (p=0.000 n=20)
Int32N1e9-8              4.133n ± 0%   2.938n ± 0%   -28.91% (p=0.000 n=20)
Int32N2e9-8              4.137n ± 0%   2.938n ± 2%   -28.97% (p=0.000 n=20)
Float32-8                3.468n ± 1%   3.486n ± 0%   +0.52% (p=0.000 n=20)
Float64-8                3.478n ± 0%   3.480n ± 0%   ~ (p=0.063 n=20)
ExpFloat64-8             4.563n ± 0%   4.533n ± 0%   -0.67% (p=0.000 n=20)
NormFloat64-8            4.768n ± 0%   4.764n ± 0%   -0.07% (p=0.001 n=20)
Perm3-8                  28.94n ± 0%   26.66n ± 0%   -7.88% (p=0.000 n=20)
Perm30-8                 175.9n ± 0%   143.4n ± 0%   -18.50% (p=0.000 n=20)
Perm30ViaShuffle-8       152.6n ± 1%   142.9n ± 0%   -6.29% (p=0.000 n=20)
ShuffleOverhead-8        119.6n ± 1%   120.7n ± 0%   +0.96% (p=0.000 n=20)
Concurrent-8             2.452n ± 3%   2.360n ± 2%   -3.73% (p=0.007 n=20)

goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                         │ 11ad9fdddc.386 │ 4d84a369d1.386 │
                         │ sec/op │ sec/op vs base │
SourceUint64-32          2.091n ± 1%   2.101n ± 2%   ~ (p=0.672 n=20)
GlobalInt64-32           3.514n ± 2%   3.518n ± 2%   ~ (p=0.723 n=20)
GlobalInt63Parallel-32   0.3197n ± 0%
GlobalInt64Parallel-32                 0.3206n ± 0%
GlobalUint64-32          3.542n ± 1%   3.538n ± 1%   ~ (p=0.304 n=20)
GlobalUint64Parallel-32  0.3218n ± 0%  0.3231n ± 0%  ~ (p=0.071 n=20)
Int64-32                 2.552n ± 2%   2.554n ± 2%   ~ (p=0.693 n=20)
Uint64-32                2.566n ± 1%   2.575n ± 2%   ~ (p=0.606 n=20)
GlobalIntN1000-32        5.965n ± 2%   6.292n ± 1%   +5.46% (p=0.000 n=20)
IntN1000-32              4.652n ± 1%   4.735n ± 1%   +1.77% (p=0.000 n=20)
Int64N1000-32            14.485n ± 1%  5.489n ± 2%   -62.11% (p=0.000 n=20)
Int64N1e8-32             14.675n ± 1%  5.528n ± 2%   -62.33% (p=0.000 n=20)
Int64N1e9-32             16.805n ± 2%  5.438n ± 2%   -67.64% (p=0.000 n=20)
Int64N2e9-32             14.515n ± 1%  5.474n ± 1%   -62.28% (p=0.000 n=20)
Int64N1e18-32            16.165n ± 1%  9.053n ± 1%   -44.00% (p=0.000 n=20)
Int64N2e18-32            17.945n ± 2%  9.685n ± 2%   -46.03% (p=0.000 n=20)
Int64N4e18-32            18.35n ± 2%   12.18n ± 1%   -33.62% (p=0.000 n=20)
Int32N1000-32            3.608n ± 1%   4.862n ± 1%   +34.77% (p=0.000 n=20)
Int32N1e8-32             3.767n ± 1%   4.758n ± 2%   +26.31% (p=0.000 n=20)
Int32N1e9-32             4.130n ± 2%   4.772n ± 1%   +15.54% (p=0.000 n=20)
Int32N2e9-32             4.206n ± 1%   4.847n ± 0%   +15.24% (p=0.000 n=20)
Float32-32               22.18n ± 4%   22.18n ± 4%   ~ (p=0.195 n=20)
Float64-32               20.75n ± 4%   21.21n ± 3%   ~ (p=0.394 n=20)
ExpFloat64-32            12.58n ± 3%   12.39n ± 2%   ~ (p=0.032 n=20)
NormFloat64-32           7.920n ± 3%   7.422n ± 1%   -6.29% (p=0.000 n=20)
Perm3-32                 40.27n ± 1%   38.00n ± 2%   -5.65% (p=0.000 n=20)
Perm30-32                213.2n ± 2%   212.7n ± 1%   ~ (p=0.995 n=20)
Perm30ViaShuffle-32      164.2n ± 2%   187.5n ± 2%   +14.22% (p=0.000 n=20)
ShuffleOverhead-32       134.7n ± 2%   159.7n ± 1%   +18.52% (p=0.000 n=20)
Concurrent-32            3.301n ± 2%   3.470n ± 0%   +5.10% (p=0.000 n=20)

For #61716.
Change-Id: Id1481b04202883cd0b23e21bb58d1bca4e482bd3
Reviewed-on: https://go-review.googlesource.com/c/go/+/502500
Reviewed-by: Rob Pike <r@golang.org>
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
||
Russ Cox
|
c7dddb02d3 |
math/rand/v2: change Source to use uint64
This should make Uint64-using functions faster and leave other things alone. It is a mystery why so much got faster. A good cautionary tale not to read too much into minor jitter in the benchmarks.

goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                         │ 220860f76f.amd64 │ 11ad9fdddc.amd64 │
                         │ sec/op │ sec/op vs base │
SourceUint64-32          1.555n ± 1%   1.335n ± 1%   -14.15% (p=0.000 n=20)
GlobalInt64-32           2.071n ± 1%   2.046n ± 1%   ~ (p=0.016 n=20)
GlobalInt63Parallel-32   0.1023n ± 1%  0.1037n ± 1%  +1.37% (p=0.002 n=20)
GlobalUint64-32          5.193n ± 1%   2.075n ± 0%   -60.06% (p=0.000 n=20)
GlobalUint64Parallel-32  0.2341n ± 0%  0.1013n ± 1%  -56.74% (p=0.000 n=20)
Int64-32                 2.056n ± 2%   1.726n ± 2%   -16.10% (p=0.000 n=20)
Uint64-32                2.077n ± 2%   1.673n ± 1%   -19.46% (p=0.000 n=20)
GlobalIntN1000-32        4.077n ± 2%   3.895n ± 2%   -4.45% (p=0.000 n=20)
IntN1000-32              3.476n ± 2%   3.403n ± 1%   -2.10% (p=0.000 n=20)
Int64N1000-32            3.059n ± 1%   3.053n ± 2%   ~ (p=0.131 n=20)
Int64N1e8-32             2.942n ± 1%   2.718n ± 1%   -7.60% (p=0.000 n=20)
Int64N1e9-32             2.932n ± 1%   2.712n ± 1%   -7.50% (p=0.000 n=20)
Int64N2e9-32             2.925n ± 1%   2.690n ± 1%   -8.03% (p=0.000 n=20)
Int64N1e18-32            3.116n ± 1%   3.084n ± 2%   ~ (p=0.425 n=20)
Int64N2e18-32            4.067n ± 1%   4.026n ± 1%   -1.02% (p=0.007 n=20)
Int64N4e18-32            4.054n ± 1%   4.049n ± 2%   ~ (p=0.204 n=20)
Int32N1000-32            2.951n ± 1%   2.730n ± 0%   -7.49% (p=0.000 n=20)
Int32N1e8-32             3.102n ± 1%   2.916n ± 2%   -6.03% (p=0.000 n=20)
Int32N1e9-32             3.535n ± 1%   3.375n ± 1%   -4.54% (p=0.000 n=20)
Int32N2e9-32             3.514n ± 1%   3.292n ± 1%   -6.30% (p=0.000 n=20)
Float32-32               2.760n ± 1%   2.673n ± 1%   -3.13% (p=0.000 n=20)
Float64-32               2.284n ± 1%   2.485n ± 1%   +8.80% (p=0.000 n=20)
ExpFloat64-32            3.757n ± 1%   3.577n ± 2%   -4.78% (p=0.000 n=20)
NormFloat64-32           3.837n ± 1%   3.797n ± 2%   ~ (p=0.204 n=20)
Perm3-32                 35.23n ± 2%   35.79n ± 2%   ~ (p=0.298 n=20)
Perm30-32                208.8n ± 1%   205.1n ± 1%   -1.82% (p=0.000 n=20)
Perm30ViaShuffle-32      111.7n ± 1%   111.2n ± 2%   ~ (p=0.273 n=20)
ShuffleOverhead-32       101.1n ± 1%   100.5n ± 2%   ~ (p=0.878 n=20)
Concurrent-32            2.108n ± 7%   2.188n ± 5%   ~ (p=0.417 n=20)

goos: darwin
goarch: arm64
pkg: math/rand/v2
                         │ 220860f76f.arm64 │ 11ad9fdddc.arm64 │
                         │ sec/op │ sec/op vs base │
SourceUint64-8           2.316n ± 1%   2.272n ± 1%   -1.86% (p=0.000 n=20)
GlobalInt64-8            2.183n ± 1%   2.155n ± 1%   ~ (p=0.122 n=20)
GlobalInt63Parallel-8    0.4331n ± 0%  0.4352n ± 0%  +0.48% (p=0.000 n=20)
GlobalUint64-8           4.377n ± 2%   2.173n ± 1%   -50.35% (p=0.000 n=20)
GlobalUint64Parallel-8   0.9237n ± 0%  0.4340n ± 0%  -53.02% (p=0.000 n=20)
Int64-8                  2.538n ± 1%   2.544n ± 1%   ~ (p=0.189 n=20)
Uint64-8                 2.604n ± 1%   2.552n ± 1%   -1.98% (p=0.000 n=20)
GlobalIntN1000-8         3.857n ± 2%   3.856n ± 0%   ~ (p=0.051 n=20)
IntN1000-8               3.822n ± 2%   3.820n ± 0%   -0.05% (p=0.001 n=20)
Int64N1000-8             3.318n ± 0%   3.219n ± 2%   -2.98% (p=0.000 n=20)
Int64N1e8-8              3.349n ± 1%   3.221n ± 2%   -3.79% (p=0.000 n=20)
Int64N1e9-8              3.317n ± 2%   3.276n ± 2%   -1.24% (p=0.001 n=20)
Int64N2e9-8              3.317n ± 2%   3.217n ± 0%   -3.01% (p=0.000 n=20)
Int64N1e18-8             3.542n ± 1%   3.502n ± 2%   -1.16% (p=0.001 n=20)
Int64N2e18-8             5.087n ± 0%   4.968n ± 1%   -2.33% (p=0.000 n=20)
Int64N4e18-8             5.084n ± 0%   4.963n ± 0%   -2.39% (p=0.000 n=20)
Int32N1000-8             3.208n ± 2%   3.189n ± 1%   -0.58% (p=0.001 n=20)
Int32N1e8-8              3.610n ± 1%   3.514n ± 1%   -2.67% (p=0.000 n=20)
Int32N1e9-8              4.235n ± 0%   4.133n ± 0%   -2.40% (p=0.000 n=20)
Int32N2e9-8              4.229n ± 1%   4.137n ± 0%   -2.19% (p=0.000 n=20)
Float32-8                3.468n ± 0%   3.468n ± 1%   ~ (p=0.350 n=20)
Float64-8                3.447n ± 0%   3.478n ± 0%   +0.90% (p=0.000 n=20)
ExpFloat64-8             4.567n ± 0%   4.563n ± 0%   -0.10% (p=0.002 n=20)
NormFloat64-8            4.821n ± 0%   4.768n ± 0%   -1.09% (p=0.000 n=20)
Perm3-8                  28.89n ± 0%   28.94n ± 0%   +0.17% (p=0.000 n=20)
Perm30-8                 175.7n ± 0%   175.9n ± 0%   +0.14% (p=0.000 n=20)
Perm30ViaShuffle-8       153.5n ± 0%   152.6n ± 1%   ~ (p=0.010 n=20)
ShuffleOverhead-8        119.8n ± 1%   119.6n ± 1%   ~ (p=0.147 n=20)
Concurrent-8             2.433n ± 3%   2.452n ± 3%   ~ (p=0.616 n=20)

goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                         │ 220860f76f.386 │ 11ad9fdddc.386 │
                         │ sec/op │ sec/op vs base │
SourceUint64-32          2.370n ± 1%   2.091n ± 1%   -11.75% (p=0.000 n=20)
GlobalInt64-32           3.569n ± 1%   3.514n ± 2%   -1.56% (p=0.000 n=20)
GlobalInt63Parallel-32   0.3221n ± 1%  0.3197n ± 0%  -0.76% (p=0.000 n=20)
GlobalUint64-32          8.797n ± 10%  3.542n ± 1%   -59.74% (p=0.000 n=20)
GlobalUint64Parallel-32  0.6351n ± 0%  0.3218n ± 0%  -49.33% (p=0.000 n=20)
Int64-32                 2.612n ± 2%   2.552n ± 2%   -2.30% (p=0.000 n=20)
Uint64-32                3.350n ± 1%   2.566n ± 1%   -23.42% (p=0.000 n=20)
GlobalIntN1000-32        5.892n ± 1%   5.965n ± 2%   ~ (p=0.082 n=20)
IntN1000-32              4.546n ± 1%   4.652n ± 1%   +2.33% (p=0.000 n=20)
Int64N1000-32            14.59n ± 1%   14.48n ± 1%   ~ (p=0.652 n=20)
Int64N1e8-32             14.76n ± 2%   14.67n ± 1%   ~ (p=0.836 n=20)
Int64N1e9-32             16.57n ± 1%   16.80n ± 2%   ~ (p=0.016 n=20)
Int64N2e9-32             14.54n ± 1%   14.52n ± 1%   ~ (p=0.533 n=20)
Int64N1e18-32            16.14n ± 1%   16.16n ± 1%   ~ (p=0.606 n=20)
Int64N2e18-32            18.10n ± 1%   17.95n ± 2%   ~ (p=0.062 n=20)
Int64N4e18-32            18.65n ± 1%   18.35n ± 2%   -1.61% (p=0.010 n=20)
Int32N1000-32            3.560n ± 1%   3.608n ± 1%   +1.33% (p=0.001 n=20)
Int32N1e8-32             3.770n ± 2%   3.767n ± 1%   ~ (p=0.155 n=20)
Int32N1e9-32             4.098n ± 0%   4.130n ± 2%   ~ (p=0.016 n=20)
Int32N2e9-32             4.179n ± 1%   4.206n ± 1%   ~ (p=0.011 n=20)
Float32-32               21.18n ± 4%   22.18n ± 4%   +4.70% (p=0.003 n=20)
Float64-32               20.60n ± 2%   20.75n ± 4%   +0.73% (p=0.000 n=20)
ExpFloat64-32            13.07n ± 0%   12.58n ± 3%   -3.82% (p=0.000 n=20)
NormFloat64-32           7.738n ± 2%   7.920n ± 3%   ~ (p=0.066 n=20)
Perm3-32                 36.73n ± 1%   40.27n ± 1%   +9.65% (p=0.000 n=20)
Perm30-32                211.9n ± 1%   213.2n ± 2%   ~ (p=0.262 n=20)
Perm30ViaShuffle-32      165.2n ± 1%   164.2n ± 2%   ~ (p=0.029 n=20)
ShuffleOverhead-32       133.9n ± 1%   134.7n ± 2%   ~ (p=0.551 n=20)
Concurrent-32            3.287n ± 2%   3.301n ± 2%   ~ (p=0.330 n=20)

For #61716.
Change-Id: I8d2f73f87dd3603a0c2ff069988938e0957b6904
Reviewed-on: https://go-review.googlesource.com/c/go/+/502499
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Rob Pike <r@golang.org> |
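To make the uint64-based Source concrete, here is a minimal sketch of a custom source plugged into the package. It assumes the Source interface as it appears in the released math/rand/v2 (a single `Uint64() uint64` method), which may differ in detail from the state of the tree at this commit. The `splitMix64` type and `newRand` helper are hypothetical names invented for illustration; they are not part of the package.

```go
package main

import (
	"fmt"
	"math/rand/v2"
)

// splitMix64 is a hypothetical toy generator used only to illustrate the
// interface shape; it is not part of math/rand/v2.
type splitMix64 struct{ state uint64 }

// Uint64 is the one method a Source needs in the released math/rand/v2 API.
func (s *splitMix64) Uint64() uint64 {
	s.state += 0x9e3779b97f4a7c15
	z := s.state
	z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9
	z = (z ^ (z >> 27)) * 0x94d049bb133111eb
	return z ^ (z >> 31)
}

// newRand wraps the toy source in a *rand.Rand (hypothetical helper).
func newRand(seed uint64) *rand.Rand {
	return rand.New(&splitMix64{state: seed})
}

func main() {
	r := newRand(1)
	// Uint64 is served directly by the source; IntN is derived from it.
	fmt.Println(r.Uint64(), r.IntN(1000))
}
```

Because the source hands back full 64-bit words, methods such as Uint64 can return them without stitching together two narrower values, which is in line with the GlobalUint64 improvements reported above.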
||
Ubuntu
|
8fc043ccfa |
cmd/compile: optimize right shifts of int32 on riscv64
The compiler is currently sign extending 32 bit signed integers to 64 bits before right shifting them using a 64 bit shift instruction. There's no need to do this as RISC-V has instructions for right shifting 32 bit signed values (sraw and sraiw) which sign extend the result of the shift to 64 bits. Change the compiler so that it uses sraw and sraiw for shifts of signed 32 bit integers reducing in most cases the number of instructions needed to perform the shift.

Here are some examples of code sequences that are changed by this patch:

int32(a) >> 2

  before:
    sll x5,x10,0x20
    sra x10,x5,0x22
  after:
    sraw x10,x10,0x2

int32(v) >> int(s)

  before:
    sext.w x5,x10
    sltiu x6,x11,64
    add x6,x6,-1
    or x6,x11,x6
    sra x10,x5,x6
  after:
    sltiu x5,x11,32
    add x5,x5,-1
    or x5,x11,x5
    sraw x10,x10,x5

int32(v) >> (int(s) & 31)

  before:
    sext.w x5,x10
    and x6,x11,63
    sra x10,x5,x6
  after:
    and x5,x11,31
    sraw x10,x10,x5

int32(100) >> int(a)

  before:
    bltz x10,<target address calls runtime.panicshift>
    sltiu x5,x10,64
    add x5,x5,-1
    or x5,x10,x5
    li x6,100
    sra x10,x6,x5
  after:
    bltz x10,<target address calls runtime.panicshift>
    sltiu x5,x10,32
    add x5,x5,-1
    or x5,x10,x5
    li x6,100
    sraw x10,x6,x5

int32(v) >> (int(s) & 63)

  before:
    sext.w x5,x10
    and x6,x11,63
    sra x10,x5,x6
  after:
    and x5,x11,63
    sltiu x6,x5,32
    add x6,x6,-1
    or x5,x5,x6
    sraw x10,x10,x5

In most cases we eliminate one instruction. In the case where we shift an int32 constant by a variable the number of instructions generated is identical. A sra is simply replaced by a sraw. In the unusual case where we shift right by a variable anded with a constant > 31 but < 64, we generate two additional instructions. As this is an unusual case we do not try to optimize for it.

Some improvements can be seen in some of the existing benchmarks, notably in the utf8 package which performs right shifts of runes which are signed 32 bit integers.

                      | utf8-old | utf8-new |
                      | sec/op | sec/op vs base |
EncodeASCIIRune-4     17.68n ± 0%   17.67n ± 0%   ~ (p=0.312 n=10)
EncodeJapaneseRune-4  35.34n ± 0%   34.53n ± 1%   -2.31% (p=0.000 n=10)
AppendASCIIRune-4     3.213n ± 0%   3.213n ± 0%   ~ (p=0.318 n=10)
AppendJapaneseRune-4  36.14n ± 0%   35.35n ± 0%   -2.19% (p=0.000 n=10)
DecodeASCIIRune-4     28.11n ± 0%   27.36n ± 0%   -2.69% (p=0.000 n=10)
DecodeJapaneseRune-4  38.55n ± 0%   38.58n ± 0%   ~ (p=0.612 n=10)

Change-Id: I60a91cbede9ce65597571c7b7dd9943eeb8d3cc2
Reviewed-on: https://go-review.googlesource.com/c/go/+/535115
Run-TryBot: Joel Sing <joel@sing.id.au>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Joel Sing <joel@sing.id.au>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: M Zhuo <mzh@golangcn.org>
Reviewed-by: David Chase <drchase@google.com> |
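For readers who want to see the source-level forms behind the riscv64 sequences above, the following sketch writes three of them as small Go functions. The function names are hypothetical. The code only demonstrates the language semantics that the extra sltiu/add/or instructions preserve (a signed right shift by a count at or beyond the operand width yields all sign bits); it makes no claim about the instructions any particular compiler version emits.

```go
package main

import "fmt"

// shiftConst corresponds to the form int32(a) >> 2.
func shiftConst(a int32) int32 { return a >> 2 }

// shiftVar corresponds to int32(v) >> int(s). Go defines the result for
// counts >= 32 (all sign bits) and panics for negative counts.
func shiftVar(v int32, s int) int32 { return v >> s }

// shiftMasked corresponds to int32(v) >> (int(s) & 31). The mask proves the
// count is in range, so no clamping is needed.
func shiftMasked(v int32, s int) int32 { return v >> (s & 31) }

func main() {
	fmt.Println(shiftConst(-8), shiftVar(-8, 40), shiftMasked(-8, 33))
	// prints: -2 -1 -4
}
```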
||
Russ Cox
|
1f4db9dbd6 |
math/rand/v2: update benchmarks
Change the benchmarks to use the result of the calls, as I found that in certain cases inlining resulted in discarding part of the computation in the benchmark loop.

Add various benchmarks that will be relevant in future CLs.

goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                         │ 220860f76f.amd64 │
                         │ sec/op │
SourceUint64-32          1.555n ± 1%
GlobalInt64-32           2.071n ± 1%
GlobalInt63Parallel-32   0.1023n ± 1%
GlobalUint64-32          5.193n ± 1%
GlobalUint64Parallel-32  0.2341n ± 0%
Int64-32                 2.056n ± 2%
Uint64-32                2.077n ± 2%
GlobalIntN1000-32        4.077n ± 2%
IntN1000-32              3.476n ± 2%
Int64N1000-32            3.059n ± 1%
Int64N1e8-32             2.942n ± 1%
Int64N1e9-32             2.932n ± 1%
Int64N2e9-32             2.925n ± 1%
Int64N1e18-32            3.116n ± 1%
Int64N2e18-32            4.067n ± 1%
Int64N4e18-32            4.054n ± 1%
Int32N1000-32            2.951n ± 1%
Int32N1e8-32             3.102n ± 1%
Int32N1e9-32             3.535n ± 1%
Int32N2e9-32             3.514n ± 1%
Float32-32               2.760n ± 1%
Float64-32               2.284n ± 1%
ExpFloat64-32            3.757n ± 1%
NormFloat64-32           3.837n ± 1%
Perm3-32                 35.23n ± 2%
Perm30-32                208.8n ± 1%
Perm30ViaShuffle-32      111.7n ± 1%
ShuffleOverhead-32       101.1n ± 1%
Concurrent-32            2.108n ± 7%

goos: darwin
goarch: arm64
pkg: math/rand/v2
cpu: Apple M1
                         │ 220860f76f.arm64 │
                         │ sec/op │
SourceUint64-8           2.316n ± 1%
GlobalInt64-8            2.183n ± 1%
GlobalInt63Parallel-8    0.4331n ± 0%
GlobalUint64-8           4.377n ± 2%
GlobalUint64Parallel-8   0.9237n ± 0%
Int64-8                  2.538n ± 1%
Uint64-8                 2.604n ± 1%
GlobalIntN1000-8         3.857n ± 2%
IntN1000-8               3.822n ± 2%
Int64N1000-8             3.318n ± 0%
Int64N1e8-8              3.349n ± 1%
Int64N1e9-8              3.317n ± 2%
Int64N2e9-8              3.317n ± 2%
Int64N1e18-8             3.542n ± 1%
Int64N2e18-8             5.087n ± 0%
Int64N4e18-8             5.084n ± 0%
Int32N1000-8             3.208n ± 2%
Int32N1e8-8              3.610n ± 1%
Int32N1e9-8              4.235n ± 0%
Int32N2e9-8              4.229n ± 1%
Float32-8                3.468n ± 0%
Float64-8                3.447n ± 0%
ExpFloat64-8             4.567n ± 0%
NormFloat64-8            4.821n ± 0%
Perm3-8                  28.89n ± 0%
Perm30-8                 175.7n ± 0%
Perm30ViaShuffle-8       153.5n ± 0%
ShuffleOverhead-8        119.8n ± 1%
Concurrent-8             2.433n ± 3%

goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                         │ 220860f76f.386 │
                         │ sec/op │
SourceUint64-32          2.370n ± 1%
GlobalInt64-32           3.569n ± 1%
GlobalInt63Parallel-32   0.3221n ± 1%
GlobalUint64-32          8.797n ± 10%
GlobalUint64Parallel-32  0.6351n ± 0%
Int64-32                 2.612n ± 2%
Uint64-32                3.350n ± 1%
GlobalIntN1000-32        5.892n ± 1%
IntN1000-32              4.546n ± 1%
Int64N1000-32            14.59n ± 1%
Int64N1e8-32             14.76n ± 2%
Int64N1e9-32             16.57n ± 1%
Int64N2e9-32             14.54n ± 1%
Int64N1e18-32            16.14n ± 1%
Int64N2e18-32            18.10n ± 1%
Int64N4e18-32            18.65n ± 1%
Int32N1000-32            3.560n ± 1%
Int32N1e8-32             3.770n ± 2%
Int32N1e9-32             4.098n ± 0%
Int32N2e9-32             4.179n ± 1%
Float32-32               21.18n ± 4%
Float64-32               20.60n ± 2%
ExpFloat64-32            13.07n ± 0%
NormFloat64-32           7.738n ± 2%
Perm3-32                 36.73n ± 1%
Perm30-32                211.9n ± 1%
Perm30ViaShuffle-32      165.2n ± 1%
ShuffleOverhead-32       133.9n ± 1%
Concurrent-32            3.287n ± 2%

For #61716.
Change-Id: I2f0938eae4b7bf736a8cd899a99783e731bf2179
Reviewed-on: https://go-review.googlesource.com/c/go/+/502496
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Rob Pike <r@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
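The point about using the result of each call can be illustrated with the usual "sink" idiom. This is a sketch of that general technique, not the code from the CL: the names `sink`, `benchmarkUint64` and `runBenchmark` are hypothetical, and `rand.NewPCG` is taken from the released math/rand/v2 API, which may postdate this commit.

```go
package main

import (
	"fmt"
	"math/rand/v2"
	"testing"
)

// sink is a package-level variable; storing the accumulated result here keeps
// the compiler from treating the loop body as dead code after inlining.
var sink uint64

// benchmarkUint64 accumulates every generated value so that each call's
// result is actually used.
func benchmarkUint64(b *testing.B) {
	r := rand.New(rand.NewPCG(1, 2))
	var t uint64
	for n := b.N; n > 0; n-- {
		t += r.Uint64()
	}
	sink = t
}

// runBenchmark runs the benchmark outside "go test" and reports the
// iteration count that was executed.
func runBenchmark() int {
	return testing.Benchmark(benchmarkUint64).N
}

func main() {
	fmt.Println("iterations:", runBenchmark())
}
```

In a real package the function would be named BenchmarkUint64 and live in a _test.go file; testing.Benchmark is used here only so the sketch runs as a standalone program.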
||
Russ Cox
|
1cc5b34d28 |
math/rand/v2: remove Rand.Seed
Removing Rand.Seed lets us remove lockedSource as well, along with the ambiguity in globalRand about which source to use.

For #61716.
Change-Id: Ibe150520dd1e7dd87165eacaebe9f0c2daeaedfd
Reviewed-on: https://go-review.googlesource.com/c/go/+/502498
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Rob Pike <r@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Russ Cox <rsc@golang.org> |
||
Russ Cox
|
48bd1fc93b |
math/rand/v2: clean up regression test
Add more test cases. Replace -printgolden with -update, which rewrites the files for us.

For #61716.
Change-Id: I7c4c900ee896042429135a21971a56ebe16b6a66
Reviewed-on: https://go-review.googlesource.com/c/go/+/516858
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: Russ Cox <rsc@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
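An -update flag that rewrites golden files usually follows a simple pattern, sketched below. This is a generic illustration rather than the regression test from the CL: `checkGolden`, `demo` and the file name are hypothetical, and in a real test the flag would be declared in the _test.go file and read from `go test -update`.

```go
package main

import (
	"bytes"
	"flag"
	"fmt"
	"os"
	"path/filepath"
)

// checkGolden compares got against the golden file at path. When update is
// true it rewrites the file instead of comparing.
func checkGolden(path string, got []byte, update bool) error {
	if update {
		return os.WriteFile(path, got, 0o644)
	}
	want, err := os.ReadFile(path)
	if err != nil {
		return err
	}
	if !bytes.Equal(got, want) {
		return fmt.Errorf("%s: output differs from golden file", path)
	}
	return nil
}

// demo writes a golden file, then checks matching and mismatching output.
func demo() (updateErr, sameErr, diffErr error) {
	dir, err := os.MkdirTemp("", "golden")
	if err != nil {
		return err, nil, nil
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "example.golden")
	updateErr = checkGolden(path, []byte("1 2 3\n"), true)
	sameErr = checkGolden(path, []byte("1 2 3\n"), false)
	diffErr = checkGolden(path, []byte("1 2 4\n"), false)
	return updateErr, sameErr, diffErr
}

func main() {
	update := flag.Bool("update", false, "rewrite golden files instead of comparing")
	flag.Parse()
	fmt.Println("update mode:", *update)
	fmt.Println(demo())
}
```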
||
Russ Cox
|
d6c1ef52ad |
math/rand/v2: remove Read
In math/rand, Read is deprecated. Remove in v2. People should use crypto/rand if they need long strings.

For #61716.
Change-Id: Ib254b7e1844616e96db60a3a7abb572b0dcb1583
Reviewed-on: https://go-review.googlesource.com/c/go/+/502497
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Russ Cox <rsc@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
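As a small illustration of the suggested replacement, the sketch below fills a buffer with crypto/rand and hex-encodes it. The `randomHex` helper is a hypothetical name used only for this example.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// randomHex returns n random bytes from crypto/rand, hex-encoded.
func randomHex(n int) (string, error) {
	buf := make([]byte, n)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return hex.EncodeToString(buf), nil
}

func main() {
	s, err := randomHex(16)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(s)
}
```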
||
Russ Cox
|
d42750b17c |
math/rand/v2: rename various functions
Int31 -> Int32
Int31n -> Int32N
Int63 -> Int64
Int63n -> Int64N
Intn -> IntN

The 31 and 63 are pedantic and confusing: the functions should be named for the type they return, same as all the others.

The lower-case n is inconsistent with Go's usual CamelCase and especially problematic because we plan to add 'func N'. Capitalize the n.

For #61716.
Change-Id: Idb1a005a82f353677450d47fb612ade7a41fde69
Reviewed-on: https://go-review.googlesource.com/c/go/+/516857
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Robert Griesemer <gri@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
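A short sketch of calling code under the new names, assuming the top-level functions as they exist in the released math/rand/v2. The `sample` helper is a hypothetical name for illustration.

```go
package main

import (
	"fmt"
	"math/rand/v2"
)

// sample draws one value with each of the renamed bounded functions.
// In math/rand these calls were spelled Int31n, Int63n and Intn.
func sample() (int32, int64, int) {
	return rand.Int32N(10), rand.Int64N(1_000_000), rand.IntN(6)
}

func main() {
	a, b, c := sample()
	fmt.Println(a, b, c)
	// The unbounded variants follow the same naming: Int32 and Int64.
	fmt.Println(rand.Int32(), rand.Int64())
}
```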
||
Russ Cox
|
59f0ab4036 |
math/rand/v2: start of new API
This is the beginning of the math/rand/v2 package from proposal #61716.

Start by copying old API. This CL copies math/rand/* to math/rand/v2 and updates references to math/rand to add v2 throughout. Later CLs will make the v2 changes.

For #61716.
Change-Id: I1624ccffae3dfa442d4ba2461942decbd076e11b
Reviewed-on: https://go-review.googlesource.com/c/go/+/502495
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Rob Pike <r@golang.org> |
||
Cherry Mui
|
8c92897e15 |
cmd/compile: rework TestPGOHash to not rebuild dependencies
TestPGOHash may rebuild dependencies as we pass -trimpath to the go command. This CL makes it pass -trimpath compiler flag to only the current package instead, as we only need the current package to have a stable source file path.

Also refactor buildPGOInliningTest to only take compiler flags, not go flags, to avoid accidental rebuild.

Should fix #63733.

Change-Id: Iec6c4e90cf659790e21083ee2e697f518234c5b9
Reviewed-on: https://go-review.googlesource.com/c/go/+/535915
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Bryan Mills <bcmills@google.com> |
||
Cherry Mui
|
5613882df7 |
internal/testenv: use cmd.Environ in CleanCmdEnv
In CleanCmdEnv, use cmd.Environ instead of os.Environ, so it sets the PWD environment variable if cmd.Dir is set. This ensures the child process sees a canonical path for its working directory.

Change-Id: Ia769552a488dc909eaf6bb7d21937adba06d1072
Reviewed-on: https://go-review.googlesource.com/c/go/+/538215
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Bryan Mills <bcmills@google.com> |
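The difference between the two calls can be seen with a command that is configured but never started. In this sketch `pwdFor` is a hypothetical helper, and the command name is arbitrary because nothing is executed. It assumes a Unix-like system: as far as I know, os/exec does not add PWD on Windows or Plan 9, and it only adds it when cmd.Env is nil.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// pwdFor reports the PWD entry that a command configured with the given
// working directory would receive, using Cmd.Environ (available since Go 1.19).
func pwdFor(dir string) (string, bool) {
	cmd := exec.Command("go", "version") // configured only, never started
	cmd.Dir = dir
	for _, kv := range cmd.Environ() {
		if strings.HasPrefix(kv, "PWD=") {
			return strings.TrimPrefix(kv, "PWD="), true
		}
	}
	return "", false
}

func main() {
	// os.Environ would report the parent's PWD (if any); cmd.Environ reports
	// the value the child would see for its own working directory.
	fmt.Println(pwdFor("/"))
}
```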
||
Jes Cok
|
b46aec0765 |
bytes,internal/bytealg: eliminate HashStrBytes,HashStrRevBytes using …
…generics
The logic of HashStrBytes and HashStrRevBytes is exactly the same as that
of HashStr and HashStrRev, except that the types are different.
Since the bootstrap toolchain is bumped to 1.20, we can eliminate the
duplicates by using generics.
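The idea can be sketched with a single function constrained to `string | []byte`. This is a minimal illustration written from the description above and is not copied from internal/bytealg; the multiplier is the Rabin-Karp prime that package is known to use, and the exact signature in the tree may differ.

```go
package main

import "fmt"

// PrimeRK is the multiplier used by the Rabin-Karp rolling hash.
const PrimeRK = 16777619

// HashStr returns the hash of sep and the multiplier needed to remove the
// oldest byte from a rolling window of len(sep) bytes. One generic body
// serves both string and []byte arguments.
func HashStr[T string | []byte](sep T) (uint32, uint32) {
	hash := uint32(0)
	for i := 0; i < len(sep); i++ {
		hash = hash*PrimeRK + uint32(sep[i])
	}
	// Compute PrimeRK**len(sep) by repeated squaring.
	var pow, sq uint32 = 1, PrimeRK
	for i := len(sep); i > 0; i >>= 1 {
		if i&1 != 0 {
			pow *= sq
		}
		sq *= sq
	}
	return hash, pow
}

func main() {
	h1, p1 := HashStr("gopher")
	h2, p2 := HashStr([]byte("gopher"))
	fmt.Println(h1 == h2, p1 == p2)
	// prints: true true
}
```

Indexing and len are both permitted on a type parameter whose type set is string and []byte, which is what lets the two former copies collapse into one.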
Change-Id: I4336b1cab494ba963f09646c169b45f6b1ee62e3
GitHub-Last-Rev:
|