1
0
mirror of https://github.com/golang/go synced 2024-11-23 08:00:05 -07:00
go/api/next
Russ Cox c266587846 math/rand/v2: add, optimize N, UintN, Uint32N, Uint64N
Now that we can break the value stream, we can take advantage
of better algorithms that have been suggested since the original
code was written.

Also optimizes IntN, Int32N, Int64N, Perm (indirectly).

All the N variants (IntN, Int32N, Int64N, UintN, N, etc) now
return the same values given a Source and parameter n, so that
for example uint(r.IntN(10)) and r.UintN(10) and r.N(uint(10))
are completely interchangeable.

Int64N4e18 gets slower but that is a near worst case for
the algorithm and is extremely unlikely in practice.

32-bit Int32N variants got slower too, by 15-30%, in exchange
for speeding up everything on 64-bit systems and consistency
across the N functions.

Also rename previously missed benchmark
GlobalInt63Parallel to GlobalInt64Parallel.

goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                        │ 11ad9fdddc.amd64 │            4d84a369d1.amd64            │
                        │      sec/op      │    sec/op     vs base                  │
SourceUint64-32                1.335n ± 1%    1.348n ± 2%        ~ (p=0.335 n=20)
GlobalInt64-32                 2.046n ± 1%    2.082n ± 2%        ~ (p=0.310 n=20)
GlobalInt63Parallel-32        0.1037n ± 1%
GlobalInt64Parallel-32                       0.1036n ± 1%
GlobalUint64-32                2.075n ± 0%    2.077n ± 2%        ~ (p=0.228 n=20)
GlobalUint64Parallel-32       0.1013n ± 1%   0.1012n ± 1%        ~ (p=0.878 n=20)
Int64-32                       1.726n ± 2%    1.750n ± 0%   +1.39% (p=0.000 n=20)
Uint64-32                      1.673n ± 1%    1.707n ± 2%   +2.03% (p=0.002 n=20)
GlobalIntN1000-32              3.895n ± 2%    3.192n ± 1%  -18.05% (p=0.000 n=20)
IntN1000-32                    3.403n ± 1%    2.462n ± 2%  -27.65% (p=0.000 n=20)
Int64N1000-32                  3.053n ± 2%    2.470n ± 1%  -19.11% (p=0.000 n=20)
Int64N1e8-32                   2.718n ± 1%    2.503n ± 2%   -7.91% (p=0.000 n=20)
Int64N1e9-32                   2.712n ± 1%    2.487n ± 1%   -8.31% (p=0.000 n=20)
Int64N2e9-32                   2.690n ± 1%    2.487n ± 1%   -7.57% (p=0.000 n=20)
Int64N1e18-32                  3.084n ± 2%    3.006n ± 2%   -2.53% (p=0.000 n=20)
Int64N2e18-32                  4.026n ± 1%    3.368n ± 1%  -16.33% (p=0.000 n=20)
Int64N4e18-32                  4.049n ± 2%    4.763n ± 1%  +17.62% (p=0.000 n=20)
Int32N1000-32                  2.730n ± 0%    2.403n ± 1%  -11.94% (p=0.000 n=20)
Int32N1e8-32                   2.916n ± 2%    2.405n ± 1%  -17.53% (p=0.000 n=20)
Int32N1e9-32                   3.375n ± 1%    2.402n ± 2%  -28.83% (p=0.000 n=20)
Int32N2e9-32                   3.292n ± 1%    2.384n ± 1%  -27.58% (p=0.000 n=20)
Float32-32                     2.673n ± 1%    2.641n ± 2%        ~ (p=0.147 n=20)
Float64-32                     2.485n ± 1%    2.483n ± 1%        ~ (p=0.804 n=20)
ExpFloat64-32                  3.577n ± 2%    3.486n ± 2%   -2.57% (p=0.000 n=20)
NormFloat64-32                 3.797n ± 2%    3.648n ± 1%   -3.92% (p=0.000 n=20)
Perm3-32                       35.79n ± 2%    33.04n ± 1%   -7.68% (p=0.000 n=20)
Perm30-32                      205.1n ± 1%    171.9n ± 1%  -16.14% (p=0.000 n=20)
Perm30ViaShuffle-32            111.2n ± 2%    100.3n ± 1%   -9.76% (p=0.000 n=20)
ShuffleOverhead-32             100.5n ± 2%    102.5n ± 1%   +1.99% (p=0.007 n=20)
Concurrent-32                  2.188n ± 5%    2.101n ± 0%        ~ (p=0.013 n=20)

goos: darwin
goarch: arm64
pkg: math/rand/v2
cpu: Apple M1
                       │ 11ad9fdddc.arm64 │            4d84a369d1.arm64            │
                       │      sec/op      │    sec/op     vs base                  │
SourceUint64-8                2.272n ± 1%    2.261n ± 1%        ~ (p=0.172 n=20)
GlobalInt64-8                 2.155n ± 1%    2.160n ± 1%        ~ (p=0.482 n=20)
GlobalInt63Parallel-8        0.4352n ± 0%
GlobalInt64Parallel-8                       0.4299n ± 0%
GlobalUint64-8                2.173n ± 1%    2.169n ± 1%        ~ (p=0.262 n=20)
GlobalUint64Parallel-8       0.4340n ± 0%   0.4293n ± 1%   -1.08% (p=0.000 n=20)
Int64-8                       2.544n ± 1%    2.473n ± 1%   -2.83% (p=0.000 n=20)
Uint64-8                      2.552n ± 1%    2.453n ± 1%   -3.90% (p=0.000 n=20)
GlobalIntN1000-8              3.856n ± 0%    2.814n ± 2%  -27.02% (p=0.000 n=20)
IntN1000-8                    3.820n ± 0%    2.933n ± 2%  -23.22% (p=0.000 n=20)
Int64N1000-8                  3.219n ± 2%    2.934n ± 2%   -8.85% (p=0.000 n=20)
Int64N1e8-8                   3.221n ± 2%    2.935n ± 2%   -8.91% (p=0.000 n=20)
Int64N1e9-8                   3.276n ± 2%    2.934n ± 2%  -10.44% (p=0.000 n=20)
Int64N2e9-8                   3.217n ± 0%    2.935n ± 2%   -8.78% (p=0.000 n=20)
Int64N1e18-8                  3.502n ± 2%    3.778n ± 1%   +7.91% (p=0.000 n=20)
Int64N2e18-8                  4.968n ± 1%    4.359n ± 1%  -12.26% (p=0.000 n=20)
Int64N4e18-8                  4.963n ± 0%    6.546n ± 1%  +31.92% (p=0.000 n=20)
Int32N1000-8                  3.189n ± 1%    2.940n ± 2%   -7.81% (p=0.000 n=20)
Int32N1e8-8                   3.514n ± 1%    2.937n ± 2%  -16.41% (p=0.000 n=20)
Int32N1e9-8                   4.133n ± 0%    2.938n ± 0%  -28.91% (p=0.000 n=20)
Int32N2e9-8                   4.137n ± 0%    2.938n ± 2%  -28.97% (p=0.000 n=20)
Float32-8                     3.468n ± 1%    3.486n ± 0%   +0.52% (p=0.000 n=20)
Float64-8                     3.478n ± 0%    3.480n ± 0%        ~ (p=0.063 n=20)
ExpFloat64-8                  4.563n ± 0%    4.533n ± 0%   -0.67% (p=0.000 n=20)
NormFloat64-8                 4.768n ± 0%    4.764n ± 0%   -0.07% (p=0.001 n=20)
Perm3-8                       28.94n ± 0%    26.66n ± 0%   -7.88% (p=0.000 n=20)
Perm30-8                      175.9n ± 0%    143.4n ± 0%  -18.50% (p=0.000 n=20)
Perm30ViaShuffle-8            152.6n ± 1%    142.9n ± 0%   -6.29% (p=0.000 n=20)
ShuffleOverhead-8             119.6n ± 1%    120.7n ± 0%   +0.96% (p=0.000 n=20)
Concurrent-8                  2.452n ± 3%    2.360n ± 2%   -3.73% (p=0.007 n=20)

goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                        │ 11ad9fdddc.386 │             4d84a369d1.386             │
                        │     sec/op     │    sec/op     vs base                  │
SourceUint64-32              2.091n ± 1%    2.101n ± 2%        ~ (p=0.672 n=20)
GlobalInt64-32               3.514n ± 2%    3.518n ± 2%        ~ (p=0.723 n=20)
GlobalInt63Parallel-32      0.3197n ± 0%
GlobalInt64Parallel-32                     0.3206n ± 0%
GlobalUint64-32              3.542n ± 1%    3.538n ± 1%        ~ (p=0.304 n=20)
GlobalUint64Parallel-32     0.3218n ± 0%   0.3231n ± 0%        ~ (p=0.071 n=20)
Int64-32                     2.552n ± 2%    2.554n ± 2%        ~ (p=0.693 n=20)
Uint64-32                    2.566n ± 1%    2.575n ± 2%        ~ (p=0.606 n=20)
GlobalIntN1000-32            5.965n ± 2%    6.292n ± 1%   +5.46% (p=0.000 n=20)
IntN1000-32                  4.652n ± 1%    4.735n ± 1%   +1.77% (p=0.000 n=20)
Int64N1000-32               14.485n ± 1%    5.489n ± 2%  -62.11% (p=0.000 n=20)
Int64N1e8-32                14.675n ± 1%    5.528n ± 2%  -62.33% (p=0.000 n=20)
Int64N1e9-32                16.805n ± 2%    5.438n ± 2%  -67.64% (p=0.000 n=20)
Int64N2e9-32                14.515n ± 1%    5.474n ± 1%  -62.28% (p=0.000 n=20)
Int64N1e18-32               16.165n ± 1%    9.053n ± 1%  -44.00% (p=0.000 n=20)
Int64N2e18-32               17.945n ± 2%    9.685n ± 2%  -46.03% (p=0.000 n=20)
Int64N4e18-32                18.35n ± 2%    12.18n ± 1%  -33.62% (p=0.000 n=20)
Int32N1000-32                3.608n ± 1%    4.862n ± 1%  +34.77% (p=0.000 n=20)
Int32N1e8-32                 3.767n ± 1%    4.758n ± 2%  +26.31% (p=0.000 n=20)
Int32N1e9-32                 4.130n ± 2%    4.772n ± 1%  +15.54% (p=0.000 n=20)
Int32N2e9-32                 4.206n ± 1%    4.847n ± 0%  +15.24% (p=0.000 n=20)
Float32-32                   22.18n ± 4%    22.18n ± 4%        ~ (p=0.195 n=20)
Float64-32                   20.75n ± 4%    21.21n ± 3%        ~ (p=0.394 n=20)
ExpFloat64-32                12.58n ± 3%    12.39n ± 2%        ~ (p=0.032 n=20)
NormFloat64-32               7.920n ± 3%    7.422n ± 1%   -6.29% (p=0.000 n=20)
Perm3-32                     40.27n ± 1%    38.00n ± 2%   -5.65% (p=0.000 n=20)
Perm30-32                    213.2n ± 2%    212.7n ± 1%        ~ (p=0.995 n=20)
Perm30ViaShuffle-32          164.2n ± 2%    187.5n ± 2%  +14.22% (p=0.000 n=20)
ShuffleOverhead-32           134.7n ± 2%    159.7n ± 1%  +18.52% (p=0.000 n=20)
Concurrent-32                3.301n ± 2%    3.470n ± 0%   +5.10% (p=0.000 n=20)

For #61716.

Change-Id: Id1481b04202883cd0b23e21bb58d1bca4e482bd3
Reviewed-on: https://go-review.googlesource.com/c/go/+/502500
Reviewed-by: Rob Pike <r@golang.org>
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2023-10-30 17:08:37 +00:00
..
50102.txt archive/tar: add FileInfoNames interface 2023-08-07 00:25:25 +00:00
51246.txt syscall: add support to get pidfd from ForkExec on Linux 2023-09-07 19:11:15 +00:00
51971.txt net/http: add ServeFileFS, FileServerFS, NewFileTransportFS 2023-08-07 17:07:12 +00:00
53693.txt encoding: add AppendEncode and AppendDecode 2023-08-17 16:23:42 +00:00
54898.txt archive/zip: add AddFS method to zip Writer 2023-07-31 18:10:41 +00:00
56353.txt slices: add Concat 2023-08-08 14:31:39 +00:00
58000.txt archive/tar: add AddFS method to Writer 2023-09-13 13:59:01 +00:00
59599.txt reflect: deprecate PtrTo 2023-07-27 23:28:51 +00:00
60061.txt go/ast: add Unparen(Expr) helper 2023-07-20 20:19:51 +00:00
60088.txt reflect: add TypeFor 2023-07-31 15:50:15 +00:00
60204.txt cmp: add Or 2023-09-25 14:30:20 +00:00
60370.txt database/sql: add Null[T] 2023-08-07 14:26:37 +00:00
61410.txt net/http: implement path value methods on Request 2023-09-14 00:00:28 +00:00
61619.txt html/template: support parsing complex JS template literals 2023-10-02 15:18:39 +00:00
61642.txt net/netip: add AddrPort.Compare and Prefix.Compare 2023-09-11 20:26:41 +00:00
61716.txt math/rand/v2: add, optimize N, UintN, Uint32N, Uint64N 2023-10-30 17:08:37 +00:00
61758.txt testing/slogtest: add Run to run cases as subtests 2023-09-08 20:37:12 +00:00
61870.txt io: add (*SectionReader).Outer() 2023-10-09 22:04:41 +00:00
61974.txt debug/elf,cmd/link: add additional MIPS64 relocation type 2023-09-08 04:32:10 +00:00