1
0
mirror of https://github.com/golang/go synced 2024-11-26 06:27:58 -07:00
Commit Graph

587 Commits

Author SHA1 Message Date
tkawakita
3463852b76 math/big: fix typo of comment (BytesScanner to ByteScanner)
Change-Id: I0c2d26d6ede1452008992efbea7392162da65014
Reviewed-on: https://go-review.googlesource.com/c/go/+/331651
Reviewed-by: Robert Griesemer <gri@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2021-06-29 16:57:13 +00:00
Russ Cox
a294e4e798 math/rand: mention half-open intervals explicitly
If someone sees "in [0,n)" it might look like a typo.
Saying "in the half-open interval [0,n)" will give people
something to search the web for (half-open interval).

Change-Id: I3c343f0a7171891e106e709ca77ab9db5daa5c84
Reviewed-on: https://go-review.googlesource.com/c/go/+/328210
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
2021-06-16 16:38:43 +00:00
Russ Cox
139e935d3c math/big: comment division
The comments in the code refer to Knuth and to Burnikel and Ziegler,
but Knuth's presentation is inscrutable, and our recursive division
code does not bear much resemblance to Burnikel and Ziegler's paper
(which is fine, ours is nicer).

Add a standalone explanation of division instead of referring to
difficult or not-directly-used references.

Change-Id: Ic1b35dc167fb29a69ee00e0b4a768ac9cc9e1324
Reviewed-on: https://go-review.googlesource.com/c/go/+/321078
Trust: Russ Cox <rsc@golang.org>
Trust: Katie Hockman <katie@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Katie Hockman <katie@golang.org>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
2021-06-09 15:09:13 +00:00
Russ Cox
e4615ad74d math/big: move division into natdiv.go
Code moved and functions reordered to be in a consistent
top-down dependency order, but otherwise unchanged.

First step toward commenting division algorithms.

Change-Id: Ib5e604fb5b2867edff3a228ba4e57b5cb32c4137
Reviewed-on: https://go-review.googlesource.com/c/go/+/321077
Trust: Russ Cox <rsc@golang.org>
Trust: Katie Hockman <katie@golang.org>
Trust: Robert Griesemer <gri@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Katie Hockman <katie@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2021-05-26 13:25:43 +00:00
Tobias Klauser
2c76a6f7f8 all: add //go:build lines to assembly files
Don't add them to files in vendor and cmd/vendor though. These will be
pulled in by updating the respective dependencies.

For #41184

Change-Id: Icc57458c9b3033c347124323f33084c85b224c70
Reviewed-on: https://go-review.googlesource.com/c/go/+/319389
Trust: Tobias Klauser <tobias.klauser@gmail.com>
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2021-05-13 09:12:17 +00:00
Robert Griesemer
6c591f79b0 math/big: check for excessive exponents in Rat.SetString
Found by oss-fuzz https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=33284

Fixes #45910.

Change-Id: I61e7b04dbd80343420b57eede439e361c0f7b79c
Reviewed-on: https://go-review.googlesource.com/c/go/+/316149
Trust: Robert Griesemer <gri@golang.org>
Trust: Katie Hockman <katie@golang.org>
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Katie Hockman <katie@golang.org>
Reviewed-by: Emmanuel Odeke <emmanuel@orijtech.com>
2021-05-06 16:00:55 +00:00
Colin Arnott
e8eb1d8269 math: add MaxUint, MinInt, MaxInt
Since we have int8 to int64 min max and uint8 to uint64 max constants,
we should probably have some for the word size types too. This change
also adds tests to validate the correctness of all integer limit
values.

Fixes #28538

Change-Id: Idd25782e98d16c2abedf39959b7b66e9c4c0c98b
Reviewed-on: https://go-review.googlesource.com/c/go/+/247058
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Trust: Robert Griesemer <gri@golang.org>
2021-05-03 22:44:33 +00:00
Robert Griesemer
7b768d43d0 math: replace float32/64 extrema with exact expressions
Follow-up on https://golang.org/cl/315170.

Updates #44057.
Updates #44058.

Change-Id: I0b071e8ee7a1c97aae2436945cc9583cde3b40b0
Reviewed-on: https://go-review.googlesource.com/c/go/+/315969
Trust: Robert Griesemer <gri@golang.org>
Trust: Emmanuel Odeke <emmanuel@orijtech.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
2021-05-03 16:23:09 +00:00
Robert Griesemer
3498027329 math: increase precision of math.SmallestNonzeroFloat64
The original value was rounded too early, which lead to the
surprising behavior that float64(math.SmallestNonzeroFloat64 / 2)
wasn't 0. That is, the exact compile-time computation of
math.SmallestNonzeroFloat64 / 2 resulted in a value that was
rounded up when converting to float64. To address this, added 3
more digits to the mantissa, ending in a 0.

While at it, also slightly increased the precision of MaxFloat64
to end in a 0.

Computed exact values via https://play.golang.org/p/yt4KTpIx_wP.

Added a test to verify expected behavior.

In contrast to the other (irrational) constants, expanding these
extreme values to more digits is unlikely to be important as they
are not going to appear in numeric computations except for tests
verifying their correctness (as is the case here).

Re-enabled a disabled test in go/types and types2.

Updates #44057.
Fixes #44058.

Change-Id: I8f363155e02331354e929beabe993c8d8de75646
Reviewed-on: https://go-review.googlesource.com/c/go/+/315170
Trust: Robert Griesemer <gri@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2021-04-30 00:13:38 +00:00
Paschalis Tsilias
cbb3f09047 testing: add -shuffle=off|on|N to alter the execution order of tests and benchmarks
This CL adds a new flag to the testing package and the go test command
which randomizes the execution order for tests and benchmarks.
This can be useful for identifying unwanted dependencies
between test or benchmark functions.
The flag is off by default. If `-shuffle` is set to `on` then the system
clock will be used as the seed value. If `-shuffle` is set to an integer
N, then N will be used as the seed value. In both cases, the seed will
be reported for failed runs so that they can reproduced later on.

Fixes #28592

Change-Id: I62e7dfae5f63f97a0cbd7830ea844d9f7beac335
Reviewed-on: https://go-review.googlesource.com/c/go/+/310033
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Emmanuel Odeke <emmanuel@orijtech.com>
Trust: Bryan C. Mills <bcmills@google.com>
2021-04-28 16:06:21 +00:00
yangwenmai
d553c0144d bits: use same expression with system bit size
Change-Id: Ibce07f8f36f7c64f7022ce656f8efbec5dff3f82
Reviewed-on: https://go-review.googlesource.com/c/go/+/313829
Reviewed-by: Robert Griesemer <gri@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Trust: Robert Griesemer <gri@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
2021-04-27 16:25:40 +00:00
Filippo Valsorda
d2f96f2f75 math/rand: make the security warning clearer and more prominent
It is still a common misconception that math/rand can be used for
security-sensitive work if seeded with crypto/rand
(lazyledger/lazyledger-core#270). It can not.

Change-Id: I8598c352d1750eabeada50be9976ab68cbb42cc0
Reviewed-on: https://go-review.googlesource.com/c/go/+/310350
Trust: Filippo Valsorda <filippo@golang.org>
Reviewed-by: Katie Hockman <katie@golang.org>
Reviewed-by: Emmanuel Odeke <emmanuel@orijtech.com>
2021-04-23 17:50:33 +00:00
Yury Smolsky
4f5aec4603 all: remove redundant spaces before . and ,
Change-Id: I6a4bd2544276d0638bddf07ebcf2ee636db30fea
Reviewed-on: https://go-review.googlesource.com/c/go/+/311009
Run-TryBot: Yury Smolsky <yury@smolsky.by>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Trust: Emmanuel Odeke <emmanuel@orijtech.com>
2021-04-20 00:49:17 +00:00
Austin Clements
1d20a362d0 math: avoid assembly stubs
Currently almost all math functions have the following pattern:

func Sin(x float64) float64

func sin(x float64) float64 {
    // ... pure Go implementation ...
}

Architectures that implement a function in assembly provide the
assembly implementation directly as the exported function (e.g., Sin),
and architectures that don't implement it in assembly use a small stub
to jump back to the Go code, like:

TEXT ·Sin(SB), NOSPLIT, $0
	JMP ·sin(SB)

However, most functions are not implemented in assembly on most
architectures, so this jump through assembly is a waste. It defeats
compiler optimizations like inlining. And, with regabi, it actually
adds a small but non-trivial overhead because the jump from assembly
back to Go must go through an ABI0->ABIInternal bridge function.

Hence, this CL reorganizes this structure across the entire package.
It now leans on inlining to achieve peak performance, but allows the
compiler to see all the way through the pure Go implementation.

Now, functions follow this pattern:

func Sin(x float64) float64 {
	if haveArchSin {
		return archSin(x)
	}
	return sin(x)
}

func sin(x float64) float64 {
    // ... pure Go implementation ...
}

Architectures that have assembly implementations use build-tagged
files to set haveArchX to true an provide an archX implementation.
That implementation can also still call back into the Go
implementation (some of them do this).

Prior to this change, enabling ABI wrappers results in a geomean
slowdown of the math benchmarks of 8.77% (full results:
https://perf.golang.org/search?q=upload:20210415.6) and of the Tile38
benchmarks by ~4%. After this change, enabling ABI wrappers is
completely performance-neutral on Tile38 and all but one math
benchmark (full results:
https://perf.golang.org/search?q=upload:20210415.7). ABI wrappers slow
down SqrtIndirectLatency-12 by 2.09%, which makes sense because that
call must still go through an ABI wrapper.

With ABI wrappers disabled (which won't be an option on amd64 much
longer), on linux/amd64, this change is largely performance-neutral
and slightly improves the performance of a few benchmarks:

(Because there are so many benchmarks, I've applied the Šidák
correction to the alpha threshold. It makes relatively little
difference in which benchmarks are statistically significant.)

name                    old time/op  new time/op  delta
Acos-12                 22.3ns ± 0%  18.8ns ± 1%  -15.44%  (p=0.000 n=18+16)
Acosh-12                28.2ns ± 0%  28.2ns ± 0%     ~     (p=0.404 n=18+20)
Asin-12                 18.1ns ± 0%  18.2ns ± 0%   +0.20%  (p=0.000 n=18+16)
Asinh-12                32.8ns ± 0%  32.9ns ± 1%     ~     (p=0.891 n=18+20)
Atan-12                 9.92ns ± 0%  9.90ns ± 1%   -0.24%  (p=0.000 n=17+16)
Atanh-12                27.7ns ± 0%  27.5ns ± 0%   -0.72%  (p=0.000 n=16+20)
Atan2-12                18.5ns ± 0%  18.4ns ± 0%   -0.59%  (p=0.000 n=19+19)
Cbrt-12                 22.1ns ± 0%  22.1ns ± 0%     ~     (p=0.804 n=16+17)
Ceil-12                 0.84ns ± 0%  0.84ns ± 0%     ~     (p=0.663 n=18+16)
Copysign-12             0.84ns ± 0%  0.84ns ± 0%     ~     (p=0.762 n=16+19)
Cos-12                  12.7ns ± 0%  12.7ns ± 1%     ~     (p=0.145 n=19+18)
Cosh-12                 22.2ns ± 0%  22.5ns ± 0%   +1.60%  (p=0.000 n=17+19)
Erf-12                  11.1ns ± 1%  11.1ns ± 1%     ~     (p=0.010 n=19+19)
Erfc-12                 12.6ns ± 1%  12.7ns ± 0%     ~     (p=0.066 n=19+15)
Erfinv-12               16.1ns ± 0%  16.1ns ± 0%     ~     (p=0.462 n=17+20)
Erfcinv-12              16.0ns ± 1%  16.0ns ± 1%     ~     (p=0.015 n=17+16)
Exp-12                  16.3ns ± 0%  16.5ns ± 1%   +1.25%  (p=0.000 n=19+16)
ExpGo-12                36.2ns ± 1%  36.1ns ± 1%     ~     (p=0.242 n=20+18)
Expm1-12                18.6ns ± 0%  18.7ns ± 0%   +0.25%  (p=0.000 n=16+19)
Exp2-12                 34.7ns ± 0%  34.6ns ± 1%     ~     (p=0.010 n=19+18)
Exp2Go-12               34.8ns ± 1%  34.8ns ± 1%     ~     (p=0.372 n=19+19)
Abs-12                  0.56ns ± 0%  0.56ns ± 0%     ~     (p=0.766 n=18+16)
Dim-12                  0.84ns ± 1%  0.84ns ± 1%     ~     (p=0.167 n=17+19)
Floor-12                0.84ns ± 0%  0.84ns ± 0%     ~     (p=0.993 n=18+16)
Max-12                  3.35ns ± 0%  3.35ns ± 0%     ~     (p=0.894 n=17+19)
Min-12                  3.35ns ± 0%  3.36ns ± 1%     ~     (p=0.214 n=18+18)
Mod-12                  35.2ns ± 0%  34.7ns ± 0%   -1.45%  (p=0.000 n=18+17)
Frexp-12                5.31ns ± 0%  4.75ns ± 0%  -10.51%  (p=0.000 n=19+18)
Gamma-12                14.8ns ± 0%  16.2ns ± 1%   +9.21%  (p=0.000 n=20+19)
Hypot-12                6.16ns ± 0%  6.17ns ± 0%   +0.26%  (p=0.000 n=19+20)
HypotGo-12              7.79ns ± 1%  7.78ns ± 0%     ~     (p=0.497 n=18+17)
Ilogb-12                4.47ns ± 0%  4.47ns ± 0%     ~     (p=0.167 n=19+19)
J0-12                   76.0ns ± 0%  76.3ns ± 0%   +0.35%  (p=0.000 n=19+18)
J1-12                   76.8ns ± 1%  75.9ns ± 0%   -1.14%  (p=0.000 n=18+18)
Jn-12                    167ns ± 1%   168ns ± 1%     ~     (p=0.038 n=18+18)
Ldexp-12                6.98ns ± 0%  6.43ns ± 0%   -7.97%  (p=0.000 n=17+18)
Lgamma-12               15.9ns ± 0%  16.0ns ± 1%     ~     (p=0.011 n=20+17)
Log-12                  13.3ns ± 0%  13.4ns ± 1%   +0.37%  (p=0.000 n=15+18)
Logb-12                 4.75ns ± 0%  4.75ns ± 0%     ~     (p=0.831 n=16+18)
Log1p-12                19.5ns ± 0%  19.5ns ± 1%     ~     (p=0.851 n=18+17)
Log10-12                15.9ns ± 0%  14.0ns ± 0%  -11.92%  (p=0.000 n=17+16)
Log2-12                 7.88ns ± 1%  8.01ns ± 0%   +1.72%  (p=0.000 n=20+20)
Modf-12                 4.75ns ± 0%  4.34ns ± 0%   -8.66%  (p=0.000 n=19+17)
Nextafter32-12          5.31ns ± 0%  5.31ns ± 0%     ~     (p=0.389 n=17+18)
Nextafter64-12          5.03ns ± 1%  5.03ns ± 0%     ~     (p=0.774 n=17+18)
PowInt-12               29.9ns ± 0%  28.5ns ± 0%   -4.69%  (p=0.000 n=18+19)
PowFrac-12              91.0ns ± 0%  91.1ns ± 0%     ~     (p=0.029 n=19+19)
Pow10Pos-12             1.12ns ± 0%  1.12ns ± 0%     ~     (p=0.363 n=20+20)
Pow10Neg-12             3.90ns ± 0%  3.90ns ± 0%     ~     (p=0.921 n=17+18)
Round-12                2.31ns ± 0%  2.31ns ± 1%     ~     (p=0.390 n=18+18)
RoundToEven-12          0.84ns ± 0%  0.84ns ± 0%     ~     (p=0.280 n=18+19)
Remainder-12            31.6ns ± 0%  29.6ns ± 0%   -6.16%  (p=0.000 n=18+17)
Signbit-12              0.56ns ± 0%  0.56ns ± 0%     ~     (p=0.385 n=19+18)
Sin-12                  12.5ns ± 0%  12.5ns ± 0%     ~     (p=0.080 n=18+18)
Sincos-12               16.4ns ± 2%  16.4ns ± 2%     ~     (p=0.253 n=20+19)
Sinh-12                 26.1ns ± 0%  26.1ns ± 0%   +0.18%  (p=0.000 n=17+19)
SqrtIndirect-12         3.91ns ± 0%  3.90ns ± 0%     ~     (p=0.133 n=19+19)
SqrtLatency-12          2.79ns ± 0%  2.79ns ± 0%     ~     (p=0.226 n=16+19)
SqrtIndirectLatency-12  6.68ns ± 0%  6.37ns ± 2%   -4.66%  (p=0.000 n=17+20)
SqrtGoLatency-12        49.4ns ± 0%  49.4ns ± 0%     ~     (p=0.289 n=18+16)
SqrtPrime-12            3.18µs ± 0%  3.18µs ± 0%     ~     (p=0.084 n=17+18)
Tan-12                  13.8ns ± 0%  13.9ns ± 2%     ~     (p=0.292 n=19+20)
Tanh-12                 25.4ns ± 0%  25.4ns ± 0%     ~     (p=0.101 n=17+17)
Trunc-12                0.84ns ± 0%  0.84ns ± 0%     ~     (p=0.765 n=18+16)
Y0-12                   75.8ns ± 0%  75.9ns ± 1%     ~     (p=0.805 n=16+18)
Y1-12                   76.3ns ± 0%  75.3ns ± 1%   -1.34%  (p=0.000 n=19+17)
Yn-12                    164ns ± 0%   164ns ± 2%     ~     (p=0.356 n=18+20)
Float64bits-12          0.56ns ± 0%  0.56ns ± 0%     ~     (p=0.383 n=18+18)
Float64frombits-12      0.56ns ± 0%  0.56ns ± 0%     ~     (p=0.066 n=18+19)
Float32bits-12          0.56ns ± 0%  0.56ns ± 0%     ~     (p=0.889 n=16+19)
Float32frombits-12      0.56ns ± 0%  0.56ns ± 0%     ~     (p=0.007 n=18+19)
FMA-12                  23.9ns ± 0%  24.0ns ± 0%   +0.31%  (p=0.000 n=16+17)
[Geo mean]              9.86ns       9.77ns        -0.87%

(https://perf.golang.org/search?q=upload:20210415.5)

For #40724.

Change-Id: I44fbba2a17be930ec9daeb0a8222f55cd50555a0
Reviewed-on: https://go-review.googlesource.com/c/go/+/310331
Trust: Austin Clements <austin@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2021-04-15 15:48:19 +00:00
smasher164
6f90ee36e9 math: simplify comparison in FMA when swapping p and z
Discovered by Junchen Li on CL 246858, the comparison before p and z are
swapped can be simplified from

    pe < ze || (pe == ze && (pm1 < zm1 || (pm1 == zm1 && pm2 < zm2)))

to

    pe < ze || pe == ze && pm1 < zm1

because zm2 is initialized to 0 before the branch.

Change-Id: Iee92d570038df2b0f8941ef6e422a022654ab2d6
Reviewed-on: https://go-review.googlesource.com/c/go/+/247241
Run-TryBot: Akhil Indurti <aindurti@gmail.com>
Run-TryBot: Emmanuel Odeke <emmanuel@orijtech.com>
Trust: Emmanuel Odeke <emmanuel@orijtech.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2021-03-29 06:45:05 +00:00
Cherry Zhang
0e31de280f math/big: don't require runtime.(*Frame).Next symbol present
I don't know why the test requires runtime.(*Frame).Next symbol
present in the binary under test. I assume it is just some
sanity check? With CL 268479 runtime.(*Frame).Next can be pruned
by the linker. Replace it with runtime.main which should always
be present.

May fix the longtest builders.

Change-Id: Id3104c058b2786057ff58be41b1d35aeac2f3073
Reviewed-on: https://go-review.googlesource.com/c/go/+/304431
Trust: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
2021-03-24 15:51:26 +00:00
Meng Zhuo
119d76d98e math/bits: folded reverse tables by using const string
linux/mips64le

name            old time/op  new time/op  delta
Reverse         2.31ns ± 1%  2.27ns ± 1%  -1.53%  (p=0.001 n=10+10)
Reverse8        0.65ns ± 1%  0.65ns ± 1%  -1.19%  (p=0.000 n=9+10)
Reverse16       1.15ns ± 2%  1.14ns ± 2%    ~     (p=0.062 n=9+10)
Reverse32       1.96ns ± 1%  1.94ns ± 1%  -1.16%  (p=0.000 n=10+9)
Reverse64       2.29ns ± 1%  2.26ns ± 0%  -0.94%  (p=0.000 n=9+9)
ReverseBytes    0.66ns ± 3%  0.65ns ± 1%  -1.58%  (p=0.006 n=9+10)
ReverseBytes16  0.66ns ± 2%  0.65ns ± 1%  -2.05%  (p=0.000 n=10+9)
ReverseBytes32  0.41ns ± 1%  0.40ns ± 0%  -1.68%  (p=0.000 n=10+10)
ReverseBytes64  0.66ns ± 1%  0.65ns ± 1%  -1.50%  (p=0.000 n=10+9)

cpu=1 benchtime=100ms count=100

name            old time/op  new time/op  delta
Reverse         28.0ns ± 3%  27.7ns ± 3%   -0.80%  (p=0.000 n=100+98)
Reverse8        2.24ns ± 1%  2.24ns ± 1%     ~     (p=0.142 n=98+100)
Reverse16       4.07ns ± 3%  4.05ns ± 3%   -0.66%  (p=0.000 n=99+99)
Reverse32       11.3ns ± 0%  11.3ns ± 0%     ~     (p=0.283 n=94+97)
Reverse64       12.6ns ± 0%  12.6ns ± 0%   +0.60%  (p=0.000 n=100+98)
ReverseBytes    5.25ns ± 1%  5.24ns ± 1%   -0.18%  (p=0.000 n=100+100)
ReverseBytes16  2.00ns ± 0%  2.21ns ± 3%  +10.07%  (p=0.000 n=88+100)
ReverseBytes32  4.08ns ± 2%  4.13ns ± 2%   +1.39%  (p=0.000 n=99+99)
ReverseBytes64  5.48ns ± 1%  5.45ns ± 1%   -0.50%  (p=0.000 n=98+99)

Update #43403

Change-Id: I7e7e00bb17608739d9f6b927c6dfef2580493a0e
Reviewed-on: https://go-review.googlesource.com/c/go/+/280645
Trust: Meng Zhuo <mzh@golangcn.org>
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
2021-03-17 03:18:12 +00:00
Josh Bleecher Snyder
b0df92703c math/big: add shrVU and shlVU benchmarks
Change-Id: Id67d6ac856bd9271de99c3381bde910aa0c166e0
Reviewed-on: https://go-review.googlesource.com/c/go/+/296011
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2021-03-07 23:02:35 +00:00
Josh Bleecher Snyder
60b500dc6c math/big: remove bounds checks for shrVU_g inner loop
Make explicit a shrVU_g precondition.
Replace i with i+1 throughout the loop.
The resulting loop is functionally identical,
but the compiler can do better BCE without the i-1 slice offset.

Benchmarks results on amd64 with -tags=math_big_pure_go.

name                          old time/op  new time/op  delta
NonZeroShifts/1/shrVU-8       4.55ns ± 2%  4.45ns ± 3%   -2.27%  (p=0.000 n=28+30)
NonZeroShifts/1/shlVU-8       4.07ns ± 1%  4.13ns ± 4%   +1.55%  (p=0.000 n=26+29)
NonZeroShifts/2/shrVU-8       6.12ns ± 1%  5.55ns ± 1%   -9.30%  (p=0.000 n=28+28)
NonZeroShifts/2/shlVU-8       5.65ns ± 3%  5.70ns ± 2%   +0.92%  (p=0.008 n=30+29)
NonZeroShifts/3/shrVU-8       7.58ns ± 2%  6.79ns ± 2%  -10.46%  (p=0.000 n=28+28)
NonZeroShifts/3/shlVU-8       6.62ns ± 2%  6.69ns ± 1%   +1.07%  (p=0.000 n=29+28)
NonZeroShifts/4/shrVU-8       9.02ns ± 1%  7.79ns ± 2%  -13.59%  (p=0.000 n=27+30)
NonZeroShifts/4/shlVU-8       7.74ns ± 1%  7.82ns ± 1%   +0.92%  (p=0.000 n=26+28)
NonZeroShifts/5/shrVU-8       10.6ns ± 1%   8.9ns ± 3%  -16.31%  (p=0.000 n=25+29)
NonZeroShifts/5/shlVU-8       8.59ns ± 1%  8.68ns ± 1%   +1.13%  (p=0.000 n=27+29)
NonZeroShifts/10/shrVU-8      18.2ns ± 2%  14.4ns ± 1%  -20.96%  (p=0.000 n=27+28)
NonZeroShifts/10/shlVU-8      14.1ns ± 1%  14.1ns ± 1%   +0.46%  (p=0.001 n=26+28)
NonZeroShifts/100/shrVU-8      161ns ± 2%   118ns ± 1%  -26.83%  (p=0.000 n=29+30)
NonZeroShifts/100/shlVU-8      119ns ± 2%   120ns ± 2%   +0.92%  (p=0.000 n=29+29)
NonZeroShifts/1000/shrVU-8    1.54µs ± 1%  1.10µs ± 1%  -28.63%  (p=0.000 n=29+29)
NonZeroShifts/1000/shlVU-8    1.10µs ± 1%  1.10µs ± 2%     ~     (p=0.701 n=28+29)
NonZeroShifts/10000/shrVU-8   15.3µs ± 2%  10.9µs ± 1%  -28.68%  (p=0.000 n=28+28)
NonZeroShifts/10000/shlVU-8   10.9µs ± 2%  10.9µs ± 2%   -0.57%  (p=0.003 n=26+29)
NonZeroShifts/100000/shrVU-8   154µs ± 1%   111µs ± 2%  -28.04%  (p=0.000 n=27+28)
NonZeroShifts/100000/shlVU-8   113µs ± 2%   113µs ± 2%     ~     (p=0.790 n=30+30)

Change-Id: Ib6a621ee7c88b27f0f18121fb2cba3606c40c9b0
Reviewed-on: https://go-review.googlesource.com/c/go/+/297049
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2021-03-05 06:15:22 +00:00
fanzha02
2b50ab2aee cmd/compile: optimize single-precision floating point square root
Add generic rule to rewrite the single-precision square root expression
with one single-precision instruction. The optimization will reduce two
times of precision converting between double-precision and single-precision.

On arm64 flatform.

previous:
  FCVTSD F0, F0
  FSQRTD F0, F0
  FCVTDS F0, F0

optimized:
  FSQRTS S0, S0

And this patch adds the test case to check the correctness.

This patch refers to CL 241877, contributed by Alice Xu
(dianhong.xu@arm.com)

Change-Id: I6de5d02281c693017ac4bd4c10963dd55989bd7e
Reviewed-on: https://go-review.googlesource.com/c/go/+/276873
Trust: fannie zhang <Fannie.Zhang@arm.com>
Run-TryBot: fannie zhang <Fannie.Zhang@arm.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2021-03-02 06:38:07 +00:00
Russ Cox
d4b2638234 all: go fmt std cmd (but revert vendor)
Make all our package sources use Go 1.17 gofmt format
(adding //go:build lines).

Part of //go:build change (#41184).
See https://golang.org/design/draft-gobuild

Change-Id: Ia0534360e4957e58cd9a18429c39d0e32a6addb4
Reviewed-on: https://go-review.googlesource.com/c/go/+/294430
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2021-02-20 03:54:50 +00:00
Russ Cox
474d5f4f4d math: remove most 387 implementations
The Surface Pro X's 386 simulator is not completely faithful to a real 387.
The most egregious problem is that it computes Log2(8) as 2.9999999999999996,
but it has some other subtler problems as well. All the problems occur in
routines that we don't even bother with assembly for on amd64.
If the speed of Go code is OK on amd64 it should be OK on 386 too.
Just remove all the 386-only assembly functions.

This leaves Ceil, Floor, Trunc, Hypot, and Sqrt in 386 assembly,
all of which are also in assembly on amd64 and all of which pass
their tests on Surface Pro X.

Compared to amd64, the 386 port omits assembly for Min, Max, and Log.
It never had Min and Max, and this CL deletes Log because Log2 wasn't
even correct. (None of the other architectures have assembly Log either.)

Change-Id: I5eb6c61084467035269d4098a36001447b7a0601
Reviewed-on: https://go-review.googlesource.com/c/go/+/291229
Trust: Russ Cox <rsc@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2021-02-19 00:41:09 +00:00
Katie Hockman
e491c6eea9 math/big: fix comment in divRecursiveStep
There appears to be a typo in the description of
the recursive division algorithm.

Two things seem suspicious with the original comment:
  1. It is talking about choosing s, but s doesn't
     appear anywhere in the equation.
  2. The math in the equation is incorrect.

Where
  B = len(v)/2
  s = B - 1

Proof that it is incorrect:
    len(v) - B >= B + 1
    len(v) - len(v)/2 >= len(v)/2 + 1

    This doesn't hold if len(v) is even, e.g. 10:
    10 - 10/2 >= 10/2 + 1
    10 - 5 >= 5 + 1
    5 >= 6  // this is false

The new equation will be the following,
which will be mathematically correct:
    len(v) - s >= B + 1
    len(v) - (len(v)/2 - 1) >= len(v)/2 + 1
    len(v) - len(v)/2 + 1 >= len(v)/2 + 1
    len(v) - len(v)/2 >= len(v)/2

    This holds if len(v) is even or odd.

    e.g. 10
    10 - 10/2 >= 10/2
    10 - 5 >= 5
    5 >= 5

    e.g. 11
    11 - 11/2 >= 11/2
    11 - 5 >= 5
    6 >= 5

Change-Id: If77ce09286cf7038637b5dfd0fb7d4f828023f56
Reviewed-on: https://go-review.googlesource.com/c/go/+/287372
Run-TryBot: Katie Hockman <katie@golang.org>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
Trust: Katie Hockman <katie@golang.org>
2021-02-03 15:04:02 +00:00
Paul Davis
0f797f168d math: fix typo in sqrt.go code comment
"it does not necessary" -> "it is not necessary"

Change-Id: I66f9cf2670d76b3686badb4a537b3ec084447d62
GitHub-Last-Rev: 52a0f9993a
GitHub-Pull-Request: golang/go#43935
Reviewed-on: https://go-review.googlesource.com/c/go/+/287052
Reviewed-by: Robert Griesemer <gri@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Trust: Robert Griesemer <gri@golang.org>
2021-01-27 00:15:06 +00:00
Toasa
9eef49cfa6 math/rand: fix typo in comment
Change-Id: I57fbabf272bdfd61918db155ee6f7091f18e5979
GitHub-Last-Rev: e138804b1a
GitHub-Pull-Request: golang/go#43495
Reviewed-on: https://go-review.googlesource.com/c/go/+/281373
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Trust: Alberto Donizetti <alb.donizetti@gmail.com>
2021-01-04 17:59:30 +00:00
Katie Hockman
dea6d94a44 math/big: add test for recursive division panic
The vulnerability that allowed this panic is
CVE-2020-28362 and has been fixed in a security
release, per #42552.

Change-Id: I774bcda2cc83cdd5a273d21c8d9f4b53fa17c88f
Reviewed-on: https://go-review.googlesource.com/c/go/+/277959
Run-TryBot: Katie Hockman <katie@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Katie Hockman <katie@golang.org>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
2020-12-14 20:56:03 +00:00
Russ Cox
4f1b0a44cb all: update to use os.ReadFile, os.WriteFile, os.CreateTemp, os.MkdirTemp
As part of #42026, these helpers from io/ioutil were moved to os.
(ioutil.TempFile and TempDir became os.CreateTemp and MkdirTemp.)

Update the Go tree to use the preferred names.

As usual, code compiled with the Go 1.4 bootstrap toolchain
and code vendored from other sources is excluded.

ReadDir changes are in a separate CL, because they are not a
simple search and replace.

For #42026.

Change-Id: If318df0216d57e95ea0c4093b89f65e5b0ababb3
Reviewed-on: https://go-review.googlesource.com/c/go/+/266365
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2020-12-09 19:12:23 +00:00
Jonathan Albrecht
b1369d5862 math/big: remove the s390x assembly for shlVU and shrVU
The s390x assembly for shlVU does a forward copy when the shift amount s
is 0. This causes corruption of the result z when z is aliased to the
input x.

This fix removes the s390x assembly for both shlVU and shrVU so the pure
go implementations will be used.

Test cases have been added to the existing TestShiftOverlap test to
cover shift values of 0, 1 and (_W - 1).

Fixes #42838

Change-Id: I75ca0e98f3acfaa6366a26355dcd9dd82499a48b
Reviewed-on: https://go-review.googlesource.com/c/go/+/274442
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Trust: Robert Griesemer <gri@golang.org>
2020-12-03 18:43:06 +00:00
Katie Hockman
1e1fa5903b math/big: fix shift for recursive division
The previous s value could cause a crash
for certain inputs.

Will check in tests and documentation improvements later.

Thanks to the Go Ethereum team and the OSS-Fuzz project for reporting this.
Thanks to Rémy Oudompheng and Robert Griesemer for their help
developing and validating the fix.

Fixes CVE-2020-28362

Change-Id: Ibbf455c4436bcdb07c84a34fa6551fb3422356d3
Reviewed-on: https://team-review.git.corp.google.com/c/golang/go-private/+/899974
Reviewed-by: Roland Shoemaker <bracewell@google.com>
Reviewed-by: Filippo Valsorda <valsorda@google.com>
Reviewed-on: https://go-review.googlesource.com/c/go/+/269657
Trust: Katie Hockman <katie@golang.org>
Trust: Roland Shoemaker <roland@golang.org>
Run-TryBot: Katie Hockman <katie@golang.org>
Reviewed-by: Roland Shoemaker <roland@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
2020-11-12 20:42:40 +00:00
surechen
f588974a52 math/big: reduce allocations for building decimal strings
Append operations in the decimal String function may cause several allocations.
Use make to pre allocate slices in String that have enough capacity to avoid additional allocations in append operations.

name                 old time/op  new time/op  delta
DecimalConversion-8   139µs ± 7%   109µs ± 2%  -21.06%  (p=0.000 n=10+10)

Change-Id: Id0284d204918a179a0421c51c35d86a3408e1bd9
Reviewed-on: https://go-review.googlesource.com/c/go/+/233980
Run-TryBot: Emmanuel Odeke <emmanuel@orijtech.com>
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
Trust: Giovanni Bajo <rasky@develer.com>
Trust: Martin Möhrmann <moehrmann@google.com>
2020-10-29 22:45:29 +00:00
SparrowLii
d54a9a9c42 math/big: replace division with multiplication by reciprocal word
Division is much slower than multiplication. And the method of using
multiplication by multiplying reciprocal and replacing division with it
can increase the speed of divWVW algorithm by three times,and at the
same time increase the speed of nats division.

The benchmark test on arm64 is as follows:
name                     old time/op    new time/op    delta
DivWVW/1-4                 13.1ns ± 4%    13.3ns ± 4%      ~     (p=0.444 n=5+5)
DivWVW/2-4                 48.6ns ± 1%    51.2ns ± 2%    +5.39%  (p=0.008 n=5+5)
DivWVW/3-4                 82.0ns ± 1%    69.7ns ± 1%   -15.03%  (p=0.008 n=5+5)
DivWVW/4-4                  116ns ± 1%      71ns ± 2%   -38.88%  (p=0.008 n=5+5)
DivWVW/5-4                  152ns ± 1%      84ns ± 4%   -44.70%  (p=0.008 n=5+5)
DivWVW/10-4                 319ns ± 1%     155ns ± 4%   -51.50%  (p=0.008 n=5+5)
DivWVW/100-4               3.44µs ± 3%    1.30µs ± 8%   -62.30%  (p=0.008 n=5+5)
DivWVW/1000-4              33.8µs ± 0%    10.9µs ± 1%   -67.74%  (p=0.008 n=5+5)
DivWVW/10000-4              343µs ± 4%     111µs ± 5%   -67.63%  (p=0.008 n=5+5)
DivWVW/100000-4            3.35ms ± 1%    1.25ms ± 3%   -62.79%  (p=0.008 n=5+5)
QuoRem-4                   3.08µs ± 2%    2.21µs ± 4%   -28.40%  (p=0.008 n=5+5)
ModSqrt225_Tonelli-4        444µs ± 2%     457µs ± 3%      ~     (p=0.095 n=5+5)
ModSqrt225_3Mod4-4          136µs ± 1%     138µs ± 3%      ~     (p=0.151 n=5+5)
ModSqrt231_Tonelli-4        473µs ± 3%     483µs ± 4%      ~     (p=0.548 n=5+5)
ModSqrt231_5Mod8-4          164µs ± 9%     169µs ±12%      ~     (p=0.421 n=5+5)
Sqrt-4                     36.8µs ± 1%    28.6µs ± 0%   -22.17%  (p=0.016 n=5+4)
Div/20/10-4                50.0ns ± 3%    51.3ns ± 6%      ~     (p=0.238 n=5+5)
Div/40/20-4                49.8ns ± 2%    51.3ns ± 6%      ~     (p=0.222 n=5+5)
Div/100/50-4               85.8ns ± 4%    86.5ns ± 5%	   ~     (p=0.246 n=5+5)
Div/200/100-4               335ns ± 3%     296ns ± 2%   -11.60%  (p=0.008 n=5+5)
Div/400/200-4               442ns ± 2%     359ns ± 5%   -18.81%  (p=0.008 n=5+5)
Div/1000/500-4              858ns ± 3%     643ns ± 6%   -25.06%  (p=0.008 n=5+5)
Div/2000/1000-4            1.70µs ± 3%    1.28µs ± 4%   -24.80%  (p=0.008 n=5+5)
Div/20000/10000-4          45.0µs ± 5%    41.8µs ± 4%    -7.17%  (p=0.016 n=5+5)
Div/200000/100000-4        1.51ms ± 7%    1.43ms ± 3%    -5.42%  (p=0.016 n=5+5)
Div/2000000/1000000-4      57.6ms ± 4%    57.5ms ± 3%      ~     (p=1.000 n=5+5)
Div/20000000/10000000-4     2.08s ± 3%     2.04s ± 1%      ~     (p=0.095 n=5+5)

name                     old speed      new speed      delta
DivWVW/1-4               4.87GB/s ± 4%  4.80GB/s ± 4%      ~     (p=0.310 n=5+5)
DivWVW/2-4               2.63GB/s ± 1%  2.50GB/s ± 2%    -5.07%  (p=0.008 n=5+5)
DivWVW/3-4               2.34GB/s ± 1%  2.76GB/s ± 1%   +17.70%  (p=0.008 n=5+5)
DivWVW/4-4               2.21GB/s ± 1%  3.61GB/s ± 2%   +63.42%  (p=0.008 n=5+5)
DivWVW/5-4               2.10GB/s ± 2%  3.81GB/s ± 4%   +80.89%  (p=0.008 n=5+5)
DivWVW/10-4              2.01GB/s ± 0%  4.13GB/s ± 4%  +105.91%  (p=0.008 n=5+5)
DivWVW/100-4             1.86GB/s ± 2%  4.95GB/s ± 7%  +165.63%  (p=0.008 n=5+5)
DivWVW/1000-4            1.89GB/s ± 0%  5.86GB/s ± 1%  +209.96%  (p=0.008 n=5+5)
DivWVW/10000-4           1.87GB/s ± 4%  5.76GB/s ± 5%  +208.96%  (p=0.008 n=5+5)
DivWVW/100000-4          1.91GB/s ± 1%  5.14GB/s ± 3%  +168.85%  (p=0.008 n=5+5)

Change-Id: I049f1196562b20800e6ef8a6493fd147f93ad830
Reviewed-on: https://go-review.googlesource.com/c/go/+/250417
Trust: Giovanni Bajo <rasky@develer.com>
Trust: Keith Randall <khr@golang.org>
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2020-09-23 21:55:55 +00:00
surechen
73eb24ccb6 math: Remove redundant local variable Ln2
Use the const variable Ln2 in math/const.go for function acosh.

Change-Id: I5381d03dd3142c227ae5773ece9be6c8f377615e
Reviewed-on: https://go-review.googlesource.com/c/go/+/232517
Reviewed-by: Robert Griesemer <gri@golang.org>
Trust: Robert Griesemer <gri@golang.org>
Trust: Giovanni Bajo <rasky@develer.com>
2020-09-19 09:09:52 +00:00
Xiangdong Ji
ae7b6a3b77 math/big: tune addVW/subVW performance on arm64
Add an optimization for addVW and subVW over large-sized vectors, it switches
from add/sub with carry to copy the rest of the vector when we are done with
carries. Consistent performance improvement are observed on various arm64
machines.

Add additional tests and benchmarks to increase the test coverage.
TestFunVWExt:
  Testing with various types of input vector, using the result from go-version
  addVW/subVW as golden reference.
BenchmarkAddVWext and BenchmarkSubVWext:
  Benchmarking using input vector having all 1s or all 0s, for evaluating the
  overhead of worst case.

1. Perf. comparison over randomly generated input vectors:

Server 1:
name             old time/op    new time/op    delta
AddVW/1            12.3ns ± 3%    12.0ns ± 0%    -2.60%  (p=0.001 n=10+8)
AddVW/2            12.5ns ± 2%    12.3ns ± 0%    -1.84%  (p=0.001 n=10+8)
AddVW/3            12.6ns ± 2%    12.3ns ± 0%    -1.91%  (p=0.009 n=10+10)
AddVW/4            13.1ns ± 3%    12.7ns ± 0%    -2.98%  (p=0.006 n=10+8)
AddVW/5            14.4ns ± 1%    13.9ns ± 0%    -3.81%  (p=0.000 n=10+10)
AddVW/10           11.7ns ± 0%    11.7ns ± 0%      ~     (all equal)
AddVW/100          47.8ns ± 0%    29.9ns ± 2%   -37.38%  (p=0.000 n=10+9)
AddVW/1000          446ns ± 0%     207ns ± 0%   -53.59%  (p=0.000 n=10+10)
AddVW/10000        4.35µs ± 1%    2.92µs ± 0%   -32.85%  (p=0.000 n=10+10)
AddVW/100000       43.6µs ± 0%    29.7µs ± 0%   -31.92%  (p=0.000 n=8+10)
SubVW/1            12.6ns ± 0%    12.3ns ± 2%    -2.22%  (p=0.000 n=7+10)
SubVW/2            12.7ns ± 0%    12.6ns ± 1%    -0.39%  (p=0.046 n=8+10)
SubVW/3            12.7ns ± 1%    12.6ns ± 1%      ~     (p=0.410 n=10+10)
SubVW/4            13.3ns ± 3%    13.1ns ± 3%      ~     (p=0.072 n=10+10)
SubVW/5            14.2ns ± 0%    14.1ns ± 1%    -0.63%  (p=0.046 n=8+10)
SubVW/10           11.7ns ± 0%    11.7ns ± 0%      ~     (all equal)
SubVW/100          47.8ns ± 0%    33.1ns ±19%   -30.71%  (p=0.000 n=10+10)
SubVW/1000          446ns ± 0%     207ns ± 0%   -53.59%  (p=0.000 n=10+10)
SubVW/10000        4.33µs ± 1%    2.92µs ± 0%   -32.66%  (p=0.000 n=10+6)
SubVW/100000       43.4µs ± 0%    29.6µs ± 0%   -31.90%  (p=0.000 n=10+9)

Server 2:
name             old time/op    new time/op    delta
AddVW/1            5.49ns ± 0%    5.53ns ± 2%     ~     (p=1.000 n=9+10)
AddVW/2            5.96ns ± 2%    5.92ns ± 1%   -0.69%  (p=0.039 n=10+10)
AddVW/3            6.72ns ± 0%    6.73ns ± 0%     ~     (p=0.078 n=10+10)
AddVW/4            7.07ns ± 0%    6.75ns ± 2%   -4.55%  (p=0.000 n=10+10)
AddVW/5            8.14ns ± 0%    8.17ns ± 0%   +0.46%  (p=0.003 n=8+8)
AddVW/10           10.0ns ± 0%    10.1ns ± 1%   +0.70%  (p=0.003 n=10+10)
AddVW/100          43.0ns ± 0%    33.5ns ± 0%  -22.09%  (p=0.000 n=9+9)
AddVW/1000          394ns ± 0%     278ns ± 0%  -29.44%  (p=0.000 n=10+10)
AddVW/10000        4.18µs ± 0%    3.14µs ± 0%  -24.81%  (p=0.000 n=8+8)
AddVW/100000       68.3µs ± 3%    62.1µs ± 5%   -9.13%  (p=0.000 n=10+10)
SubVW/1            5.37ns ± 2%    5.42ns ± 1%     ~     (p=0.990 n=10+10)
SubVW/2            5.89ns ± 0%    5.92ns ± 1%   +0.58%  (p=0.000 n=8+10)
SubVW/3            6.64ns ± 1%    6.82ns ± 3%   +2.63%  (p=0.000 n=9+10)
SubVW/4            7.17ns ± 0%    6.69ns ± 2%   -6.74%  (p=0.000 n=10+9)
SubVW/5            8.22ns ± 0%    8.18ns ± 0%   -0.46%  (p=0.001 n=8+9)
SubVW/10           10.0ns ± 1%    10.1ns ± 1%     ~     (p=0.341 n=10+10)
SubVW/100          43.0ns ± 0%    33.5ns ± 0%  -22.09%  (p=0.000 n=7+10)
SubVW/1000          394ns ± 0%     278ns ± 0%  -29.44%  (p=0.000 n=10+10)
SubVW/10000        4.18µs ± 0%    3.15µs ± 0%  -24.62%  (p=0.000 n=9+9)
SubVW/100000       67.7µs ± 4%    62.4µs ± 2%   -7.92%  (p=0.000 n=10+10)

2. Perf. comparison over input vectors of all 1s or all 0s

Server 1:
name             old time/op    new time/op    delta
AddVWext/1         12.6ns ± 0%    12.0ns ± 0%    -4.76%  (p=0.000 n=6+10)
AddVWext/2         12.7ns ± 0%    12.4ns ± 1%    -2.52%  (p=0.000 n=10+10)
AddVWext/3         12.7ns ± 0%    12.4ns ± 0%    -2.36%  (p=0.000 n=9+7)
AddVWext/4         13.2ns ± 4%    12.7ns ± 0%    -3.71%  (p=0.001 n=10+9)
AddVWext/5         14.6ns ± 0%    13.9ns ± 0%    -4.79%  (p=0.000 n=10+8)
AddVWext/10        11.7ns ± 0%    11.7ns ± 0%      ~     (all equal)
AddVWext/100       47.8ns ± 0%    47.4ns ± 0%    -0.84%  (p=0.000 n=10+10)
AddVWext/1000       446ns ± 0%     399ns ± 0%   -10.54%  (p=0.000 n=10+10)
AddVWext/10000     4.34µs ± 1%    3.90µs ± 0%   -10.12%  (p=0.000 n=10+10)
AddVWext/100000    43.9µs ± 1%    39.4µs ± 0%   -10.18%  (p=0.000 n=10+10)
SubVWext/1         12.6ns ± 0%    12.3ns ± 2%    -2.70%  (p=0.000 n=7+10)
SubVWext/2         12.6ns ± 1%    12.6ns ± 2%      ~     (p=0.234 n=10+10)
SubVWext/3         12.7ns ± 0%    12.6ns ± 2%    -0.71%  (p=0.033 n=10+10)
SubVWext/4         13.4ns ± 0%    13.1ns ± 3%    -2.01%  (p=0.006 n=8+10)
SubVWext/5         14.2ns ± 0%    14.1ns ± 1%    -0.85%  (p=0.003 n=10+10)
SubVWext/10        11.7ns ± 0%    11.7ns ± 0%      ~     (all equal)
SubVWext/100       47.8ns ± 0%    47.4ns ± 0%    -0.84%  (p=0.000 n=10+10)
SubVWext/1000       446ns ± 0%     399ns ± 0%   -10.54%  (p=0.000 n=10+10)
SubVWext/10000     4.33µs ± 1%    3.90µs ± 0%   -10.02%  (p=0.000 n=10+10)
SubVWext/100000    43.5µs ± 0%    39.5µs ± 1%    -9.16%  (p=0.000 n=7+10)

Server 2:
name             old time/op    new time/op    delta
AddVWext/1         5.48ns ± 0%    5.43ns ± 1%   -0.97%  (p=0.000 n=9+9)
AddVWext/2         5.99ns ± 2%    5.93ns ± 1%     ~     (p=0.054 n=10+10)
AddVWext/3         6.74ns ± 0%    6.79ns ± 1%   +0.80%  (p=0.000 n=9+10)
AddVWext/4         7.18ns ± 0%    7.21ns ± 1%   +0.36%  (p=0.034 n=9+10)
AddVWext/5         7.93ns ± 3%    8.18ns ± 0%   +3.18%  (p=0.000 n=10+8)
AddVWext/10        10.0ns ± 0%    10.1ns ± 1%   +0.60%  (p=0.011 n=10+10)
AddVWext/100       43.0ns ± 0%    47.7ns ± 0%  +10.93%  (p=0.000 n=9+10)
AddVWext/1000       394ns ± 0%     399ns ± 0%   +1.27%  (p=0.000 n=10+10)
AddVWext/10000     4.18µs ± 0%    4.50µs ± 0%   +7.73%  (p=0.000 n=9+10)
AddVWext/100000    67.6µs ± 2%    68.4µs ± 3%     ~     (p=0.139 n=9+8)
SubVWext/1         5.46ns ± 1%    5.43ns ± 0%   -0.55%  (p=0.002 n=9+9)
SubVWext/2         5.89ns ± 0%    5.93ns ± 1%   +0.68%  (p=0.000 n=8+10)
SubVWext/3         6.72ns ± 1%    6.79ns ± 1%   +1.07%  (p=0.000 n=10+10)
SubVWext/4         6.98ns ± 1%    7.21ns ± 0%   +3.25%  (p=0.000 n=10+10)
SubVWext/5         8.22ns ± 0%    7.99ns ± 3%   -2.83%  (p=0.000 n=8+10)
SubVWext/10        10.0ns ± 1%    10.1ns ± 1%     ~     (p=0.239 n=10+10)
SubVWext/100       43.0ns ± 0%    47.7ns ± 0%  +10.93%  (p=0.000 n=8+10)
SubVWext/1000       394ns ± 0%     399ns ± 0%   +1.27%  (p=0.000 n=10+10)
SubVWext/10000     4.18µs ± 0%    4.51µs ± 0%   +7.86%  (p=0.000 n=8+8)
SubVWext/100000    68.3µs ± 2%    68.0µs ± 3%     ~     (p=0.515 n=10+8)

Change-Id: I134a5194b8a2deaaebbaa2b771baf72846971d58
Reviewed-on: https://go-review.googlesource.com/c/go/+/229739
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-08-28 16:40:41 +00:00
surechen
bd6dfe9a3e math/big: add a comment for SetMantExp
Change-Id: I9ff5d1767cf70648c2251268e5e815944a7cb371
Reviewed-on: https://go-review.googlesource.com/c/go/+/233737
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2020-08-28 16:25:32 +00:00
zhouzhongyuan
63828096f6 math/big: add function example
While reading the source code of the math/big package, I found the SetString function example of float type missing.

Change-Id: Id8c16a58e2e24f9463e8ff38adbc98f8c418ab26
Reviewed-on: https://go-review.googlesource.com/c/go/+/232804
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2020-08-26 16:15:32 +00:00
SparrowLii
41bc0a1713 math/big: fix TestShiftOverlap for test -count arguments > 1
Don't overwrite incoming test data.

The change uses copy instead of assigning statement to avoid this.

Change-Id: Ib907101822d811de5c45145cb9d7961907e212c3
Reviewed-on: https://go-review.googlesource.com/c/go/+/250137
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2020-08-25 17:04:40 +00:00
Lynn Boger
216714e44f math/big: improve performance of mulAddVWW on ppc64x
This changes the assembly implementation on ppc64x
to improve performance by reordering some instructions.
It also eliminates an unnecessary move by changing an
ADDZE to use the correct target register.

Improvement on power9:

MulAddVWW/1         6.89ns ± 0%    7.30ns ± 0%   +5.95%  (p=1.000 n=1+1)
MulAddVWW/2         8.04ns ± 0%    8.06ns ± 0%   +0.25%  (p=1.000 n=1+1)
MulAddVWW/3         9.39ns ± 0%    9.39ns ± 0%     ~     (all equal)
MulAddVWW/4         9.76ns ± 0%    9.48ns ± 0%   -2.87%  (p=1.000 n=1+1)
MulAddVWW/5         10.5ns ± 0%    10.3ns ± 0%   -1.90%  (p=1.000 n=1+1)
MulAddVWW/10        15.4ns ± 0%    14.9ns ± 0%   -3.25%  (p=1.000 n=1+1)
MulAddVWW/100        149ns ± 0%     125ns ± 0%  -16.11%  (p=1.000 n=1+1)
MulAddVWW/1000      1.42µs ± 0%    1.28µs ± 0%   -9.74%  (p=1.000 n=1+1)
MulAddVWW/10000     14.2µs ± 0%    12.8µs ± 0%   -9.73%  (p=1.000 n=1+1)
MulAddVWW/100000     144µs ± 0%     129µs ± 0%  -10.10%  (p=1.000 n=1+1)

Change-Id: I0ae7002a69783ca19d7a4e3e42042ae75dc60069
Reviewed-on: https://go-review.googlesource.com/c/go/+/248721
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
Reviewed-by: Paul Murphy <murp@ibm.com>
2020-08-18 20:25:26 +00:00
kakulisen
441b52f566 math: simplify the code
Simplifying some code without compromising performance.
My CPU is Intel Xeon Gold 6161, 2.20GHz, 64-bit operating system.
The memory is 8GB. This is my test environment, I hope to help you judge.

Benchmark:

name      old time/op    new time/op    delta
Log1p-4    21.8ns ± 5%    21.8ns ± 4%   ~     (p=0.973 n=20+20)

Change-Id: Icd8f96f1325b00007602d114300b92d4c57de409
Reviewed-on: https://go-review.googlesource.com/c/go/+/233940
Reviewed-by: Robert Griesemer <gri@golang.org>
2020-08-15 02:20:42 +00:00
Alberto Donizetti
65f514edfb math: fix dead link to springerlink (now link.springer)
Change-Id: Ie5fd026af45d2e7bc371a38d15dbb52a1b4958cd
Reviewed-on: https://go-review.googlesource.com/c/go/+/235717
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2020-05-29 14:33:50 +00:00
Bryan C. Mills
13617380ca testing: clean up remaining TempDir issues from CL 231958
Updates #38850

Change-Id: I33f48762f5520eb0c0a841d8ca1ccdd65ecc20c8
Reviewed-on: https://go-review.googlesource.com/c/go/+/234583
Run-TryBot: Bryan C. Mills <bcmills@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-05-19 19:55:14 +00:00
Filippo Valsorda
c9d5f60eaa math/big: add (*Int).FillBytes
Replaced almost every use of Bytes with FillBytes.

Note that the approved proposal was for

    func (*Int) FillBytes(buf []byte)

while this implements

    func (*Int) FillBytes(buf []byte) []byte

because the latter was far nicer to use in all callsites.

Fixes #35833

Change-Id: Ia912df123e5d79b763845312ea3d9a8051343c0a
Reviewed-on: https://go-review.googlesource.com/c/go/+/230397
Reviewed-by: Robert Griesemer <gri@golang.org>
2020-05-05 00:36:44 +00:00
Joel Sing
03e6073b13 math: implement Min/Max in riscv64 assembly
Change-Id: If34422859d47bc8f44974a00c6b7908e7655ff41
Reviewed-on: https://go-review.googlesource.com/c/go/+/223561
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-05-04 17:29:13 +00:00
kakulisen
e90b0ce68b math: add function examples.
The function Modf lacks corresponding examples.

Change-Id: Id93423500e87d35b0b6870882be1698b304797ae
Reviewed-on: https://go-review.googlesource.com/c/go/+/231097
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2020-05-02 20:22:19 +00:00
Brian Kessler
4209a9f65a math/cmplx: handle special cases
Implement special case handling and testing to ensure
conformance with the C99 standard annex G.6 Complex arithmetic.

Fixes #29320

Change-Id: Id72eb4c5a35d5a54b4b8690d2f7176ab11028f1b
Reviewed-on: https://go-review.googlesource.com/c/go/+/220689
Reviewed-by: Robert Griesemer <gri@golang.org>
2020-05-01 03:16:37 +00:00
kakulisen
df2862cf54 math: Add a function example
When I browsed the source code, I saw that there is no corresponding example of this function. I am not sure if there is a need for an increase, this is my first time to submit CL.

Change-Id: Idbf4e1e1ed2995176a76959d561e152263a2fd26
Reviewed-on: https://go-review.googlesource.com/c/go/+/230741
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2020-04-30 00:33:38 +00:00
Ruixin(Peter) Bao
a7e9e84716 math/big: simplify hasVX checking on s390x
Originally, we use an assembly function that returns a boolean result to
tell whether the machine has vector facility or not. It is now no longer
needed when we can directly use cpu.S390X.HasVX variable.

Change-Id: Ic1dae851982532bcfd9a9453416c112347f21d87
Reviewed-on: https://go-review.googlesource.com/c/go/+/230318
Reviewed-by: Michael Munday <mike.munday@ibm.com>
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-04-27 20:20:53 +00:00
Ruixin Bao
d2f5e4e38c math: simplify hasVX checking on s390x
Originally, we use an assembly function that returns a boolean result to
tell whether the machine has vector facility or not. It is now no longer
needed when we can directly use cpu.S390X.HasVX variable.

Change-Id: Ic3ffeb9e63238ef41406d97cdc42502145ddb454
Reviewed-on: https://go-review.googlesource.com/c/go/+/230319
Reviewed-by: Michael Munday <mike.munday@ibm.com>
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-04-27 20:06:57 +00:00
Tyson Andre
19648622d2 math/cmplx: fix typo in code comment
Everywhere else is using "cancellation" as of 2019

The reasoning is mentioned in 170060.

> Though there is variation in the spelling of canceled,
> cancellation is always spelled with a double l.
>
> Reference: https://www.grammarly.com/blog/canceled-vs-cancelled/

Change-Id: I933ea68d7251986ce582b92c33b7cb13cee1d207
GitHub-Last-Rev: fc3d5ada2b
GitHub-Pull-Request: golang/go#38661
Reviewed-on: https://go-review.googlesource.com/c/go/+/230199
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2020-04-25 21:06:35 +00:00
Ruixin(Peter) Bao
3a37fd4010 math/big: rewrite subVW to use fast path on s390x
This CL replaces the original subVW implementation with a implementation
that uses a similar idea as CL 164968.

When we know the borrow bit is zero, we can copy the rest of words as
they will not be updated. Also, since we are copying vector of a words,
a faster implementation of copy is written in this CL to copy a word or
multiple words at a time.

Benchmarks:
name             old time/op    new time/op     delta
SubVW/1-18         4.43ns ± 0%     3.82ns ± 0%   -13.85%  (p=0.000 n=20+20)
SubVW/2-18         5.39ns ± 0%     4.25ns ± 0%   -21.23%  (p=0.000 n=20+20)
SubVW/3-18         6.29ns ± 0%     4.65ns ± 0%   -26.07%  (p=0.000 n=16+19)
SubVW/4-18         6.08ns ± 2%     4.84ns ± 0%   -20.43%  (p=0.000 n=20+20)
SubVW/5-18         7.06ns ± 1%     4.93ns ± 0%   -30.18%  (p=0.000 n=20+20)
SubVW/10-18        10.3ns ± 2%      7.2ns ± 0%   -30.35%  (p=0.000 n=20+19)
SubVW/100-18       48.0ns ± 4%     17.6ns ± 0%   -63.32%  (p=0.000 n=18+19)
SubVW/1000-18       448ns ±10%      236ns ± 1%   -47.24%  (p=0.000 n=20+20)
SubVW/10000-18     4.83µs ± 5%     2.96µs ± 0%   -38.73%  (p=0.000 n=20+19)
SubVW/100000-18    46.6µs ± 3%     30.6µs ± 1%   -34.30%  (p=0.000 n=20+20)
[Geo mean]         56.3ns          37.0ns        -34.24%

name             old speed      new speed       delta
SubVW/1-18       1.80GB/s ± 0%   2.10GB/s ± 0%   +16.16%  (p=0.000 n=20+20)
SubVW/2-18       2.97GB/s ± 0%   3.77GB/s ± 0%   +26.95%  (p=0.000 n=20+20)
SubVW/3-18       3.82GB/s ± 0%   5.16GB/s ± 0%   +35.26%  (p=0.000 n=20+19)
SubVW/4-18       5.26GB/s ± 1%   6.61GB/s ± 0%   +25.59%  (p=0.000 n=20+20)
SubVW/5-18       5.67GB/s ± 1%   8.11GB/s ± 0%   +43.12%  (p=0.000 n=20+20)
SubVW/10-18      7.79GB/s ± 2%  11.17GB/s ± 0%   +43.52%  (p=0.000 n=20+19)
SubVW/100-18     16.7GB/s ± 4%   45.5GB/s ± 0%  +172.61%  (p=0.000 n=18+20)
SubVW/1000-18    17.9GB/s ± 9%   33.9GB/s ± 1%   +89.25%  (p=0.000 n=20+20)
SubVW/10000-18   16.6GB/s ± 5%   27.0GB/s ± 0%   +63.08%  (p=0.000 n=20+19)
SubVW/100000-18  17.2GB/s ± 2%   26.1GB/s ± 1%   +52.18%  (p=0.000 n=20+20)
[Geo mean]       7.25GB/s       11.03GB/s        +52.01%

Change-Id: I32e99cbab3260054a96231d02b87049c833ab77e
Reviewed-on: https://go-review.googlesource.com/c/go/+/227297
Reviewed-by: Michael Munday <mike.munday@ibm.com>
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-04-24 14:50:59 +00:00
Ruixin(Peter) Bao
ee8972cd12 math/big: rewrite addVW to use fast path on s390x
Rewrite addVW to use a fast path and remove the original
vector and non vector implementation of addVW in assembly. This CL uses
a similar idea as CL 164968, where we copy the rest of words when we
know carry bit is zero.

In addition, since we are copying vector of words, a faster
implementation of copy is written in this CL to copy a word or multiple
words at a time.

Benchmarks:
name             old time/op    new time/op     delta
AddVW/1-18         4.56ns ± 0%     4.01ns ± 6%   -12.14%  (p=0.000 n=18+20)
AddVW/2-18         5.54ns ± 0%     4.42ns ± 5%   -20.20%  (p=0.000 n=18+20)
AddVW/3-18         6.55ns ± 0%     4.61ns ± 0%   -29.62%  (p=0.000 n=16+18)
AddVW/4-18         6.11ns ± 2%     5.12ns ± 6%   -16.19%  (p=0.000 n=20+20)
AddVW/5-18         7.32ns ± 4%     5.14ns ± 0%   -29.77%  (p=0.000 n=20+19)
AddVW/10-18        10.6ns ± 2%      7.2ns ± 1%   -31.47%  (p=0.000 n=20+20)
AddVW/100-18       49.6ns ± 2%     18.0ns ± 0%   -63.63%  (p=0.000 n=20+20)
AddVW/1000-18       465ns ± 3%      244ns ± 0%   -47.54%  (p=0.000 n=20+20)
AddVW/10000-18     4.99µs ± 4%     2.97µs ± 0%   -40.54%  (p=0.000 n=20+20)
AddVW/100000-18    48.3µs ± 3%     30.8µs ± 1%   -36.29%  (p=0.000 n=20+20)
[Geo mean]         58.1ns          38.0ns        -34.57%

name             old speed      new speed       delta
AddVW/1-18       1.76GB/s ± 0%   2.00GB/s ± 6%   +14.04%  (p=0.000 n=20+20)
AddVW/2-18       2.89GB/s ± 0%   3.63GB/s ± 5%   +25.55%  (p=0.000 n=18+20)
AddVW/3-18       3.66GB/s ± 0%   5.21GB/s ± 0%   +42.25%  (p=0.000 n=18+19)
AddVW/4-18       5.24GB/s ± 2%   6.27GB/s ± 6%   +19.61%  (p=0.000 n=20+20)
AddVW/5-18       5.47GB/s ± 4%   7.78GB/s ± 0%   +42.28%  (p=0.000 n=20+18)
AddVW/10-18      7.55GB/s ± 2%  11.04GB/s ± 1%   +46.09%  (p=0.000 n=20+20)
AddVW/100-18     16.1GB/s ± 2%   44.3GB/s ± 0%  +174.77%  (p=0.000 n=20+20)
AddVW/1000-18    17.2GB/s ± 3%   32.8GB/s ± 1%   +90.58%  (p=0.000 n=20+20)
AddVW/10000-18   16.0GB/s ± 4%   26.9GB/s ± 0%   +68.11%  (p=0.000 n=20+20)
AddVW/100000-18  16.6GB/s ± 3%   26.0GB/s ± 1%   +56.94%  (p=0.000 n=20+20)
[Geo mean]       7.03GB/s       10.75GB/s        +52.93%

Change-Id: Idbb73f3178311bd2b18a93bdc1e48f26869d2f6a
Reviewed-on: https://go-review.googlesource.com/c/go/+/209679
Reviewed-by: Michael Munday <mike.munday@ibm.com>
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-04-24 13:26:34 +00:00