This change adds math/bits as a new dependency of math/big.
- use bits.LeadingZeroes instead of local implementation
(they are identical, so there's no performance loss here)
- leave other functionality local (ntz, bitLen) since there's
faster implementations in math/big at the moment
Change-Id: I1218aa8a1df0cc9783583b090a4bb5a8a145c4a2
Reviewed-on: https://go-review.googlesource.com/37141
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Removes init function from the math package.
Allows stripping of arrays with pre-computed values
used for Pow10 from binaries if Pow10 is not used.
cmd/go shrinks by 128 bytes.
Fixed small values like 10**-323 being 0 instead of 1e-323.
Overall precision is increased but still not as good as
predefined constants for some inputs.
Samples:
Pow10(208)
before: 1.0000000000000006662e+208
after: 1.0000000000000000959e+208
Pow10(202)
before 1.0000000000000009895e+202
after 1.0000000000000001193e+202
Pow10(60)
before 1.0000000000000001278e+60
after 0.9999999999999999494e+60
Pow10(-100)
before 0.99999999999999938551e-100
after 0.99999999999999989309e-100
Pow10(-200)
before 0.9999999999999988218e-200
after 1.0000000000000001271e-200
name old time/op new time/op delta
Pow10Pos-4 44.6ns ± 2% 1.2ns ± 1% -97.39% (p=0.000 n=19+17)
Pow10Neg-4 50.8ns ± 1% 4.1ns ± 2% -92.02% (p=0.000 n=17+19)
Change-Id: If094034286b8ac64be3a95fd9e8ffa3d4ad39b31
Reviewed-on: https://go-review.googlesource.com/36331
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Test finite negative x with Y0(-1), Y1(-1), Yn(2,-1), Yn(-3,-1).
Also test the special case Yn(0,0).
Fixes#19130.
Change-Id: I95f05a72e1c455ed8ddf202c56f4266f03f370fd
Reviewed-on: https://go-review.googlesource.com/37310
Reviewed-by: Robert Griesemer <gri@golang.org>
For compatibility with math/bits uint operations.
When math/big was written originally, the Go compiler used 32bit
int/uint values even on a 64bit machine. uintptr was the type that
represented the machine register size. Now, the int/uint types are
sized to the native machine register size, so they are the natural
machine Word type.
On most machines, the size of int/uint correspond to the size of
uintptr. On platforms where uint and uintptr have different sizes,
this change may lead to performance differences (e.g., amd64p32).
Change-Id: Ief249c160b707b6441848f20041e32e9e9d8d8ca
Reviewed-on: https://go-review.googlesource.com/37372
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Add exported global variables and store the results of benchmarked
functions in them. This prevents the current compiler optimizations
from removing the instructions that are needed to compute the return
values of the benchmarked functions.
Change-Id: If8b08424e85f3796bb6dd73e761c653abbabcc5e
Reviewed-on: https://go-review.googlesource.com/37195
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
- moved from: x&m>>k | x&^m<<k to: x&m>>k | x<<k&m
This permits use of the same constant m twice (*) which may be
better for machines that can't use large immediate constants
directly with an AND instruction and have to load them explicitly.
*) CPUs don't usually have a &^ instruction, so x&^m becomes x&(^m)
- simplified returns
This improves the generated code because the compiler recognizes
x>>k | x<<k as ROT when k is the bitsize of x.
The 8-bit versions of these instructions can be significantly faster
still if they are replaced with table lookups, as long as the table
is in cache. If the table is not in cache, table-lookup is probably
slower, hence the choice of an explicit register-only implementation
for now.
BenchmarkReverse-8 8.50 6.86 -19.29%
BenchmarkReverse8-8 2.17 1.74 -19.82%
BenchmarkReverse16-8 2.89 2.34 -19.03%
BenchmarkReverse32-8 3.55 2.95 -16.90%
BenchmarkReverse64-8 6.81 5.57 -18.21%
BenchmarkReverseBytes-8 3.49 2.48 -28.94%
BenchmarkReverseBytes16-8 0.93 0.62 -33.33%
BenchmarkReverseBytes32-8 1.55 1.13 -27.10%
BenchmarkReverseBytes64-8 2.47 2.47 +0.00%
Reverse-8 8.50ns ± 0% 6.86ns ± 0% ~ (p=1.000 n=1+1)
Reverse8-8 2.17ns ± 0% 1.74ns ± 0% ~ (p=1.000 n=1+1)
Reverse16-8 2.89ns ± 0% 2.34ns ± 0% ~ (p=1.000 n=1+1)
Reverse32-8 3.55ns ± 0% 2.95ns ± 0% ~ (p=1.000 n=1+1)
Reverse64-8 6.81ns ± 0% 5.57ns ± 0% ~ (p=1.000 n=1+1)
ReverseBytes-8 3.49ns ± 0% 2.48ns ± 0% ~ (p=1.000 n=1+1)
ReverseBytes16-8 0.93ns ± 0% 0.62ns ± 0% ~ (p=1.000 n=1+1)
ReverseBytes32-8 1.55ns ± 0% 1.13ns ± 0% ~ (p=1.000 n=1+1)
ReverseBytes64-8 2.47ns ± 0% 2.47ns ± 0% ~ (all samples are equal)
Change-Id: I0064de8c7e0e568ca7885d6f7064344bef91a06d
Reviewed-on: https://go-review.googlesource.com/37215
Run-TryBot: Robert Griesemer <gri@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Change-Id: I280c53be455f2fe0474ad577c0f7b7908a4eccb2
Reviewed-on: https://go-review.googlesource.com/36993
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The tests failed to compile when using the math_big_pure_go tag on
s390x.
Change-Id: I2a09f53ff6562ab9bc9b886cffc0f6205bbfcfbb
Reviewed-on: https://go-review.googlesource.com/36956
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
The s390x port was based on the ppc64 port and, because of the way the
port was done, inherited some instructions from it. ppc64 supports
3-operand (4-operand for FMADD etc.) floating point instructions
but s390x doesn't (the destination register is always an input) and
so these were emulated.
There is a bug in the emulation of FMADD whereby if the destination
register is also a source for the multiplication it will be
clobbered. This doesn't break any assembly code in the std lib but
could affect future work.
To fix this I have gone through the floating point instructions and
removed all unnecessary 3-/4-operand emulation. The compiler doesn't
need it and assembly writers don't need it, it's just a source of
bugs.
I've also deleted the FNMADD family of emulated instructions. They
aren't used anywhere.
Change-Id: Ic07cedcf141a6a3b43a0c84895460f6cfbf56c04
Reviewed-on: https://go-review.googlesource.com/33350
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Unlike the pure go implementation used by every other architecture,
the amd64 asm implementation of Exp does not fail early if the
argument is known to overflow. Make it fail early.
Cost of the check is < 1ns (on an old Sandy Bridge machine):
name old time/op new time/op delta
Exp-4 18.3ns ± 1% 18.7ns ± 1% +2.08% (p=0.000 n=18+20)
Fixes#14932Fixes#18912
Change-Id: I04b3f9b4ee853822cbdc97feade726fbe2907289
Reviewed-on: https://go-review.googlesource.com/36271
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
Reviewed-by: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
There is some code value too: types intending to implement
Source64 can write a conversion confirming that.
For #4254 and the Go 1.8 release notes.
Change-Id: I7fc350a84f3a963e4dab317ad228fa340dda5c66
Reviewed-on: https://go-review.googlesource.com/33456
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
After x.ProbablyPrime(n) passes the n Miller-Rabin rounds,
add a Baillie-PSW test before declaring x probably prime.
Although the provable error bounds are unchanged, the empirical
error bounds drop dramatically: there are no known inputs
for which Baillie-PSW gives the wrong answer. For example,
before this CL, big.NewInt(443*1327).ProbablyPrime(1) == true.
Now it is (correctly) false.
The new Baillie-PSW test is two pieces: an added Miller-Rabin
round with base 2, and a so-called extra strong Lucas test.
(See the references listed in prime.go for more details.)
The Lucas test takes about 3.5x as long as the Miller-Rabin round,
which is close to theoretical expectations.
name time/op
ProbablyPrime/Lucas 2.91ms ± 2%
ProbablyPrime/MillerRabinBase2 850µs ± 1%
ProbablyPrime/n=0 3.75ms ± 3%
The speed of prime testing for a prime input does get slower:
name old time/op new time/op delta
ProbablyPrime/n=1 849µs ± 1% 4521µs ± 1% +432.31% (p=0.000 n=10+9)
ProbablyPrime/n=5 4.31ms ± 3% 7.87ms ± 1% +82.70% (p=0.000 n=10+10)
ProbablyPrime/n=10 8.52ms ± 3% 12.28ms ± 1% +44.11% (p=0.000 n=10+10)
ProbablyPrime/n=20 16.9ms ± 2% 21.4ms ± 2% +26.35% (p=0.000 n=9+10)
However, because the Baillie-PSW test is only added when the old
ProbablyPrime(n) would return true, testing composites runs at
the same speed as before, except in the case where the result
would have been incorrect and is now correct.
In particular, the most important use of this code is for
generating random primes in crypto/rand. That use spends
essentially all its time testing composites, so it is not
slowed down by the new Baillie-PSW check:
name old time/op new time/op delta
Prime 104ms ±22% 111ms ±16% ~ (p=0.165 n=10+10)
Thanks to Serhat Şevki Dinçer for CL 20170, which this CL builds on.
Fixes#13229.
Change-Id: Id26dde9b012c7637c85f2e96355d029b6382812a
Reviewed-on: https://go-review.googlesource.com/30770
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
The tree is inconsistent about single l vs double l in those
words in documentation, test messages, and one error value text.
$ git grep -E '[Mm]arshall(|s|er|ers|ed|ing)' | wc -l
42
$ git grep -E '[Mm]arshal(|s|er|ers|ed|ing)' | wc -l
1694
Make it consistently a single l, per earlier decisions. This means
contributors won't be confused by misleading precedence, and it helps
consistency.
Change the spelling in one error value text in newRawAttributes of
crypto/x509 package to be consistent.
This change was generated with:
perl -i -npe 's,([Mm]arshal)l(|s|er|ers|ed|ing),$1$2,' $(git grep -l -E '[Mm]arshall' | grep -v AUTHORS | grep -v CONTRIBUTORS)
Updates #12431.
Follows https://golang.org/cl/14150.
Change-Id: I85d28a2d7692862ccb02d6a09f5d18538b6049a2
Reviewed-on: https://go-review.googlesource.com/33017
Run-TryBot: Minux Ma <minux@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Note, most math functions are structured to use stubs, so that they can
be accelerated with assembly on any platform.
Sinh, cosh, and tanh were not structued with stubs, so this CL does
that. This set of routines was chosen as likely to produce good speedups
with assembly on any platform.
Technique used was minimax polynomial approximation using tables of
polynomial coefficients, with argument range reduction.
A table of scaling factors was also used for cosh and log10.
before after speedup
BenchmarkCos 22.1 ns/op 6.79 ns/op 3.25x
BenchmarkCosh 125 ns/op 11.7 ns/op 10.68x
BenchmarkLog10 48.4 ns/op 12.5 ns/op 3.87x
BenchmarkSin 22.2 ns/op 6.55 ns/op 3.39x
BenchmarkSinh 125 ns/op 14.2 ns/op 8.80x
BenchmarkTanh 65.0 ns/op 15.1 ns/op 4.30x
Accuracy was tested against a high precision
reference function to determine maximum error.
Approximately 4,000,000 points were tested for each function,
producing the following result.
Note: ulperr is error in "units in the last place"
max
ulperr
sin 1.43 (returns NaN beyond +-2^50)
cos 1.79 (returns NaN beyond +-2^50)
cosh 1.05
sinh 3.02
tanh 3.69
log10 1.75
Also includes a set of tests to test non-vector functions even
when SIMD is enabled
Change-Id: Icb45f14d00864ee19ed973d209c3af21e4df4edc
Reviewed-on: https://go-review.googlesource.com/32352
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Munday <munday@ca.ibm.com>
The condition to determine if any further iterations are needed is
evaluated to false in case it encounters a NaN. Instead, flip the
condition to keep looping until the factor is greater than the machine
roundoff error.
Updates #17577
Change-Id: I058abe73fcd49d3ae4e2f7b33020437cc8f290c3
Reviewed-on: https://go-review.googlesource.com/31952
Reviewed-by: Robert Griesemer <gri@golang.org>
Implements Float.Scan which satisfies fmt.Scanner interface.
Also enforces docs' interface implementation claims with compile time
type assertions, that is:
+ Float always implements fmt.Formatter and fmt.Scanner
+ Int always implements fmt.Formatter and fmt.Scanner
+ Rat always implements fmt.Formatter
which will ensure that the API claims are strictly matched.
Also note that Float.Scan doesn't handle ±Inf.
Fixes#17391
Change-Id: I3d3dfbe7f602066975c7a7794fe25b4c645440ce
Reviewed-on: https://go-review.googlesource.com/30723
Reviewed-by: Robert Griesemer <gri@golang.org>
Add special case for Gamma(+∞) which speeds it up:
benchmark old ns/op new ns/op delta
BenchmarkGamma-4 14.5 7.44 -48.69%
The documentation for math.Gamma already specifies it as a special
case:
Gamma(+Inf) = +Inf
The original C code that has been used as the reference implementation
(as mentioned in the comments in gamma.go) also treats Gamma(+∞) as a
special case:
if( x == INFINITY )
return(x);
Change-Id: Idac36e19192b440475aec0796faa2d2c7f8abe0b
Reviewed-on: https://go-review.googlesource.com/31370
Reviewed-by: Robert Griesemer <gri@golang.org>
In addition to the DecimalConversion benchmark, that exercises the
String method of the internal decimal type on a range of small shifts,
add a few benchmarks for the big.Float String method. They can be used
to obtain more realistic data on the real-world performance of
big.Float printing.
Change-Id: I7ada324e7603cb1ce7492ccaf3382db0096223ba
Reviewed-on: https://go-review.googlesource.com/31275
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This is needed for some of the more complex primality tests
(to filter out exact squares), and while the code is simple the
boundary conditions are not obvious, so it seems worth having
in the library.
Change-Id: Ica994a6b6c1e412a6f6d9c3cf823f9b653c6bcbd
Reviewed-on: https://go-review.googlesource.com/30706
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
Inspired by Alberto Donizetti's observations in
https://go-review.googlesource.com/#/c/30099/.
name old time/op new time/op delta
DecimalConversion-8 138µs ± 1% 136µs ± 2% -1.85% (p=0.000 n=10+10)
10 runs each, measured on a Mac Mini, 2.3 GHz Intel Core i7.
Performance improvements varied between -1.25% to -4.4%; -1.85% is
about in the middle of the observed improvement. The generated code
is slightly shorter in the inner loops of the conversion code.
Change-Id: I10fb3b2843da527691c39ad5e5e5bd37ed63e2fa
Reviewed-on: https://go-review.googlesource.com/31250
Reviewed-by: Alan Donovan <adonovan@google.com>
1. Define behavior for Unmarshal of JSON null into Unmarshaler and
TextUnmarshaler. Specifically, an Unmarshaler will be given the
literal null and can decide what to do (because otherwise
json.RawMessage is impossible to implement), and a TextUnmarshaler
will be skipped over (because there is no text to unmarshal), like
most other inappropriate types. Document this in Unmarshal, with a
reminder in UnmarshalJSON about handling null.
2. Test all this.
3. Fix the TextUnmarshaler case, which was returning an unmarshalling
error, to match the definition.
4. Fix the error that had been used for the TextUnmarshaler, since it
was claiming that there was a JSON string when in fact the problem was
NOT having a string.
5. Adjust time.Time and big.Int's UnmarshalJSON to ignore null, as is
conventional.
Fixes#9037.
Change-Id: If78350414eb8dda712867dc8f4ca35a9db041b0c
Reviewed-on: https://go-review.googlesource.com/30944
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
A later CL will be adding more code here.
It will help to keep it separate from the other code.
Change-Id: I971ba53de819cd10991b51fdec665984939a5f9b
Reviewed-on: https://go-review.googlesource.com/30709
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
The Montgomery multiply code is applicable to this case
but was being bypassed. Don't do that.
The old test len(x) > 1 was really just a bad approximation to x > 1.
name old time/op new time/op delta
Exp-8 5.56ms ± 4% 5.73ms ± 3% ~ (p=0.095 n=5+5)
Exp2-8 7.59ms ± 1% 5.66ms ± 1% -25.40% (p=0.008 n=5+5)
This comes up especially when doing Fermat (Miller-Rabin)
primality tests with base 2.
Change-Id: I4cc02978db6dfa93f7f3c8f32718e25eedb4f5ed
Reviewed-on: https://go-review.googlesource.com/30708
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This way you can still run 'go test' or 'go bench -run Foo'
without wondering why it is taking so very long.
Change-Id: Icfa097a6deb1d6682acb7be9f34729215c29eabb
Reviewed-on: https://go-review.googlesource.com/30707
Reviewed-by: Robert Griesemer <gri@golang.org>
- Add new BenchmarkQuoRem.
- Eliminate allocation in divLarge nat pool
- Unroll mulAddVWW body 4x
- Remove some redundant slice loads in divLarge
name old time/op new time/op delta
QuoRem-8 2.18µs ± 1% 1.93µs ± 1% -11.38% (p=0.000 n=19+18)
The starting point in the comparison here is Cherry's
pending CL to turn mulWW and divWW into intrinsics.
The optimizations in divLarge work best because all
the function calls are gone. The effect of this CL is not
as large if you don't assume Cherry's CL.
Change-Id: Ia6138907489c5b9168497912e43705634e163b35
Reviewed-on: https://go-review.googlesource.com/30613
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Using 387 mode was computing it without underflow to zero,
apparently due to an 80-bit intermediate. Avoid underflow even
with 64-bit floats.
This eliminates the TODOs in the test suite.
Fixes linux-386-387 build and fixes#11441.
Change-Id: I8abaa63bfdf040438a95625d1cb61042f0302473
Reviewed-on: https://go-review.googlesource.com/30540
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
The 387 implementation is less accurate and slower.
name old time/op new time/op delta
Exp-8 29.7ns ± 2% 24.0ns ± 2% -19.08% (p=0.000 n=10+10)
This makes Gamma more accurate too.
Change-Id: Iad33b9cce0b087ccbce3e08ba7a6d285c4999d02
Reviewed-on: https://go-review.googlesource.com/30230
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Quentin Smith <quentin@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This adds Uint64 methods to Rand and rngSource.
Rand.Uint64 uses Source.Uint64 directly if it is present.
rngSource.Uint64 provides access to all 64 bits generated by the
underlying ALFG. To ensure high seed quality a 64th bit has been added
to all elements of the array of "cooked" random numbers that are used
for seeding. gen_cooked.go generates both the 63 bit and 64 bit array.
Fixes#4254
Change-Id: I22855618ac69abae3d2799b3e7e59996d4c5a4b1
Reviewed-on: https://go-review.googlesource.com/27253
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This makes function fit in 16 bytes, saving 16 bytes.
Change-Id: Iac5d2add42f6dae985b2a5cbe19ad4bd4bcc92ec
Reviewed-on: https://go-review.googlesource.com/29151
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
This adds the instructions frim, frip, and friz to the ppc64x
assembler for use in implementing the math.Floor, math.Ceil, and
math.Trunc functions to improve performance.
Fixes#17185
BenchmarkCeil-128 21.4 6.99 -67.34%
BenchmarkFloor-128 13.9 6.37 -54.17%
BenchmarkTrunc-128 12.7 6.33 -50.16%
Change-Id: I96131bd4e8c9c8dbafb25bfeb544cf9d2dbb4282
Reviewed-on: https://go-review.googlesource.com/29654
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Michael Munday <munday@ca.ibm.com>
Also adds the 'find leftmost one' instruction (FLOGR) and replaces the
WORD-encoded use of FLOGR in math/big with it.
Change-Id: I18e7cd19e75b8501a6ae8bd925471f7e37ded206
Reviewed-on: https://go-review.googlesource.com/29372
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
No need to test so many sizes in race mode, especially for a package
which doesn't use goroutines.
Reduces test time from 2.5 minutes to 25 seconds.
Updates #17104
Change-Id: I7065b39273f82edece385c0d67b3f2d83d4934b8
Reviewed-on: https://go-review.googlesource.com/29163
Reviewed-by: David Crawshaw <crawshaw@golang.org>
The new SSA backend for s390x can use R0 as a general purpose register.
This change modifies assembly code to either avoid using R0 entirely
or explicitly set R0 to 0.
R0 can still be safely used as 0 in address calculations.
Change-Id: I3efa723e9ef322a91a408bd8c31768d7858526c8
Reviewed-on: https://go-review.googlesource.com/28976
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
1.7 introduced a significant regression compared to 1.6:
SqrtIndirect-4 2.32ns ± 0% 7.86ns ± 0% +238.79% (p=0.000 n=20+18)
This is caused by sqrtsd preserving upper part of destination register.
Which introduces dependency on previous value of X0.
In 1.6 benchmark loop didn't use X0 immediately after call:
callq *%rbx
movsd 0x8(%rsp),%xmm2
movsd 0x20(%rsp),%xmm1
addsd %xmm2,%xmm1
mov 0x18(%rsp),%rax
inc %rax
jmp loop
In 1.7 however xmm0 is used just after call:
callq *%rbx
mov 0x10(%rsp),%rcx
lea 0x1(%rcx),%rax
movsd 0x8(%rsp),%xmm0
movsd 0x18(%rsp),%xmm1
I've verified that this is caused by dependency, by inserting
XORPS X0,X0 in the beginning of math.Sqrt, which puts performance back on 1.6 level.
Splitting SQRTSD mem,reg into:
MOVSD mem,reg
SQRTSD reg,reg
Removes dependency, because MOVSD (load version)
doesn't need to preserve upper part of a register.
And reg,reg operation is solved by renamer in CPU.
As a result of this change regression is gone:
SqrtIndirect-4 7.86ns ± 0% 2.33ns ± 0% -70.36% (p=0.000 n=18+17)
This also removes old Sqrt benchmarks, in favor of benchmarks measuring latency.
Only SqrtIndirect is kept, to show impact of this patch.
Change-Id: Ic7eebe8866445adff5bc38192fa8d64c9a6b8872
Reviewed-on: https://go-review.googlesource.com/28392
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
Reviewed-by: Keith Randall <khr@golang.org>
While it was previously explicitly documented that "the default Source"
is safe for concurrent use, a careless reader can interpret that as
meaning "the implementation of the Source interface created by functions
in this package" rather than "the default shared Source used by
top-level functions". Be explicit that the Source returned by NewSource
is not safe for use by multiple goroutines.
Fixes#3611.
Change-Id: Iae4bc04c3887ad6e2491e36e38feda40324022c5
Reviewed-on: https://go-review.googlesource.com/25501
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
The existing implementation used a pure go implementation, leading to slow
cryptographic performance.
Implemented mulWW, subVV, mulAddVWW, addMulVVW, and bitLen for
ppc64{le}.
Implemented divWW for ppc64le only, as the DIVDEU instruction is only
available on Power8 or newer.
benchcmp output:
benchmark old ns/op new ns/op delta
BenchmarkSignP384 28934360 10877330 -62.41%
BenchmarkRSA2048Decrypt 41261033 5139930 -87.54%
BenchmarkRSA2048Sign 45231300 7610985 -83.17%
Benchmark3PrimeRSA2048Decrypt 20487300 2481408 -87.89%
Fixes#16621
Change-Id: If8b68963bb49909bde832f2bda08a3791c4f5b7a
Reviewed-on: https://go-review.googlesource.com/26951
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Munday <munday@ca.ibm.com>
Add missing function prototypes.
Fix function prototypes.
Use FP references instead of SP references.
Fix variable names.
Update comments.
Clean up whitespace. (Not for vet.)
All fairly minor fixes to make vet happy.
Updates #11041
Change-Id: Ifab2cdf235ff61cdc226ab1d84b8467b5ac9446c
Reviewed-on: https://go-review.googlesource.com/27713
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
The Source provided by math/rand relies on an array of cooked
pseudo-random 63bit integers for seeding. The origin of these
numbers is undocumented.
Add a standalone program in math/rand folder that generates
the 63bit integer array as well as a 64bit version supporting
extension of the Source to 64bit pseudo-random number
generation while maintaining the current sequence in the
lower 63bit.
The code is largely based on the initial implementation of the
random number generator in the go repository by Ken Thompson
(revision 399).
Change-Id: Ib4192aea8127595027116a0f5a7be53f11dc110b
Reviewed-on: https://go-review.googlesource.com/22230
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
There are no synchronization points protecting the readVal and readPos
variables. This leads to a race when Read is called concurrently.
Fix this by adding methods to lockedSource, which is the case where
a race matters.
Fixes#16308.
Change-Id: Ic028909955700906b2d71e5c37c02da21b0f4ad9
Reviewed-on: https://go-review.googlesource.com/24852
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Do not throw away the rest of Int63 value used for
generation random bytes. Save it in Rand struct and
re-use during the next Read call.
Fixes#16124
Change-Id: Ic70bd80c3c3a6590e60ac615e8b3c2324589bea3
Reviewed-on: https://go-review.googlesource.com/24251
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Fixes#15788
Change-Id: I5a1fd1e5992f1c16cf8d8437d742bf02e1653b9c
Reviewed-on: https://go-review.googlesource.com/23461
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Previously, a 0 mantissa was special-cased during big.Float
parsing, but not during big.Rat parsing. This meant that a value
like 0e9999999999 would parse successfully in big.Float.SetString,
but would hang in big.Rat.SetString. This discrepancy became an
issue in https://golang.org/src/go/constant/value.go?#L250,
where the big.Float would report an exponent of 0, so
big.Rat.SetString would be used and would subsequently hang.
A Go Playground example of this is https://play.golang.org/p/3fy28eUJuF
The solution is to special-case a zero mantissa during big.Rat
parsing as well, so that neither big.Rat nor big.Float will hang when
parsing a value with 0 mantissa but a large exponent.
This was discovered using go-fuzz on CockroachDB:
https://github.com/cockroachdb/go-fuzz/blob/master/examples/parser/main.goFixes#16176
Change-Id: I775558a8682adbeba1cc9d20ba10f8ed26259c56
Reviewed-on: https://go-review.googlesource.com/24430
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
shortens code and gives an example of the use of Run.
Change-Id: I75ffaf762218a589274b4b62e19022e31e805d1b
Reviewed-on: https://go-review.googlesource.com/23424
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Marcel van Lohuizen <mpvl@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Document the fact that the default Source uses only
the bottom 31 bits of the given seed.
Fixes#15788
Change-Id: If20d1ec44a55c793a4a0a388f84b9392c2102bd1
Reviewed-on: https://go-review.googlesource.com/23352
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Follow-up to https://golang.org/cl/21755.
This turned out to be a bit more than just a few nits
as originally expected in that CL.
1) The actual mantissa may be shorter than required for the
given precision (because of trailing 0's): no need to
allocate space for it (and transmit 0's). This can save
a lot of space when the precision is high: E.g., for
prec == 1000, 16 words or 128 bytes are required at the
most, but if the actual number is short, it may be much
less (for the test cases present, it's significantly less).
2) The actual mantissa may be longer than the number of
words required for the given precision: make sure to
not overflow when encoding in bytes.
3) Add more documentation.
4) Add more tests.
Change-Id: I9f40c408cfdd9183a8e81076d2f7d6c75e7a00e9
Reviewed-on: https://go-review.googlesource.com/22324
Reviewed-by: Alan Donovan <adonovan@google.com>
Added GobEncode/Decode and a test for them.
Fixes#14593
Change-Id: Ic8d3efd24d0313a1a66f01da293c4c1fd39764a8
Reviewed-on: https://go-review.googlesource.com/21755
Reviewed-by: Robert Griesemer <gri@golang.org>
cmd and runtime were handled separately, and I'm intentionally skipped
syscall. This is the rest of the standard library.
CL generated mechanically with github.com/mdempsky/unconvert.
Change-Id: I9e0eff886974dedc37adb93f602064b83e469122
Reviewed-on: https://go-review.googlesource.com/22104
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Converting a big.Float value x to a float32/64 value did not correctly
round x up to the smallest denormal float32/64 if x was smaller than the
smallest denormal float32/64, but larger than 0.5 of a smallest denormal
float32/64.
Handle this case explicitly and simplify some code in the turn.
For #14651.
Change-Id: I025e24bf8f0e671581a7de0abf7c1cd7e6403a6c
Reviewed-on: https://go-review.googlesource.com/20816
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Alan Donovan <adonovan@google.com>
'b' is a standard verb for floating point values. The runes like '+'
and '#' are called "flags" by package fmt's documentation. The flag
'-' controls left/right justification, not anything related to signs.
Change-Id: Ia9cf81b002df373f274ce635fe09b5bd0066aa1c
Reviewed-on: https://go-review.googlesource.com/20930
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
The existing implementation uses code written in Go to
implement Sqrt; this adds the assembler to use the sqrt
instruction for Power and makes the necessary changes to
allow it to be inlined.
The following tests showed this relative improvement:
benchmark delta
BenchmarkSqrt -97.91%
BenchmarkSqrtIndirect -96.65%
BenchmarkSqrtGo -35.93%
BenchmarkSqrtPrime -96.94%
Fixes#14349
Change-Id: I8074f4dc63486e756587564ceb320aca300bf5fa
Reviewed-on: https://go-review.googlesource.com/19515
Reviewed-by: Minux Ma <minux@golang.org>
Adds a type of output to Examples that allows tests to have unordered
output. This is intended to help clarify when the output of a command
will produce a fixed return, but that return might not be in an constant
order.
Examples where this is useful would be documenting the rand.Perm()
call, or perhaps the (os.File).Readdir(), both of which can not guarantee
order, but can guarantee the elements of the output.
Fixes#10149
Change-Id: Iaf0cf1580b686afebd79718ed67ea744f5ed9fc5
Reviewed-on: https://go-review.googlesource.com/19280
Reviewed-by: Andrew Gerrand <adg@golang.org>
When a big.Float is converted to a denormal float32/64, the rounding
precision depends on the size of the denormal. Rounding may round up
and thus change the size (exponent) of the denormal. Recompute the
correct precision again for correct placement of the mantissa.
Fixes#14553.
Change-Id: Iedab5810a2d2a405cc5da28c6de7be34cb035b86
Reviewed-on: https://go-review.googlesource.com/20198
Reviewed-by: Alan Donovan <adonovan@google.com>
It appears to be a trivial dreg. Unreferenced. Gone.
Change-Id: I4a5ceed48e84254bc8a07fdb04487a18a0edf965
Reviewed-on: https://go-review.googlesource.com/20122
Run-TryBot: Rob Pike <r@golang.org>
Reviewed-by: Dave Cheney <dave@cheney.net>