Still to do:
- composite types
- user-defined scanners
- format-driven scanning
The package comment will be updated when more of the functionality is in place.
R=rsc
CC=golang-dev
https://golang.org/cl/1252045
On my laptop, time to prepare and write 800x600 pixels over the
socket falls from 125-ish ms to 80-ish ms.
Thanks to Roger Peppe for the suggestion.
R=r
CC=golang-dev
https://golang.org/cl/1228044
Time to draw.Draw a 200x200 image fell from 18.4ms (and 1 malloc) to
5.6ms (and 0 mallocs). It's still relatively slow since it assumes
nothing about the src or mask images, but it does remove the malloc.
There are existing faster, more specialized paths for copies, fills
and image glyph masks.
Also added a "compare to a slow but obviously correct implementation"
check to draw_test.go.
R=rsc, r
CC=golang-dev
https://golang.org/cl/1223044
- implemented setWord, use it where setUint64 is wrong
- divLarge: use fast mulWW, divWW; implemented mulWW, divWW
- better assembly code for addMulVVW
R=rsc
CC=golang-dev
https://golang.org/cl/1258042
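For readers unfamiliar with the double-word primitives named above,
here is a rough sketch of what mulWW and divWW compute. It assumes
64-bit words and leans on today's math/bits package; the change itself
implements these routines directly, so this only illustrates the
contract, not the actual code.

```go
package main

import (
	"fmt"
	"math/bits"
)

// mulWW returns the full double-word product x*y as (hi, lo).
// Sketch only: delegates to math/bits rather than hand-written code.
func mulWW(x, y uint64) (hi, lo uint64) {
	return bits.Mul64(x, y)
}

// divWW divides the double word (hi, lo) by d and returns the quotient
// and remainder. The caller must ensure hi < d so the quotient fits in
// one word; bits.Div64 panics otherwise.
func divWW(hi, lo, d uint64) (q, r uint64) {
	return bits.Div64(hi, lo, d)
}

func main() {
	hi, lo := mulWW(1<<63, 6)
	fmt.Println("product:", hi, lo)

	q, r := divWW(hi, lo, 6)
	fmt.Println("quotient, remainder:", q, r)
}
```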
- support for binary prefix 0b (to match fmt.Format)
- renamed nat.new -> nat.setUint64 for consistency
- more tests
R=r
CC=golang-dev
https://golang.org/cl/1233041
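As an illustration of the binary prefix, the sketch below parses
prefixed strings with base 0, using the math/big API as it exists
today (the method names at the time of this change may have differed).
The helper name parse is hypothetical.

```go
package main

import (
	"fmt"
	"math/big"
)

// parse converts s to a big.Int. With base 0 the prefix selects the base:
// "0b" for binary, "0x" for hexadecimal, no prefix for decimal.
func parse(s string) (*big.Int, bool) {
	return new(big.Int).SetString(s, 0)
}

func main() {
	for _, s := range []string{"0b1011", "0x1f", "42"} {
		v, ok := parse(s)
		fmt.Println(s, v, ok)
	}
}
```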
Also update range of Phase and Polar due to signed zero.
[Phase(cmplx(-1, +0)) = pi and Phase(cmplx(-1, -0)) = -pi]
R=rsc, r
CC=golang-dev
https://golang.org/cl/1235041
Timings (as for change 1122043) go from 49ms to 48-ish ms. The
difference is mostly lost in the noise, but it probably doesn't hurt.
R=r
CC=golang-dev
https://golang.org/cl/1179041
To draw.Draw a 32x32 image.Alpha 10000 times,
Before: 633ms with 10000 mallocs
After: 49ms with 0 mallocs
These times are just blitting an image.Alpha, and do not include
rasterizing a glyph's vector contours to an image.Alpha.
The "generic" test case in draw_test.go tests this fast path.
R=rsc
CC=golang-dev
https://golang.org/cl/1122043
- removed last argument (n) from all core arithmetic routines;
instead, use the length of the result
- simplified nat.make implementation and chose a better capacity
for new values, removed a TODO in the process
Changing the constant e from 1 (old) to 4 (new) improved
pidigits -s -n 10000 by ~9% (on a 3.06GHz Intel Core 2 Duo):
user 0m3.882s (old)
user 0m3.549s (new)
R=rsc
CC=golang-dev
https://golang.org/cl/1133043
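The "use the length of the result" convention can be sketched as
below. The name addVV and the use of []uint with math/bits are
assumptions for illustration; the actual core routines are not shown
in this message.

```go
package main

import (
	"fmt"
	"math/bits"
)

// addVV sets z = x + y word by word and returns the final carry.
// There is no separate length argument: the number of words processed
// is len(z), and x and y must be at least that long.
func addVV(z, x, y []uint) (c uint) {
	for i := range z {
		z[i], c = bits.Add(x[i], y[i], c)
	}
	return c
}

func main() {
	x := []uint{^uint(0), 0} // low word is all ones
	y := []uint{1, 0}
	z := make([]uint, 2)
	c := addVV(z, x, y)
	fmt.Println(z, c)
}
```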
This permits cgo callbacks to work when run in init code.
Otherwise cgocallback switches to the wrong stack address.
R=rsc
CC=golang-dev
https://golang.org/cl/1123043
- no need to make copies in cases of aliases
- removed deprecated internal shift functions
- minor unrelated simplifications
This change improves pidigits -s -n10000 by almost 20%:
user 0m6.156s (old)
user 0m4.999s (new)
(pidigits -s -n20000 goes from ~25s to ~19s)
R=rsc
CC=golang-dev
https://golang.org/cl/1149041
Because maps are mostly a hidden type, they must be
implemented using reflection values and will not be as
efficient as arrays and slices.
R=rsc
CC=golang-dev
https://golang.org/cl/1127041
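To show what handling a map "using reflection values" looks like, here
is a small sketch that copies a map of any key and element type
through the reflect package as it exists today. The helper copyMap is
hypothetical and is not taken from the change.

```go
package main

import (
	"fmt"
	"reflect"
)

// copyMap returns a shallow copy of m, which must be a map of any type.
// Every key and element is read and written as a reflect.Value, which is
// why this is slower than indexing an array or slice directly.
func copyMap(m interface{}) interface{} {
	src := reflect.ValueOf(m)
	if src.Kind() != reflect.Map {
		panic("copyMap: argument is not a map")
	}
	dst := reflect.MakeMap(src.Type())
	for _, k := range src.MapKeys() {
		dst.SetMapIndex(k, src.MapIndex(k))
	}
	return dst.Interface()
}

func main() {
	orig := map[string]int{"a": 1, "b": 2}
	dup := copyMap(orig).(map[string]int)
	dup["c"] = 3 // does not affect orig
	fmt.Println(len(orig), len(dup))
}
```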
- renamed Len -> BitLen, simplified implementation
- renamed old Div, Mod, DivMod -> Quo, Rem, QuoRem
- implemented Div, Mod, DivMod (Euclidean definition, more
useful in a mathematical context)
- fixed a bug in Exp (-0 was possible)
- added extra tests to check normalized results everywhere
- uniformly set Int.neg flag at the end of computations
- minor cosmetic cleanups
- ran all tests
R=rsc
CC=golang-dev
https://golang.org/cl/1091041
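The difference between the truncated pair (Quo, Rem) and the Euclidean
pair (Div, Mod) only shows up for negative operands. The sketch below
uses the math/big API as it exists today; the helper name divide is
hypothetical.

```go
package main

import (
	"fmt"
	"math/big"
)

// divide returns the truncated quotient and remainder (Quo, Rem) and the
// Euclidean quotient and modulus (Div, Mod) of a and b.
func divide(a, b int64) (quo, rem, div, mod int64) {
	x, y := big.NewInt(a), big.NewInt(b)
	quo = new(big.Int).Quo(x, y).Int64() // rounds toward zero
	rem = new(big.Int).Rem(x, y).Int64() // takes the sign of x
	div = new(big.Int).Div(x, y).Int64() // chosen so that mod >= 0
	mod = new(big.Int).Mod(x, y).Int64() // always in [0, |y|)
	return
}

func main() {
	quo, rem, div, mod := divide(-7, 2)
	fmt.Println("Quo, Rem:", quo, rem)
	fmt.Println("Div, Mod:", div, mod)
}
```

Both pairs satisfy x = y*q + r; they differ in which way the quotient
is rounded.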
Import _mulv from Inferno again, change R9 to R2.
Not sure what the other differences were for, but
they weren't working.
TBR=kaib
CC=golang-dev
https://golang.org/cl/1079041
When trying to regenerate src/pkg/runtime/darwin/386/defs.h
on a 64 bit capable Snow Leopard (OS X 10.6.3) system I
needed to add -f -m32 to godefs, as this OS and hardware
combination defaults to 64 bit compilation.
For safety, make the same change to the 32 bit FreeBSD
instructions in .../freebsd/defs.c. (Tested OK and no
problems introduced.)
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/1052042
- fixed a couple of bugs in the process
(shift right was incorrect for negative numbers)
- added more tests and made some tests more robust
- changed pidigits back to using shifts to multiply
by 2 instead of add
This improves pidigits -s -n 10000 by approx. 5%:
user 0m6.496s (old)
user 0m6.156s (new)
R=rsc
CC=golang-dev
https://golang.org/cl/963044
This results in an improvement of > 35% for the existing Mul benchmark
using the same karatsuba threshold, and an improvement of > 50% with
a slightly higher threshold (32 instead of 30):
big.BenchmarkMul 500 6731846 ns/op (old alg.)
big.BenchmarkMul 500 4351122 ns/op (new alg.)
big.BenchmarkMul 500 3133782 ns/op (new alg., new threshold)
Also:
- tweaked calibrate.go, use same benchmark as for Mul benchmark
R=rsc
CC=golang-dev
https://golang.org/cl/1037041
Plus:
- calibration "test" - include in tests with gotest -calibrate
- basic Mul benchmark
- extra multiplication tests
- various cleanups
This change improves multiplication speed of numbers >= 30 words
in length (current threshold; found empirically with calibrate).
The multiplication benchmark (multiplication of a variety of long numbers)
improves by ~35%; individual multiplies can be significantly faster.
gotest -benchmarks=Mul
big.BenchmarkMul 500 6829290 ns/op (w/ Karatsuba)
big.BenchmarkMul 100 10600760 ns/op
There's no impact on pidigits for -n=10000 or -n=20000
because the operands are too small.
R=rsc
CC=golang-dev
https://golang.org/cl/1004042