This bug has been introduced in the following revision:
changeset: 11404:26dceba5c610
user: Ivan Krasin <krasin@golang.org>
date: Mon Jan 23 09:19:39 2012 -0500
summary: compress/flate: reduce memory pressure at cost of additional arithmetic operation.
This is the review page for that CL: https://golang.org/cl/5555070/
R=rsc, imkrasin
CC=golang-dev
https://golang.org/cl/6249067
Most significant in mandelbrot, from avoiding MOVSD between registers,
but there are others.
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/6258063
Surprise! The C code is using floating point values for its counters.
Its off the critical path, but the Go code and C code are supposed to
be as similar as possible to make comparisons meaningful.
It doesn't have a significant effect.
R=golang-dev, r
CC=golang-dev
https://golang.org/cl/6260058
This uses the patch output of gofmt (-d option) and applies each
chunk to the buffer, instead of replacing the whole buffer. The
main advantage is that the undo history is kept across gofmt'ings,
so it can really be used as a before-save-hook.
R=sameer, sameer
CC=golang-dev
https://golang.org/cl/6198047
The correct procid is needed for unparking LWPs on NetBSD - always
initialise procid in minit() so that cgo works correctly. The non-cgo
case already works correctly since procid is initialised via
lwp_create().
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/6257071
On NetBSD a cgo enabled binary has more than 32 sections - bump NSECTS
so that we can actually link them successfully.
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/6261052
filterPaeth takes []byte arguments instead of byte arguments,
which avoids some redudant computation of the previous pixel
in the inner loop.
Also eliminate a bounds check in decoding the up filter.
benchmark old ns/op new ns/op delta
BenchmarkDecodeGray 3139636 2812531 -10.42%
BenchmarkDecodeNRGBAGradient 12341520 10971680 -11.10%
BenchmarkDecodeNRGBAOpaque 10740780 9612455 -10.51%
BenchmarkDecodePaletted 1819535 1818913 -0.03%
BenchmarkDecodeRGB 8974695 8178070 -8.88%
R=rsc
CC=golang-dev
https://golang.org/cl/6243061
A block with finalizer might also be profiled. The special bit
is needed to unregister the block from the profile. It will be
unset only when the block is freed.
Fixes#3668.
R=golang-dev, rsc
CC=golang-dev, remy
https://golang.org/cl/6249066
The check for Stringer etc. can only fire if the test is not a builtin, so avoid
the expensive check if we know there's no chance.
Also put in a fast path for pad, which saves a more modest amount.
benchmark old ns/op new ns/op delta
BenchmarkSprintfEmpty 148 152 +2.70%
BenchmarkSprintfString 585 497 -15.04%
BenchmarkSprintfInt 441 396 -10.20%
BenchmarkSprintfIntInt 718 603 -16.02%
BenchmarkSprintfPrefixedInt 676 621 -8.14%
BenchmarkSprintfFloat 1003 953 -4.99%
BenchmarkManyArgs 2945 2312 -21.49%
BenchmarkScanInts 1704152 1734441 +1.78%
BenchmarkScanRecursiveInt 1837397 1828920 -0.46%
R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/6245068
Two fixes for indentation problems:
1. Properly recognize multi-line strings. These start with `, not ".
2. Don't indent a line if the beginning of the line is the end of a multi-line string. This happened for instance when inserting a closing bracket after a multi-line string.
R=sameer
CC=golang-dev
https://golang.org/cl/6157044
This prevents clients from seeing RSTs and missing the response
body.
TCP stacks vary. The included test failed on Darwin before but
passed on Linux.
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/6256066
It was only being used for (*Stmt).Exec, not Query, and not for
the same two methods on *DB.
This unifies (*Stmt).Exec's old inline code into the old
subsetArgs function, renaming it in the process (changing the
old word "subset" to "driver", mostly converted earlier)
Fixes#3640
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/6258045
It's sad to introduce a new macro, but rnd shows up consistently
in profiles, and the function call overwhelms the two arithmetic
instructions it performs.
R=r
CC=golang-dev
https://golang.org/cl/6260051
Moving panic out of line speeds up fannkuch almost a factor of two.
Changes to bitwhacking code affect mandelbrot badly.
R=golang-dev, bradfitz, rsc, r
CC=golang-dev
https://golang.org/cl/6258056
Plan 9 versions for amd64 have 2 megabyte pages.
This also fixes the logic for 32-bit vs 64-bit Plan 9,
making 64-bit the default, and adds logic to generate
a symbols table.
R=golang-dev, rsc, rminnich, ality, 0intro
CC=golang-dev, john
https://golang.org/cl/6218046
The old code generated for a bounds check was
CMP
JLT ok
CALL panicindex
ok:
...
The new code is (once the linker finishes with it):
CMP
JGE panic
...
panic:
CALL panicindex
which moves the calls out of line, putting more useful
code in each cache line. This matters especially in tight
loops, such as in Fannkuch. The benefit is more modest
elsewhere, but real.
From test/bench/go1, amd64:
benchmark old ns/op new ns/op delta
BenchmarkBinaryTree17 6096092000 6088808000 -0.12%
BenchmarkFannkuch11 6151404000 4020463000 -34.64%
BenchmarkGobDecode 28990050 28894630 -0.33%
BenchmarkGobEncode 12406310 12136730 -2.17%
BenchmarkGzip 179923 179903 -0.01%
BenchmarkGunzip 11219 11130 -0.79%
BenchmarkJSONEncode 86429350 86515900 +0.10%
BenchmarkJSONDecode 334593800 315728400 -5.64%
BenchmarkRevcomp25M 1219763000 1180767000 -3.20%
BenchmarkTemplate 492947600 483646800 -1.89%
And 386:
benchmark old ns/op new ns/op delta
BenchmarkBinaryTree17 6354902000 6243000000 -1.76%
BenchmarkFannkuch11 8043769000 7326965000 -8.91%
BenchmarkGobDecode 19010800 18941230 -0.37%
BenchmarkGobEncode 14077500 13792460 -2.02%
BenchmarkGzip 194087 193619 -0.24%
BenchmarkGunzip 12495 12457 -0.30%
BenchmarkJSONEncode 125636400 125451400 -0.15%
BenchmarkJSONDecode 696648600 685032800 -1.67%
BenchmarkRevcomp25M 2058088000 2052545000 -0.27%
BenchmarkTemplate 602140000 589876800 -2.04%
To implement this, two new instruction forms:
JLT target // same as always
JLT $0, target // branch expected not taken
JLT $1, target // branch expected taken
The linker could also emit the prediction prefixes, but it
does not: expected taken branches are reversed so that the
expected case is not taken (as in example above), and
the default expectaton for such a jump is not taken
already.
R=golang-dev, gri, r, dave
CC=golang-dev
https://golang.org/cl/6248049
Implement the (3-per-family) Noah's Ark clause (i.e. don't put
more than three identical elements on the list of active formatting
elements.
Also, when running tests, sort attributes by name before dumping
them.
Pass 4 additional tests with Noah's Ark clause (including one
that needs attributes to be sorted).
Pass 5 additional, unrelated tests because of sorting attributes.
R=nigeltao, rsc
CC=golang-dev
https://golang.org/cl/6247056
CanonicalHeaderKey didn't allocate, but it did use unnecessary
CPU in the hot path, deciding it didn't need to allocate.
I considered using constants for all these common header keys
but I didn't think it would be prettier. "Content-Length" looks
better than contentLength or hdrContentLength, etc.
R=golang-dev, dave
CC=golang-dev
https://golang.org/cl/6255053