We can't depend on init() order, and certainly we don't want to
register all future benchmarks that use jsonbytes or jsondata to init()
in json_test.go, so we use a more general solution: make generation of
jsonbytes and jsondata their own function so that the compiler will take
care of the order.
R=golang-dev, dave, rsc
CC=golang-dev
https://golang.org/cl/6282046
Split stdout/stderr into a separate file so that can be handled
differently on some platforms. Both NetBSD and OpenBSD have defines
for stdout/stderr that require some coercion in order for cgo to
handle them correctly.
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/6247062
pipe2 is equivalent to pipe with flags set to 0.
However, pipe2 was only added recently. Using pipe
instead improves compatibility with NetBSD 5.
R=jsing
CC=golang-dev
https://golang.org/cl/6268045
specifically, adds a go-test element to compilation-error-regexp-alist[-alist].
Fixes#3629.
R=golang-dev, rsc, sameer
CC=golang-dev, jba
https://golang.org/cl/6197091
The cleanup also makes it ~5% faster, but that's
not the point of this CL.
Optimizations can come in future CLs.
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/6286043
It's very unfortunate that the type of Data field of struct
RawSockaddr is [14]uint8 on Linux/ARM instead of [14]int8
on all the others.
btw, it should be [14]int8 according to my header files.
R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/6275050
Move address info flags to per-platform files. This is needed to
enable cgo on NetBSD (and later OpenBSD), as some of the currently
used AI_* defines do not exist on these platforms.
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/6250075
Use perfect cuckoo hash, to avoid binary search.
Define Atom bits as offset+len in long string instead
of enumeration, to avoid string headers.
Before: 1909 string bytes + 6060 tables = 7969 total data
After: 1406 string bytes + 2048 tables = 3454 total data
benchmark old ns/op new ns/op delta
BenchmarkLookup 83878 64681 -22.89%
R=nigeltao, r
CC=golang-dev
https://golang.org/cl/6262051
Ceil to 4.81 from 20.6 ns/op
Floor to 4.37 from 13.5 ns/op
Trunc to 3.97 from 14.3 ns/op
Also changed three MOVSDs to MOVAPDs in log_amd64.s
R=rsc, golang-dev
CC=golang-dev
https://golang.org/cl/6262048
Currently walk() doesn't check for err == SkipDir when iterating
a directory list, but such promise is made in the docs for WalkFunc.
Fixes#3486.
R=rsc, r
CC=golang-dev
https://golang.org/cl/6257059
Now that gri has made go/parser 15% faster, I offer this
change to slow back down cmd/api ~proportionately, adding
FreeBSD to the go1-checked set of platforms.
Really we should have done this earlier. This will prevent us
from breaking FreeBSD compatibility accidentally in the
future.
R=golang-dev, r
CC=golang-dev
https://golang.org/cl/6279044
To avoid goroutines during init, the nextItem function was a
clever workaround. Now that init goroutines are permitted,
restore the original, simpler design.
R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/6282043
- only compute current line position if needed
(i.e., if a comment is present)
- added benchmark
benchmark old ns/op new ns/op delta
BenchmarkParse 10902990 9313330 -14.58%
benchmark old MB/s new MB/s speedup
BenchmarkParse 5.31 6.22 1.17x
R=golang-dev, r
CC=golang-dev
https://golang.org/cl/6270043
Saving the code in case we improve things enough that
it matters later, but at least right now it is not worth doing.
R=ken2
CC=golang-dev
https://golang.org/cl/6248071
The previous code was preparing arrays of entries that would be
filled if there was one entry every 128 bytes. Moving to a 4096
byte interval reduces the overhead per megabyte of address space
to 2kB from 64kB (on 64-bit systems).
The performance impact will be negative for very small MemProfileRate.
test/bench/garbage/tree2 -heapsize 800000000 (default memprofilerate)
Before: mprof 65993056 bytes (1664 bucketmem + 65991392 addrmem)
After: mprof 1989984 bytes (1680 bucketmem + 1988304 addrmem)
R=golang-dev, rsc
CC=golang-dev, remy
https://golang.org/cl/6257069
The previous heap profile format did not include buckets with
zero used bytes. Also add several missing MemStats fields in
debug mode.
R=golang-dev, rsc
CC=golang-dev, remy
https://golang.org/cl/6249068
Drop expecttaken function in favor of extra argument
to gbranch and bgen. Mark loop condition as likely to
be true, so that loops are generated inline.
The main benefit here is contiguous code when trying
to read the generated assembly. It has only minor effects
on the timing, and they mostly cancel the minor effects
that aligning function entry points had. One exception:
both changes made Fannkuch faster.
Compared to before CL 6244066 (before aligned functions)
benchmark old ns/op new ns/op delta
BenchmarkBinaryTree17 4222117400 4201958800 -0.48%
BenchmarkFannkuch11 3462631800 3215908600 -7.13%
BenchmarkGobDecode 20887622 20899164 +0.06%
BenchmarkGobEncode 9548772 9439083 -1.15%
BenchmarkGzip 151687 152060 +0.25%
BenchmarkGunzip 8742 8711 -0.35%
BenchmarkJSONEncode 62730560 62686700 -0.07%
BenchmarkJSONDecode 252569180 252368960 -0.08%
BenchmarkMandelbrot200 5267599 5252531 -0.29%
BenchmarkRevcomp25M 980813500 985248400 +0.45%
BenchmarkTemplate 361259100 357414680 -1.06%
Compared to tip (aligned functions):
benchmark old ns/op new ns/op delta
BenchmarkBinaryTree17 4140739800 4201958800 +1.48%
BenchmarkFannkuch11 3259914400 3215908600 -1.35%
BenchmarkGobDecode 20620222 20899164 +1.35%
BenchmarkGobEncode 9384886 9439083 +0.58%
BenchmarkGzip 150333 152060 +1.15%
BenchmarkGunzip 8741 8711 -0.34%
BenchmarkJSONEncode 65210990 62686700 -3.87%
BenchmarkJSONDecode 249394860 252368960 +1.19%
BenchmarkMandelbrot200 5273394 5252531 -0.40%
BenchmarkRevcomp25M 996013800 985248400 -1.08%
BenchmarkTemplate 360620840 357414680 -0.89%
R=ken2
CC=golang-dev
https://golang.org/cl/6245069
On 6l and 8l, this is a real instruction, guaranteed to
cause an 'undefined instruction' exception.
On 5l, we simulate it as BL to address 0.
The plan is to use it as a signal to the linker that this
point in the instruction stream cannot be reached
(hence the changes to nofollow). This will help the
compiler explain that panicindex and friends do not
return without having to put a list of these functions
in the linker.
R=ken2
CC=golang-dev
https://golang.org/cl/6255064
16 seems pretty standard on x86 for function entry.
I don't know if ARM would benefit, so I used just 4
(single instruction alignment).
This has a minor absolute effect on the current timings.
The main hope is that it will make them more consistent from
run to run.
benchmark old ns/op new ns/op delta
BenchmarkBinaryTree17 4222117400 4140739800 -1.93%
BenchmarkFannkuch11 3462631800 3259914400 -5.85%
BenchmarkGobDecode 20887622 20620222 -1.28%
BenchmarkGobEncode 9548772 9384886 -1.72%
BenchmarkGzip 151687 150333 -0.89%
BenchmarkGunzip 8742 8741 -0.01%
BenchmarkJSONEncode 62730560 65210990 +3.95%
BenchmarkJSONDecode 252569180 249394860 -1.26%
BenchmarkMandelbrot200 5267599 5273394 +0.11%
BenchmarkRevcomp25M 980813500 996013800 +1.55%
BenchmarkTemplate 361259100 360620840 -0.18%
R=ken2
CC=golang-dev
https://golang.org/cl/6244066