Flame motivated me to get around to adding extended key usage support
so that code signing certificates can't be used for TLS server
authentication and vice versa.
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/6304065
if we were to use sizeof(sa.sa_mask) instead of 8 as the last argument
to rt_sigaction, we would have already fixed this bug, so also updated
Linux/386 and Linux/amd64 files to use that; also test the return value
of rt_sigaction.
R=dave, rsc
CC=golang-dev
https://golang.org/cl/6297087
If the server replies with an HTTP response before we're done
writing our body (for instance "401 Unauthorized" response), we
were previously ignoring that, since we returned our write
error ("broken pipe", etc) before ever reading the response.
Now we read and write at the same time.
Fixes#3595
R=rsc, adg
CC=golang-dev
https://golang.org/cl/6238043
Also, fixes one violation found during testing where both
response and error could be non-nil when a CheckRedirect test
failed. This is arguably a minor API (behavior, not
signature) change, but it wasn't documented either way and was
inconsistent & non-Go like. Any code depending on the old
behavior was wrong anyway.
R=adg, rsc
CC=golang-dev
https://golang.org/cl/6307088
A comment to that effect was introduced
with rev d332f4b9cef5 but the respective
code wasn't deleted.
R=golang-dev, dsymonds
CC=golang-dev
https://golang.org/cl/6304086
Fixes a situation where a nested bad type would still
permit the outer type to install a working engine, leading
to inconsistent behavior.
Fixes#3273.
R=bsiegert, rsc
CC=golang-dev
https://golang.org/cl/6294067
A comment map associates comments with AST nodes
and permits correct updating of the AST's comment
list when the AST is manipulated.
R=rsc
CC=golang-dev
https://golang.org/cl/6281044
This slipped in with the implementation of Getpid in CL 5909043.
I'd exclude that CL entirely but it is tangled up in the Win32finddata changes.
R=golang-dev, minux.ma
CC=golang-dev
https://golang.org/cl/6297065
Preserve old API by using correct struct in system call
and then copying the results, as we did for SetsockoptLinger.
R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/6307065
This new error is the only API change in the current draft of
Go 1.0.2 CLs. I'd like to include the CL that introduced it,
because it replaces a mysterious 'internal error' with a
useful error message, but I don't want any API changes,
so unexport the error constant for now. It can be
re-exported for Go 1.1.
R=golang-dev, r
CC=golang-dev
https://golang.org/cl/6294055
Fixes#3708.
The fix to allow 5{c,g,l} to compile under clang 3.1 broke cross
compilation on darwin using the Apple default compiler on 10.7.3.
This failure was introduced in 9b455eb64690.
This has been tested by cross compiling on darwin/amd64 to linux/arm using
* gcc version 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.1.00)
* clang version 3.1 (branches/release_31)
As well as on linux/arm using
* gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)
* Ubuntu clang version 3.0-6ubuntu3 (tags/RELEASE_30/final) (based on LLVM 3.0)
* Debian clang version 3.1-4 (branches/release_31) (based on LLVM 3.1)
R=consalus, rsc
CC=golang-dev
https://golang.org/cl/6307058
The type declarations were being generated using
a range over a map, which meant that successive
runs produced different orders. This will make sure
successive runs produce the same files.
Fixes#3707.
R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/6300062
If there are mutually recursive functions, there is a cycle in
the dependency graph, so the order is actually dependency order
among the strongly connected components: mutually recursive
functions get put into the same batch and analyzed together.
(Until now the entire package was put in one batch.)
The non-recursive case (single function, maybe with some
closures inside) will be able to be more precise about inputs
that escape only back to outputs, but that is not implemented yet.
R=ken2
CC=golang-dev, lvd
https://golang.org/cl/6304050
CL 4313064 fixed its test case but did not address a
general enough problem:
type T1 struct { F *T2 }
type T2 T1
type T3 T2
could still end up copying the definition of T1 for T2
before T1 was done being evaluated, or T3 before T2
was done.
In order to propagate the updates correctly,
record a copy of an incomplete type for re-execution
once the type is completed. Roll back CL 4313064.
Fixes#3709.
R=ken2
CC=golang-dev, lstoakes
https://golang.org/cl/6301059
This is part 1 of a 2 part changelist. Part 2 contains the mechanical
change to parse.go to compare atoms (ints) instead of strings.
The overall effect of the two changes are:
benchmark old ns/op new ns/op delta
BenchmarkParser 4462274 4058254 -9.05%
BenchmarkRawLevelTokenizer 913202 912917 -0.03%
BenchmarkLowLevelTokenizer 1268626 1267836 -0.06%
BenchmarkHighLevelTokenizer 1947305 1968944 +1.11%
R=rsc
CC=andybalholm, golang-dev, r
https://golang.org/cl/6305053
The reordering speedup in CL 6245068 changed the semantics
of %#v by delaying the clearing of some flags. Restore the old
semantics and add a test.
Fixes#3706.
R=golang-dev, r
CC=golang-dev
https://golang.org/cl/6302048
CL 6250075 removed AI_MASK mask on all BSD variants,
however FreeBSD's AI_MASK does not include AI_V4MAPPED
and AI_ALL, and its libc is strict about the ai_flags.
This will fix the FreeBSD builder.
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/6305054
On netbsd/386, tv_sec is a 64-bit integer for both timeval and timespec.
Fix the time handling code so that it works correctly.
R=golang-dev, rsc, m4dh4tt3r
CC=golang-dev
https://golang.org/cl/6256056
Thanks to Dave Cheney for the magic words "comm page".
benchmark old ns/op new ns/op delta
BenchmarkNow 197 33 -83.05%
This should make profiling a little better on OS X.
The raw time saved is unlikely to matter: what likely matters
more is that it seems like OS X sends profiling signals on the
way out of system calls more often than it should; avoiding
the system call should increase the accuracy of cpu profiles.
The 386 version would be similar but needs to do different
math for CPU speeds less than 1 GHz. (Apparently Apple has
never shipped a 64-bit CPU with such a slow clock.)
R=golang-dev, bradfitz, dave, minux.ma, r
CC=golang-dev
https://golang.org/cl/6275056
amd64 was done in CL 6275056.
We don't attempt to handle machines with clock speeds
less than 1 GHz. Those will fall back to the system call.
benchmark old ns/op new ns/op delta
BenchmarkNow 364 38 -89.53%
R=golang-dev, r
CC=golang-dev
https://golang.org/cl/6307045
Using an int64 for a block size doesn't make
sense on 32bit platforms but extracts a performance
penalty dealing with double word quantities on Arm.
linux/arm
benchmark old ns/op new ns/op delta
BenchmarkGobDecode 155401600 144589300 -6.96%
BenchmarkGobEncode 72772220 62460940 -14.17%
BenchmarkGzip 5822632 2604797 -55.26%
BenchmarkGunzip 326321 151721 -53.51%
benchmark old MB/s new MB/s speedup
BenchmarkGobDecode 4.94 5.31 1.07x
BenchmarkGobEncode 10.55 12.29 1.16x
R=golang-dev, rsc, bradfitz
CC=golang-dev
https://golang.org/cl/6272047
The original implementation of closures created the
underlying top-level function during walk, which is fairly
late in the compilation process and caused ordering-based
complications due to earlier stages that had to be repeated
any number of times.
Create the underlying function during typecheck, much
earlier, so that later stages can be run just once.
The result is a simpler compilation sequence.
R=ken2
CC=golang-dev
https://golang.org/cl/6279049
although the comment says it uses libc's getenv, without NOPLAN9DEFINES
it actually uses p9getenv which strdups.
R=golang-dev, dave, rsc
CC=golang-dev
https://golang.org/cl/6285046
The recent shuffle in parsing formats exposed probably unintentional
behavior in time.Parse, namely that it was mostly ignoring ".99999"
in the format, producing the following behavior:
fmt.Println(time.Parse("03:04:05.999 MST", "12:00:00.888 PDT")) // error (.888 unexpected)
fmt.Println(time.Parse("03:04:05.999", "12:00:00")) // error (input too short)
fmt.Println(time.Parse("03:04:05.999 MST", "12:00:00 PDT")) // ok (extra bytes on input make it ok)
http://play.golang.org/p/ESJ1UYXzq2
API CHANGE:
This CL makes all three examples valid: ".999" can match an
empty string or else a fractional second with at most nine digits.
Fixes#3701.
R=r, r
CC=golang-dev
https://golang.org/cl/6267045
An attempt to profit from CL 6176043 (fix to superpolinomial
runtime of karatsuba multiplication) and determine a better
karatsuba threshold. The result indicates that 32 is still
a reasonable value. Left the threshold as is (== 32), but
made some minor changes to the calibrate code which are
worthwhile saving (use of existing benchmarking code for
better results, better use of package time).
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/6260062
pipe2 is equivalent to pipe with flags set to 0.
However, pipe2 was only added recently. Using pipe
instead improves compatibility with NetBSD 5.
R=jsing
CC=golang-dev
https://golang.org/cl/6268045
The cleanup also makes it ~5% faster, but that's
not the point of this CL.
Optimizations can come in future CLs.
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/6286043
It's very unfortunate that the type of Data field of struct
RawSockaddr is [14]uint8 on Linux/ARM instead of [14]int8
on all the others.
btw, it should be [14]int8 according to my header files.
R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/6275050
Move address info flags to per-platform files. This is needed to
enable cgo on NetBSD (and later OpenBSD), as some of the currently
used AI_* defines do not exist on these platforms.
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/6250075
Use perfect cuckoo hash, to avoid binary search.
Define Atom bits as offset+len in long string instead
of enumeration, to avoid string headers.
Before: 1909 string bytes + 6060 tables = 7969 total data
After: 1406 string bytes + 2048 tables = 3454 total data
benchmark old ns/op new ns/op delta
BenchmarkLookup 83878 64681 -22.89%
R=nigeltao, r
CC=golang-dev
https://golang.org/cl/6262051
Ceil to 4.81 from 20.6 ns/op
Floor to 4.37 from 13.5 ns/op
Trunc to 3.97 from 14.3 ns/op
Also changed three MOVSDs to MOVAPDs in log_amd64.s
R=rsc, golang-dev
CC=golang-dev
https://golang.org/cl/6262048
Currently walk() doesn't check for err == SkipDir when iterating
a directory list, but such promise is made in the docs for WalkFunc.
Fixes#3486.
R=rsc, r
CC=golang-dev
https://golang.org/cl/6257059
Now that gri has made go/parser 15% faster, I offer this
change to slow back down cmd/api ~proportionately, adding
FreeBSD to the go1-checked set of platforms.
Really we should have done this earlier. This will prevent us
from breaking FreeBSD compatibility accidentally in the
future.
R=golang-dev, r
CC=golang-dev
https://golang.org/cl/6279044
To avoid goroutines during init, the nextItem function was a
clever workaround. Now that init goroutines are permitted,
restore the original, simpler design.
R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/6282043
- only compute current line position if needed
(i.e., if a comment is present)
- added benchmark
benchmark old ns/op new ns/op delta
BenchmarkParse 10902990 9313330 -14.58%
benchmark old MB/s new MB/s speedup
BenchmarkParse 5.31 6.22 1.17x
R=golang-dev, r
CC=golang-dev
https://golang.org/cl/6270043
Saving the code in case we improve things enough that
it matters later, but at least right now it is not worth doing.
R=ken2
CC=golang-dev
https://golang.org/cl/6248071
The previous code was preparing arrays of entries that would be
filled if there was one entry every 128 bytes. Moving to a 4096
byte interval reduces the overhead per megabyte of address space
to 2kB from 64kB (on 64-bit systems).
The performance impact will be negative for very small MemProfileRate.
test/bench/garbage/tree2 -heapsize 800000000 (default memprofilerate)
Before: mprof 65993056 bytes (1664 bucketmem + 65991392 addrmem)
After: mprof 1989984 bytes (1680 bucketmem + 1988304 addrmem)
R=golang-dev, rsc
CC=golang-dev, remy
https://golang.org/cl/6257069
The previous heap profile format did not include buckets with
zero used bytes. Also add several missing MemStats fields in
debug mode.
R=golang-dev, rsc
CC=golang-dev, remy
https://golang.org/cl/6249068
Drop expecttaken function in favor of extra argument
to gbranch and bgen. Mark loop condition as likely to
be true, so that loops are generated inline.
The main benefit here is contiguous code when trying
to read the generated assembly. It has only minor effects
on the timing, and they mostly cancel the minor effects
that aligning function entry points had. One exception:
both changes made Fannkuch faster.
Compared to before CL 6244066 (before aligned functions)
benchmark old ns/op new ns/op delta
BenchmarkBinaryTree17 4222117400 4201958800 -0.48%
BenchmarkFannkuch11 3462631800 3215908600 -7.13%
BenchmarkGobDecode 20887622 20899164 +0.06%
BenchmarkGobEncode 9548772 9439083 -1.15%
BenchmarkGzip 151687 152060 +0.25%
BenchmarkGunzip 8742 8711 -0.35%
BenchmarkJSONEncode 62730560 62686700 -0.07%
BenchmarkJSONDecode 252569180 252368960 -0.08%
BenchmarkMandelbrot200 5267599 5252531 -0.29%
BenchmarkRevcomp25M 980813500 985248400 +0.45%
BenchmarkTemplate 361259100 357414680 -1.06%
Compared to tip (aligned functions):
benchmark old ns/op new ns/op delta
BenchmarkBinaryTree17 4140739800 4201958800 +1.48%
BenchmarkFannkuch11 3259914400 3215908600 -1.35%
BenchmarkGobDecode 20620222 20899164 +1.35%
BenchmarkGobEncode 9384886 9439083 +0.58%
BenchmarkGzip 150333 152060 +1.15%
BenchmarkGunzip 8741 8711 -0.34%
BenchmarkJSONEncode 65210990 62686700 -3.87%
BenchmarkJSONDecode 249394860 252368960 +1.19%
BenchmarkMandelbrot200 5273394 5252531 -0.40%
BenchmarkRevcomp25M 996013800 985248400 -1.08%
BenchmarkTemplate 360620840 357414680 -0.89%
R=ken2
CC=golang-dev
https://golang.org/cl/6245069
On 6l and 8l, this is a real instruction, guaranteed to
cause an 'undefined instruction' exception.
On 5l, we simulate it as BL to address 0.
The plan is to use it as a signal to the linker that this
point in the instruction stream cannot be reached
(hence the changes to nofollow). This will help the
compiler explain that panicindex and friends do not
return without having to put a list of these functions
in the linker.
R=ken2
CC=golang-dev
https://golang.org/cl/6255064
16 seems pretty standard on x86 for function entry.
I don't know if ARM would benefit, so I used just 4
(single instruction alignment).
This has a minor absolute effect on the current timings.
The main hope is that it will make them more consistent from
run to run.
benchmark old ns/op new ns/op delta
BenchmarkBinaryTree17 4222117400 4140739800 -1.93%
BenchmarkFannkuch11 3462631800 3259914400 -5.85%
BenchmarkGobDecode 20887622 20620222 -1.28%
BenchmarkGobEncode 9548772 9384886 -1.72%
BenchmarkGzip 151687 150333 -0.89%
BenchmarkGunzip 8742 8741 -0.01%
BenchmarkJSONEncode 62730560 65210990 +3.95%
BenchmarkJSONDecode 252569180 249394860 -1.26%
BenchmarkMandelbrot200 5267599 5273394 +0.11%
BenchmarkRevcomp25M 980813500 996013800 +1.55%
BenchmarkTemplate 361259100 360620840 -0.18%
R=ken2
CC=golang-dev
https://golang.org/cl/6244066
The code was inconsistent about when it used
brchain(x) and when it used x directly, with the result
that you could end up emitting code for brchain(x) but
leave the jump pointing at an unemitted x.
R=ken2
CC=golang-dev
https://golang.org/cl/6250077
This bug has been introduced in the following revision:
changeset: 11404:26dceba5c610
user: Ivan Krasin <krasin@golang.org>
date: Mon Jan 23 09:19:39 2012 -0500
summary: compress/flate: reduce memory pressure at cost of additional arithmetic operation.
This is the review page for that CL: https://golang.org/cl/5555070/
R=rsc, imkrasin
CC=golang-dev
https://golang.org/cl/6249067
The correct procid is needed for unparking LWPs on NetBSD - always
initialise procid in minit() so that cgo works correctly. The non-cgo
case already works correctly since procid is initialised via
lwp_create().
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/6257071
On NetBSD a cgo enabled binary has more than 32 sections - bump NSECTS
so that we can actually link them successfully.
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/6261052
filterPaeth takes []byte arguments instead of byte arguments,
which avoids some redudant computation of the previous pixel
in the inner loop.
Also eliminate a bounds check in decoding the up filter.
benchmark old ns/op new ns/op delta
BenchmarkDecodeGray 3139636 2812531 -10.42%
BenchmarkDecodeNRGBAGradient 12341520 10971680 -11.10%
BenchmarkDecodeNRGBAOpaque 10740780 9612455 -10.51%
BenchmarkDecodePaletted 1819535 1818913 -0.03%
BenchmarkDecodeRGB 8974695 8178070 -8.88%
R=rsc
CC=golang-dev
https://golang.org/cl/6243061
A block with finalizer might also be profiled. The special bit
is needed to unregister the block from the profile. It will be
unset only when the block is freed.
Fixes#3668.
R=golang-dev, rsc
CC=golang-dev, remy
https://golang.org/cl/6249066
The check for Stringer etc. can only fire if the test is not a builtin, so avoid
the expensive check if we know there's no chance.
Also put in a fast path for pad, which saves a more modest amount.
benchmark old ns/op new ns/op delta
BenchmarkSprintfEmpty 148 152 +2.70%
BenchmarkSprintfString 585 497 -15.04%
BenchmarkSprintfInt 441 396 -10.20%
BenchmarkSprintfIntInt 718 603 -16.02%
BenchmarkSprintfPrefixedInt 676 621 -8.14%
BenchmarkSprintfFloat 1003 953 -4.99%
BenchmarkManyArgs 2945 2312 -21.49%
BenchmarkScanInts 1704152 1734441 +1.78%
BenchmarkScanRecursiveInt 1837397 1828920 -0.46%
R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/6245068
This prevents clients from seeing RSTs and missing the response
body.
TCP stacks vary. The included test failed on Darwin before but
passed on Linux.
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/6256066
It was only being used for (*Stmt).Exec, not Query, and not for
the same two methods on *DB.
This unifies (*Stmt).Exec's old inline code into the old
subsetArgs function, renaming it in the process (changing the
old word "subset" to "driver", mostly converted earlier)
Fixes#3640
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/6258045
It's sad to introduce a new macro, but rnd shows up consistently
in profiles, and the function call overwhelms the two arithmetic
instructions it performs.
R=r
CC=golang-dev
https://golang.org/cl/6260051
Plan 9 versions for amd64 have 2 megabyte pages.
This also fixes the logic for 32-bit vs 64-bit Plan 9,
making 64-bit the default, and adds logic to generate
a symbols table.
R=golang-dev, rsc, rminnich, ality, 0intro
CC=golang-dev, john
https://golang.org/cl/6218046
The old code generated for a bounds check was
CMP
JLT ok
CALL panicindex
ok:
...
The new code is (once the linker finishes with it):
CMP
JGE panic
...
panic:
CALL panicindex
which moves the calls out of line, putting more useful
code in each cache line. This matters especially in tight
loops, such as in Fannkuch. The benefit is more modest
elsewhere, but real.
From test/bench/go1, amd64:
benchmark old ns/op new ns/op delta
BenchmarkBinaryTree17 6096092000 6088808000 -0.12%
BenchmarkFannkuch11 6151404000 4020463000 -34.64%
BenchmarkGobDecode 28990050 28894630 -0.33%
BenchmarkGobEncode 12406310 12136730 -2.17%
BenchmarkGzip 179923 179903 -0.01%
BenchmarkGunzip 11219 11130 -0.79%
BenchmarkJSONEncode 86429350 86515900 +0.10%
BenchmarkJSONDecode 334593800 315728400 -5.64%
BenchmarkRevcomp25M 1219763000 1180767000 -3.20%
BenchmarkTemplate 492947600 483646800 -1.89%
And 386:
benchmark old ns/op new ns/op delta
BenchmarkBinaryTree17 6354902000 6243000000 -1.76%
BenchmarkFannkuch11 8043769000 7326965000 -8.91%
BenchmarkGobDecode 19010800 18941230 -0.37%
BenchmarkGobEncode 14077500 13792460 -2.02%
BenchmarkGzip 194087 193619 -0.24%
BenchmarkGunzip 12495 12457 -0.30%
BenchmarkJSONEncode 125636400 125451400 -0.15%
BenchmarkJSONDecode 696648600 685032800 -1.67%
BenchmarkRevcomp25M 2058088000 2052545000 -0.27%
BenchmarkTemplate 602140000 589876800 -2.04%
To implement this, two new instruction forms:
JLT target // same as always
JLT $0, target // branch expected not taken
JLT $1, target // branch expected taken
The linker could also emit the prediction prefixes, but it
does not: expected taken branches are reversed so that the
expected case is not taken (as in example above), and
the default expectaton for such a jump is not taken
already.
R=golang-dev, gri, r, dave
CC=golang-dev
https://golang.org/cl/6248049
Implement the (3-per-family) Noah's Ark clause (i.e. don't put
more than three identical elements on the list of active formatting
elements.
Also, when running tests, sort attributes by name before dumping
them.
Pass 4 additional tests with Noah's Ark clause (including one
that needs attributes to be sorted).
Pass 5 additional, unrelated tests because of sorting attributes.
R=nigeltao, rsc
CC=golang-dev
https://golang.org/cl/6247056