Broke build; missing deps_test change. Will re-send the original with the appropriate fix.
««« original CL description
net/rpc: use html/template to render html
Found using the vet check in CL 106370045.
LGTM=r
R=golang-codereviews, r
CC=golang-codereviews
https://golang.org/cl/101670044
»»»
TBR=r
CC=golang-codereviews
https://golang.org/cl/110880044
The code generating the .debug_frame section emits pairs of "advance PC",
"set SP offset" pseudo-instructions. Before the fix, the PC advance comes
out before the SP setting, which means the emitted offset for a block is
actually the value at the end of the block, which is incorrect for the
block itself.
The easiest way to fix this problem is to emit the SP offset before the
PC advance.
One delicate point: the last instruction to come out is now an
"advance PC", which means that if there are padding intsructions after
the final RET, they will appear to have a non-zero offset. This is odd
but harmless because there is no legal way to have a PC in that range,
or to put it another way, if you get here the SP is certainly screwed up
so getting the wrong (virtual) frame pointer is the least of your worries.
LGTM=iant
R=rsc, iant, lvd
CC=golang-codereviews
https://golang.org/cl/112750043
Both stdout and stderr are sent to /dev/null in android
apps. Introducing fatalf allows android to implement its
own copy that sends fatal errors to __android_log_print.
LGTM=minux, dave
R=minux, dave
CC=golang-codereviews
https://golang.org/cl/108400045
Based on cl/69170045 by Elias Naur.
There are currently several schemes for acquiring a TLS
slot to save the g register. None of them appear to work
for android. The closest are linux and darwin.
Linux uses a linker TLS relocation. This is not supported
by the android linker.
Darwin uses a fixed offset, and calls pthread_key_create
until it gets the slot it wants. As the runtime loads
late in the android process lifecycle, after an
arbitrary number of other libraries, we cannot rely on
any particular slot being available.
So we call pthread_key_create, take the first slot we are
given, and put it in runtime.tlsg, which we turn into a
regular variable in cmd/ld.
Makes android/arm cgo binaries work.
LGTM=minux
R=elias.naur, minux, dave, josharian
CC=golang-codereviews
https://golang.org/cl/106380043
The code in GC that handles gp->gobuf.ctxt is wrong,
because it does not mark the ctxt object itself,
if just queues the ctxt object for scanning.
So the ctxt object can be collected as garbage.
However, Gobuf.ctxt is void*, so it's always marked and
scanned through G.
LGTM=khr
R=golang-codereviews, khr
CC=golang-codereviews, khr, rsc
https://golang.org/cl/105490044
runtime·usleep and runtime·osyield fall back to calling an
assembly wrapper for the libc functions in the absence of a m,
so they can be called in cgo callback context.
LGTM=rsc
R=minux.ma, rsc
CC=dave, golang-codereviews
https://golang.org/cl/102620044
A temporary 512 bytes buffer is allocated for every call to
readHeader. This buffer isn't returned to the caller and it could
be reused to lower the number of memory allocations.
This CL improves it by using a pool and zeroing out the buffer before
putting it back into the pool.
benchmark old ns/op new ns/op delta
BenchmarkListFiles100k 545249903 538832687 -1.18%
benchmark old allocs new allocs delta
BenchmarkListFiles100k 2105167 2005692 -4.73%
benchmark old bytes new bytes delta
BenchmarkListFiles100k 105903472 54831527 -48.22%
This improvement is very important if your code has to deal with a lot
of tarballs which contain a lot of files.
LGTM=dsymonds
R=golang-codereviews, dave, dsymonds, bradfitz
CC=golang-codereviews
https://golang.org/cl/108240044
A temporary 512 bytes buffer is allocated for every call to
writeHeader. This buffer could be reused the lower the number
of memory allocations.
benchmark old ns/op new ns/op delta
BenchmarkWriteFiles100k 634622051 583810847 -8.01%
benchmark old allocs new allocs delta
BenchmarkWriteFiles100k 2701920 2602621 -3.68%
benchmark old bytes new bytes delta
BenchmarkWriteFiles100k 115383884 64349922 -44.23%
This change is very important if your code has to write a lot of
tarballs with a lot of files.
LGTM=dsymonds
R=golang-codereviews, dave, dsymonds
CC=golang-codereviews
https://golang.org/cl/107440043
Thanks to Cedric Staub for noting that a short session key would lead
to an out-of-bounds access when conditionally copying the too short
buffer over the random session key.
LGTM=davidben, bradfitz
R=davidben, bradfitz
CC=golang-codereviews
https://golang.org/cl/102670044
The main changes fall into a few patterns:
1. Replace #define with enum.
2. Add /*c2go */ comment giving effect of #define.
This is necessary for function-like #defines and
non-enum-able #defined constants.
(Not all compilers handle negative or large enums.)
3. Add extra braces in struct initializer.
(c2go does not implement the full rules.)
This is enough to let c2go typecheck the source tree.
There may be more changes once it is doing
other semantic analyses.
LGTM=minux, iant
R=minux, dave, iant
CC=golang-codereviews
https://golang.org/cl/106860045
A TLS slot is reserved by _rt0_.*_plan9 as an automatic and
its address (which is static on Plan 9) is saved in the
global _privates symbol. The startup linkage now is exactly
like that from Plan 9 libc, and the way we access g is
exactly as if we'd have used privalloc(2).
Aside from making the code more standard, this change
drastically simplifies it, both for 386 and for amd64, and
makes the Plan 9 code in liblink common for both 386 and
amd64.
The amd64 runtime code was cleared of nxm assumptions, and
now runs on the standard Plan 9 kernel.
Note handling fixes will follow in a separate CL.
LGTM=rsc
R=golang-codereviews, rsc, bradfitz, dave
CC=0intro, ality, golang-codereviews, jas, minux.ma, mischief
https://golang.org/cl/101510049
We restored registers correctly in the usual case where the thread
is a Go-managed thread and called runtime·sighandler, but we
failed to do so when runtime·sigtramp was called on a cgo-created
thread. In that case, runtime·sigtramp called runtime·badsignal,
a Go function, and did not restore registers after it returned
LGTM=rsc, dave
R=rsc, dave
CC=golang-codereviews, minux.ma
https://golang.org/cl/105280050
On amd64, the real time is reduced from 176.76s to 140.26s.
On ARM, the real time is reduced from 921.61s to 726.30s.
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/101580043
Move decAlloc calls a bit higher in the call tree.
Cleans code marginally, improves speed marginally.
The benchmarks are noisy but the median time from
20 consective 1-second runs improves by about 2%.
LGTM=r
R=r
CC=golang-codereviews
https://golang.org/cl/105530043
3-index slices of the form s[:len(s):len(s)]
cannot be simplified to s[::len(s)].
LGTM=bradfitz
R=golang-codereviews, bradfitz
CC=golang-codereviews
https://golang.org/cl/108330043
There is no reason to generate different code for cap and len.
Fixes#8025.
Fixes#8026.
LGTM=rsc
R=rsc, iant, khr
CC=golang-codereviews
https://golang.org/cl/93570044
Breaks windows and race detector.
TBR=rsc
««« original CL description
runtime: stack allocator, separate from mallocgc
In order to move malloc to Go, we need to have a
separate stack allocator. If we run out of stack
during malloc, malloc will not be available
to allocate a new stack.
Stacks are the last remaining FlagNoGC objects in the
GC heap. Once they are out, we can get rid of the
distinction between the allocated/blockboundary bits.
(This will be in a separate change.)
Fixes#7468Fixes#7424
LGTM=rsc, dvyukov
R=golang-codereviews, dvyukov, khr, dave, rsc
CC=golang-codereviews
https://golang.org/cl/104200047
»»»
TBR=rsc
CC=golang-codereviews
https://golang.org/cl/101570044
In order to move malloc to Go, we need to have a
separate stack allocator. If we run out of stack
during malloc, malloc will not be available
to allocate a new stack.
Stacks are the last remaining FlagNoGC objects in the
GC heap. Once they are out, we can get rid of the
distinction between the allocated/blockboundary bits.
(This will be in a separate change.)
Fixes#7468Fixes#7424
LGTM=rsc, dvyukov
R=golang-codereviews, dvyukov, khr, dave, rsc
CC=golang-codereviews
https://golang.org/cl/104200047
The old code's structure needed to track indirections because of the
use of unsafe. That is no longer necessary, so we can remove all
that tracking. The code cleans up considerably but is a little slower.
We may be able to recover that performance drop. I believe the
code quality improvement is worthwhile regardless.
BenchmarkEndToEndPipe 5610 5780 +3.03%
BenchmarkEndToEndByteBuffer 3156 3222 +2.09%
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/103700043
If the actual types of two reflect values are
the same and the values are structs, they must
have the same number of fields.
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/108280043
This removes a major unsafe thorn in our side, a perennial obstacle
to clean garbage collection.
Not coincidentally: In cleaning this up, several bugs were found,
including code that reached inside by-value interfaces to create
pointers for pointer-receiver methods. Unsafe code is just as
advertised.
Performance of course suffers, but not too badly. The Pipe number
is more indicative, since it's doing I/O that simulates a network
connection. Plus these are end-to-end, so each end suffers
only half of this pain.
The edit is pretty much a line-by-line conversion, with a few
simplifications and a couple of new tests. There may be more
performance to gain.
BenchmarkEndToEndByteBuffer 2493 3033 +21.66%
BenchmarkEndToEndPipe 4953 5597 +13.00%
Fixes#5159.
LGTM=rsc
R=rsc
CC=golang-codereviews, khr
https://golang.org/cl/102680045
The only text that describes the accepted format is in the package doc,
which is far away from these functions. The other flag types don't need
this explicitness because they are more obvious.
LGTM=r
R=r
CC=golang-codereviews
https://golang.org/cl/101550043
Remove GC bitmap backward scanning.
This was already done once in https://golang.org/cl/5530074/
Still makes GC a bit faster.
On the garbage benchmark, before:
gc-pause-one=237345195
gc-pause-total=4746903
cputime=32427775
time=32458208
after:
gc-pause-one=235484019
gc-pause-total=4709680
cputime=31861965
time=31877772
Also prepares mgc0.c for future changes.
R=golang-codereviews, khr, khr
CC=golang-codereviews, rsc
https://golang.org/cl/105380043
newproc takes two extra pointers, not two extra registers.
On amd64p32 (nacl) they are different.
We diagnosed this before the 1.3 cut but the tree was frozen.
I believe this is causing the random problems on the builder.
Fixes#8199.
TBR=r
CC=golang-codereviews
https://golang.org/cl/102710043
Include these files in the build,
even though they don't get executed.
LGTM=r
R=golang-codereviews, r
CC=golang-codereviews
https://golang.org/cl/108180043
Output number of spinning threads,
this is useful to understanding whether the scheduler
is in a steady state or not.
R=golang-codereviews, khr
CC=golang-codereviews, rsc
https://golang.org/cl/103540045
Say when a goroutine is locked to OS thread in crash reports
and goroutine profiles.
It can be useful to understand what goroutines consume OS threads
(syscall and locked), e.g. if you forget to call UnlockOSThread
or leak locked goroutines.
R=golang-codereviews
CC=golang-codereviews, rsc
https://golang.org/cl/94170043
Pending acceptance of CL 101500044
and adjustment of test/fixedbugs/bug299.go.
LGTM=adonovan
R=golang-codereviews, adonovan
CC=golang-codereviews
https://golang.org/cl/110160043
The runtime has historically held two dedicated values g (current goroutine)
and m (current thread) in 'extern register' slots (TLS on x86, real registers
backed by TLS on ARM).
This CL removes the extern register m; code now uses g->m.
On ARM, this frees up the register that formerly held m (R9).
This is important for NaCl, because NaCl ARM code cannot use R9 at all.
The Go 1 macrobenchmarks (those with per-op times >= 10 µs) are unaffected:
BenchmarkBinaryTree17 5491374955 5471024381 -0.37%
BenchmarkFannkuch11 4357101311 4275174828 -1.88%
BenchmarkGobDecode 11029957 11364184 +3.03%
BenchmarkGobEncode 6852205 6784822 -0.98%
BenchmarkGzip 650795967 650152275 -0.10%
BenchmarkGunzip 140962363 141041670 +0.06%
BenchmarkHTTPClientServer 71581 73081 +2.10%
BenchmarkJSONEncode 31928079 31913356 -0.05%
BenchmarkJSONDecode 117470065 113689916 -3.22%
BenchmarkMandelbrot200 6008923 5998712 -0.17%
BenchmarkGoParse 6310917 6327487 +0.26%
BenchmarkRegexpMatchMedium_1K 114568 114763 +0.17%
BenchmarkRegexpMatchHard_1K 168977 169244 +0.16%
BenchmarkRevcomp 935294971 914060918 -2.27%
BenchmarkTemplate 145917123 148186096 +1.55%
Minux previous reported larger variations, but these were caused by
run-to-run noise, not repeatable slowdowns.
Actual code changes by Minux.
I only did the docs and the benchmarking.
LGTM=dvyukov, iant, minux
R=minux, josharian, iant, dave, bradfitz, dvyukov
CC=golang-codereviews
https://golang.org/cl/109050043
A single iteration of BenchmarkSaveRestore runs for 5 seconds
on my freebsd machine. 5 seconds looks like too long for a single
iteration.
This is the only benchmark that times out on freebsd-amd64-race builder.
R=golang-codereviews, dave
CC=golang-codereviews
https://golang.org/cl/107340044
Breaks the build
««« original CL description
cmd/go: build test files containing non-runnable examples
Even if we can't run them, we should at least check that they compile.
LGTM=bradfitz
R=golang-codereviews, bradfitz
CC=golang-codereviews
https://golang.org/cl/107320046
»»»
TBR=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/110140044
Runs for 4 seconds on my mac.
Also this is the only test that times out on freebsd in -race mode.
R=golang-codereviews, bradfitz
CC=golang-codereviews
https://golang.org/cl/110150045
Even if we can't run them, we should at least check that they compile.
LGTM=bradfitz
R=golang-codereviews, bradfitz
CC=golang-codereviews
https://golang.org/cl/107320046
This CL re-applies the tests added in CL 101330053 and subsequently rolled back in CL 102610043.
The original author of this change was Rui Ueyama <ruiu@google.com>
LGTM=r, ruiu
R=ruiu, r
CC=golang-codereviews
https://golang.org/cl/109170043
The number of estimated iterations required to reach the benchtime is multiplied by a safety margin (to avoid falling just short) and then rounded up to a readable number. With an accurate estimate, in the worse case, the resulting number of iterations could be 3.75x more than necessary: 1.5x for safety * 2.5x to round up (e.g. from 2eX+1 to 5eX).
This CL reduces the safety margin to 1.2x. Experimentation showed a diminishing margin of return past 1.2x, although the average case continued to show improvements down to 1.05x.
This CL also reduces the maximum round-up multiplier from 2.5x (from 2eX+1 to 5eX) to 2x, by allowing the number of iterations to be of the form 3eX.
Both changes improve benchmark wall clock times, and the effects are cumulative.
From 1.5x to 1.2x safety margin:
package old s new s delta
bytes 163 125 -23%
encoding/json 27 21 -22%
net/http 42 36 -14%
runtime 463 418 -10%
strings 82 65 -21%
Allowing 3eX iterations:
package old s new s delta
bytes 163 134 -18%
encoding/json 27 23 -15%
net/http 42 36 -14%
runtime 463 422 -9%
strings 82 72 -12%
Combined:
package old s new s delta
bytes 163 112 -31%
encoding/json 27 20 -26%
net/http 42 30 -29%
runtime 463 346 -25%
strings 82 60 -27%
LGTM=crawshaw, r, rsc
R=golang-codereviews, crawshaw, r, rsc
CC=golang-codereviews
https://golang.org/cl/105990045
The previous call to parseRange already checks whether
all the ranges start before the end of file.
LGTM=robert.hencke, bradfitz
R=golang-codereviews, robert.hencke, gobot, bradfitz
CC=golang-codereviews
https://golang.org/cl/91880044
Update #1435
This proposal disables Setuid and Setgid on all linux platforms.
Issue 1435 has been open for a long time, and it is unlikely to be addressed soon so an argument was made by a commenter
https://code.google.com/p/go/issues/detail?id=1435#c45
That these functions should made to fail rather than succeed in their broken state.
LGTM=ruiu, iant
R=iant, ruiu
CC=golang-codereviews
https://golang.org/cl/106170043
MOV with SSE registers seems faster than REP MOVSQ if the
size being copied is less than about 2K. Previously we
didn't use MOV if the memory region is larger than 256
byte. This patch improves the performance of 257 ~ 2048
byte non-overlapping copy by using MOV.
Here is the benchmark result on Intel Xeon 3.5GHz (Nehalem).
benchmark old ns/op new ns/op delta
BenchmarkMemmove16 4 4 +0.42%
BenchmarkMemmove32 5 5 -0.20%
BenchmarkMemmove64 6 6 -0.81%
BenchmarkMemmove128 7 7 -0.82%
BenchmarkMemmove256 10 10 +1.92%
BenchmarkMemmove512 29 16 -44.90%
BenchmarkMemmove1024 37 25 -31.55%
BenchmarkMemmove2048 55 44 -19.46%
BenchmarkMemmove4096 92 91 -0.76%
benchmark old MB/s new MB/s speedup
BenchmarkMemmove16 3370.61 3356.88 1.00x
BenchmarkMemmove32 6368.68 6386.99 1.00x
BenchmarkMemmove64 10367.37 10462.62 1.01x
BenchmarkMemmove128 17551.16 17713.48 1.01x
BenchmarkMemmove256 24692.81 24142.99 0.98x
BenchmarkMemmove512 17428.70 31687.72 1.82x
BenchmarkMemmove1024 27401.82 40009.45 1.46x
BenchmarkMemmove2048 36884.86 45766.98 1.24x
BenchmarkMemmove4096 44295.91 44627.86 1.01x
LGTM=khr
R=golang-codereviews, gobot, khr
CC=golang-codereviews
https://golang.org/cl/90500043
sync.Pool is not supposed to be used everywhere, but is
a last resort.
««« original CL description
strings: use sync.Pool to cache buffer
benchmark old ns/op new ns/op delta
BenchmarkByteReplacerWriteString 3596 3094 -13.96%
benchmark old allocs new allocs delta
BenchmarkByteReplacerWriteString 1 0 -100.00%
LGTM=dvyukov
R=bradfitz, dave, dvyukov
CC=golang-codereviews
https://golang.org/cl/101330053
»»»
LGTM=dave
R=r, dave
CC=golang-codereviews
https://golang.org/cl/102610043
This requires minimal changes to the runtime hooks. In particular,
synchronization events must be done only on valid addresses now,
so I've added the additional checks to race.c.
LGTM=iant
R=iant
CC=golang-codereviews
https://golang.org/cl/101000046
benchmark old ns/op new ns/op delta
BenchmarkByteReplacerWriteString 7359 3661 -50.25%
LGTM=dave
R=golang-codereviews, dave
CC=golang-codereviews
https://golang.org/cl/102550043
Afterprologue check was required when did not know
about return arguments of functions and/or they were not zeroed.
Now 100% precision is required for stacks due to stack copying,
so it must work w/o afterprologue one way or another.
I can limit this change for 1.3 to merely adding a TODO,
but this check is super confusing so I don't want this knowledge to get lost.
LGTM=rsc
R=golang-codereviews, gobot, rsc, khr
CC=golang-codereviews, khr, rsc
https://golang.org/cl/96580045
Use WriteString instead of allocating a byte slice as a
buffer. This was a TODO.
benchmark old ns/op new ns/op delta
BenchmarkWriteString 40139 19991 -50.20%
LGTM=bradfitz
R=golang-codereviews, bradfitz
CC=golang-codereviews
https://golang.org/cl/107190044
requires a decoder to do its own byte buffering instead of using
bufio.Reader, due to byte stuffing.
benchmark old MB/s new MB/s speedup
BenchmarkDecodeBaseline 33.40 50.65 1.52x
BenchmarkDecodeProgressive 24.34 31.92 1.31x
On 6g, unsafe.Sizeof(huffman{}) falls from 4872 to 964 bytes, and
the decoder struct contains 8 of those.
LGTM=r
R=r, nightlyone
CC=bradfitz, couchmoney, golang-codereviews, raph
https://golang.org/cl/109050045
Storing temporary values to a slice is slower than storing
them to local variables of type byte.
benchmark old MB/s new MB/s speedup
BenchmarkEncodeToStringBase32 102.21 156.66 1.53x
BenchmarkEncodeToStringBase64 124.25 177.91 1.43x
LGTM=crawshaw
R=golang-codereviews, crawshaw, bradfitz, dave
CC=golang-codereviews
https://golang.org/cl/109820045
Just to be more thorough.
No need to push this to 1.3; it's just a test change that
worked without any changes to the code being tested.
LGTM=crawshaw
R=golang-codereviews, crawshaw
CC=golang-codereviews
https://golang.org/cl/109080045
genericReplacer.lookup is called for each byte of an input
string. In many (most?) cases, lookup will fail for the first
byte, and it will return immediately. Adding a fast path for
that case seems worth it.
Benchmark on my Xeon 3.5GHz Linux box:
benchmark old ns/op new ns/op delta
BenchmarkGenericNoMatch 2691 774 -71.24%
BenchmarkGenericMatch1 7920 8151 +2.92%
BenchmarkGenericMatch2 52336 39927 -23.71%
BenchmarkSingleMaxSkipping 1575 1575 +0.00%
BenchmarkSingleLongSuffixFail 1429 1429 +0.00%
BenchmarkSingleMatch 56228 55444 -1.39%
BenchmarkByteByteNoMatch 568 568 +0.00%
BenchmarkByteByteMatch 977 972 -0.51%
BenchmarkByteStringMatch 1669 1687 +1.08%
BenchmarkHTMLEscapeNew 422 422 +0.00%
BenchmarkHTMLEscapeOld 692 670 -3.18%
BenchmarkByteByteReplaces 8492 8474 -0.21%
BenchmarkByteByteMap 2817 2808 -0.32%
LGTM=rsc
R=golang-codereviews, bradfitz, dave, rsc
CC=golang-codereviews
https://golang.org/cl/79200044
Bug was introduced recently. Add more tests, fix the bugs.
Suppress + sign when not required in zero padding.
Do not zero pad infinities.
All old tests still pass.
This time for sure!
Fixes#8217.
LGTM=rsc
R=golang-codereviews, dan.kortschak, rsc
CC=golang-codereviews
https://golang.org/cl/103480043
The go:nosplit change wasn't the problem, reinstating.
««« original CL description
undo CL 93380044 / 7f0999348917
Partial undo, just of go:nosplit annotation. Somehow it
is breaking the windows builders.
TBR=bradfitz
««« original CL description
runtime: implement string ops in Go
Also implement go:nosplit annotation. Not really needed
for now, but we'll definitely need it for other conversions.
benchmark old ns/op new ns/op delta
BenchmarkRuneIterate 534 474 -11.24%
BenchmarkRuneIterate2 535 470 -12.15%
LGTM=bradfitz
R=golang-codereviews, dave, bradfitz, minux
CC=golang-codereviews
https://golang.org/cl/93380044
»»»
TBR=bradfitz
CC=golang-codereviews
https://golang.org/cl/105260044
»»»
TBR=bradfitz
R=bradfitz, golang-codereviews
CC=golang-codereviews
https://golang.org/cl/103490043
Partial undo, just of go:nosplit annotation. Somehow it
is breaking the windows builders.
TBR=bradfitz
««« original CL description
runtime: implement string ops in Go
Also implement go:nosplit annotation. Not really needed
for now, but we'll definitely need it for other conversions.
benchmark old ns/op new ns/op delta
BenchmarkRuneIterate 534 474 -11.24%
BenchmarkRuneIterate2 535 470 -12.15%
LGTM=bradfitz
R=golang-codereviews, dave, bradfitz, minux
CC=golang-codereviews
https://golang.org/cl/93380044
»»»
TBR=bradfitz
CC=golang-codereviews
https://golang.org/cl/105260044
Also implement go:nosplit annotation. Not really needed
for now, but we'll definitely need it for other conversions.
benchmark old ns/op new ns/op delta
BenchmarkRuneIterate 534 474 -11.24%
BenchmarkRuneIterate2 535 470 -12.15%
LGTM=bradfitz
R=golang-codereviews, dave, bradfitz, minux
CC=golang-codereviews
https://golang.org/cl/93380044
We don't need to shift array elements to shuffle them.
We just have to swap a selected element with 0th element.
LGTM=bradfitz
R=golang-codereviews, bradfitz
CC=golang-codereviews
https://golang.org/cl/91750044
Printf("%x", "abc") was "0x610x620x63"; is now "0x616263", which
is surely better.
Printf("% #x", "abc") is still "0x61 0x62 0x63".
Fixes#8080.
LGTM=bradfitz, gri
R=golang-codereviews, bradfitz, gri
CC=golang-codereviews
https://golang.org/cl/106990043
Also added a test to verify os.Getppid() works across all platforms
LGTM=alex.brainman
R=golang-codereviews, alex.brainman, shreveal, iant
CC=golang-codereviews
https://golang.org/cl/102320044
Reportedly in the Linux 3.16 kernel the VDSO will not have
section headers or a normal symbol table.
Too late for 1.3 but perhaps for 1.3.1, if there is one.
Fixes#8197.
LGTM=rsc
R=golang-codereviews, mattn.jp, rsc
CC=golang-codereviews
https://golang.org/cl/101260044
bufio.Scanner.Scan returns whether the scan succeeded, not whether it
is done, so the test was mistakenly breaking early.
LGTM=r
R=r
CC=golang-codereviews
https://golang.org/cl/93670045
makes windows-amd64-race benchmarks slower
««« original CL description
testing: make benchmarking faster
Allow the number of benchmark iterations to grow faster for fast benchmarks, and don't round up twice.
Using the default benchtime, this CL reduces wall clock time to run benchmarks:
net/http 49s -> 37s (-24%)
runtime 8m31s -> 5m55s (-30%)
bytes 2m37s -> 1m29s (-43%)
encoding/json 29s -> 21s (-27%)
strings 1m16s -> 53s (-30%)
LGTM=crawshaw
R=golang-codereviews, crawshaw
CC=golang-codereviews
https://golang.org/cl/101970047
»»»
TBR=josharian
CC=golang-codereviews
https://golang.org/cl/105950044
It appears that something about Go on Windows
cannot handle the fault cause by a jump to address 0.
The way Go represents and calls functions, this
never happened at all, until CL 105140044.
This CL changes the code added in CL 105140044
to make jump to 0 impossible once again.
Fixes#8047. (again, on Windows)
TBR=bradfitz
R=golang-codereviews, dave
CC=adg, golang-codereviews, iant, r
https://golang.org/cl/105120044
jmpdefer modifies PC, SP, and LR, and not atomically,
so walking past jmpdefer will often end up in a state
where the three are not a consistent execution snapshot.
This was causing warning messages a few frames later
when the traceback realized it was confused, but given
the right memory it could easily crash instead.
Update #8153
LGTM=minux, iant
R=golang-codereviews, minux, iant
CC=golang-codereviews, r
https://golang.org/cl/107970043
The test requires that timerproc runs, but busy loops and starves
the scheduler so that, with high probability, timerproc doesn't run.
Avoid the issue by expecting the test to succeed; if not, a major
outside timeout will kill it and let us know.
As you can see from the diffs, there have been several attempts to
fix this with chicanery, but none has worked. Don't bother trying
any more.
Fixes#8136.
LGTM=rsc
R=rsc, josharian
CC=golang-codereviews
https://golang.org/cl/105140043
Allow the number of benchmark iterations to grow faster for fast benchmarks, and don't round up twice.
Using the default benchtime, this CL reduces wall clock time to run benchmarks:
net/http 49s -> 37s (-24%)
runtime 8m31s -> 5m55s (-30%)
bytes 2m37s -> 1m29s (-43%)
encoding/json 29s -> 21s (-27%)
strings 1m16s -> 53s (-30%)
LGTM=crawshaw
R=golang-codereviews, crawshaw
CC=golang-codereviews
https://golang.org/cl/101970047
Call copy with as large buffer as possible to reduce the
number of function calls.
benchmark old ns/op new ns/op delta
BenchmarkBytesRepeat 540 162 -70.00%
BenchmarkStringsRepeat 563 177 -68.56%
LGTM=josharian
R=golang-codereviews, josharian, dave, dvyukov
CC=golang-codereviews
https://golang.org/cl/90550043
Previously, an input string was stripped of newline
characters at the beginning of DecodeString and then passed
to Decode. Decode again tried to strip newline characters.
That's waste of time.
benchmark old MB/s new MB/s speedup
BenchmarkDecodeString 38.37 65.20 1.70x
LGTM=dave, bradfitz
R=golang-codereviews, dave, bradfitz
CC=golang-codereviews
https://golang.org/cl/91770051
There is a hierarchy of location defined by loop depth:
-1 = the heap
0 = function results
1 = local variables (and parameters)
2 = local variable declared inside a loop
3 = local variable declared inside a loop inside a loop
etc
In general if an address from loopdepth n is assigned to
something in loop depth m < n, that indicates an extended
lifetime of some form that requires a heap allocation.
Function results can be local variables too, though, and so
they don't actually fit into the hierarchy very well.
Treat the address of a function result as level 1 so that
if it is written back into a result, the address is treated
as escaping.
Fixes#8185.
LGTM=iant
R=iant
CC=golang-codereviews
https://golang.org/cl/108870044
Provide Nextafter64 as alias to Nextafter.
For submission after the 1.3 release.
Fixes#8117.
LGTM=adonovan
R=adonovan
CC=golang-codereviews
https://golang.org/cl/101750048
The analysis for &x was using the loop depth on x set
during x's declaration. A type switch creates a list of
implicit declarations that were not getting initialized
with loop depths.
Fixes#8176.
LGTM=iant
R=iant
CC=golang-codereviews
https://golang.org/cl/108860043
The putpclcdelta function set the DWARF line number PC to
s->value + pcline->pc, which is correct, but the code then set
the local variable pc to epc, which can be a different value.
This caused the next delta in the DWARF table to be wrong.
Fixes#8098.
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/104950045
A runtime.Goexit during a panic-invoked deferred call
left the panic stack intact even though all the stack frames
are gone when the goroutine is torn down.
The next goroutine to reuse that struct will have a
bogus panic stack and can cause the traceback routines
to walk into garbage.
Most likely to happen during tests, because t.Fatal might
be called during a deferred func and uses runtime.Goexit.
This "not enough cleared in Goexit" failure mode has
happened to us multiple times now. Clear all the pointers
that don't make sense to keep, not just gp->panic.
Fixes#8158.
LGTM=iant, dvyukov
R=iant, dvyukov
CC=golang-codereviews
https://golang.org/cl/102220043
I am not sure what the rounding here was
trying to do, but it was skipping the first
pointer on native client.
The code above the rounding already checks
that xoffset is widthptr-aligned, so the rnd
was a no-op everywhere but on Native Client.
And on Native Client it was wrong.
Perhaps it was supposed to be rounding down,
not up, but zerorange handles the extra 32 bits
correctly, so the rnd does not seem to be necessary
at all.
This wouldn't be worth doing for Go 1.3 except
that it can affect code on the playground.
Fixes#8155.
LGTM=r, iant
R=golang-codereviews, r, iant
CC=dvyukov, golang-codereviews, khr
https://golang.org/cl/108740047
It's not clear how widespread this issue is, but we do have a
test case generated by a development version of clang.
I don't know whether this should go into 1.3 or not; happy to
hear arguments either way.
LGTM=rsc
R=golang-codereviews, bradfitz, rsc
CC=golang-codereviews
https://golang.org/cl/96680045
I introduced this bug when I changed the escape
analysis to run in phases based on call graph
dependency order, in order to be more precise about
inputs escaping back to outputs (functions returning
their arguments).
Given
func f(z **int) *int { return *z }
we were tagging the function as 'z does not escape
and is not returned', which is all true, but not
enough information.
If used as:
var x int
p := &x
q := &p
leak(f(q))
then the compiler might try to keep x, p, and q all
on the stack, since (according to the recorded
information) nothing interesting ends up being
passed to leak.
In fact since f returns *q = p, &x is passed to leak
and x needs to be heap allocated.
To trigger the bug, you need a chain that the
compiler wants to keep on the stack (like x, p, q
above), and you need a function that returns an
indirect of its argument, and you need to pass the
head of the chain to that function. This doesn't
come up very often: this bug has been present since
June 2012 (between Go 1 and Go 1.1) and we haven't
seen it until now. It helps that most functions that
return indirects are getters that are simple enough
to be inlined, avoiding the bug.
Earlier versions of Go also had the benefit that if
&x really wasn't used beyond x's lifetime, nothing
broke if you put &x in a heap-allocated structure
accidentally. With the new stack copying, though,
heap-allocated structures containing &x are not
updated when the stack is copied and x moves,
leading to crashes in Go 1.3 that were not crashes
in Go 1.2 or Go 1.1.
The fix is in two parts.
First, in the analysis of a function, recognize when
a value obtained via indirect of a parameter ends up
being returned. Mark those parameters as having
content escape back to the return results (but we
don't bother to write down which result).
Second, when using the analysis to analyze, say,
f(q), mark parameters with content escaping as
having any indirections escape to the heap. (We
don't bother trying to match the content to the
return value.)
The fix could be less precise (simpler).
In the first part we might mark all content-escaping
parameters as plain escaping, and then the second
part could be dropped. Or we might assume that when
calling f(q) all the things pointed at by q escape
always (for any f and q).
The fix could also be more precise (more complex).
We might record the specific mapping from parameter
to result along with the number of indirects from the
parameter to the thing being returned as the result,
and then at the call sites we could set up exactly the
right graph for the called function. That would make
notleaks(f(q)) be able to keep x on the stack, because
the reuslt of f(q) isn't passed to anything that leaks it.
The less precise the fix, the more stack allocations
become heap allocations.
This fix is exactly as precise as it needs to be so that
none of the current stack allocations in the standard
library turn into heap allocations.
Fixes#8120.
LGTM=iant
R=golang-codereviews, iant
CC=golang-codereviews, khr, r
https://golang.org/cl/102040046
The 'address taken' bit in a function variable was not
propagating into the inlined copies, causing incorrect
liveness information.
LGTM=dsymonds, bradfitz
R=golang-codereviews, bradfitz
CC=dsymonds, golang-codereviews, iant, khr, r
https://golang.org/cl/96670046
The 1-byte write was silently clearing a byte on the stack.
If there was another function call with more arguments
in the same stack frame, no harm done.
Otherwise, if the variable at that location was already zero,
no harm done.
Otherwise, problems.
Fixes#8139.
LGTM=dsymonds
R=golang-codereviews, dsymonds
CC=golang-codereviews, iant, r
https://golang.org/cl/100940043
This is a workaround - the code should be better than this - but the
fix avoids generating large numbers of linehist entries for the wrapper
functions that enable interface conversions. There can be many of
them, they all happen at the end of compilation, and they can all
share a linehist entry.
Avoids bad n^2 behavior in liblink.
Test case in issue 8135 goes from 64 seconds to 2.5 seconds (still bad
but not intolerable).
Fixes#8135.
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/104840043
If we see a typedef to an anonymous struct more than once,
presumably in two different Go files that import "C", use the
same Go type name.
Fixes#8133.
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/102080043