The test fails now with -race, so it's disabled.
The intention is that the fix for issue 7264
will also modify this test the same way and enable it.
Reporduce with 'go test -race -issue7264 -cpu=4'.
Update #7264
LGTM=bradfitz
R=golang-codereviews, bradfitz
CC=golang-codereviews
https://golang.org/cl/86770043
Add test for multipart form requests with an invalid content-type to ensure
ErrNotMultipart is returned.
Change ParseMultipartForm to return ErrNotMultipart when it is returned by multipartReader.
Modify test for empty multipart request handling to use POST so that the body is checked.
Fixes#6334.
This is the first changeset working on multipart request handling. Further changesets
could add more tests and clean up the TODO.
LGTM=bradfitz
R=golang-codereviews, gobot, bradfitz, rsc
CC=golang-codereviews
https://golang.org/cl/44040043
I was implementing rules from RFC 2616. The rules are apparently useless,
ambiguous, and too strict for common software on the Internet. (e.g. curl)
Add more tests, including a test of a chunked request.
Fixes#7625
LGTM=dsymonds
R=golang-codereviews, dsymonds
CC=adg, golang-codereviews, rsc
https://golang.org/cl/84480045
Also: Simplify ReadSlice implementation and
ensure that it doesn't call fill() with a full
buffer (this caused a failure in net/textproto
TestLargeReadMIMEHeader because fill() wasn't able
to read more data).
Fixes#7745.
LGTM=bradfitz
R=r, bradfitz
CC=golang-codereviews
https://golang.org/cl/86590043
To create a valid JSON string, "%s" is not enough.
Fixes#7761.
LGTM=bradfitz
R=golang-codereviews, bradfitz
CC=golang-codereviews
https://golang.org/cl/86730043
It runs too long in -short mode.
Disable the one in init, because it doesn't respect -short.
Make the part that claims to test execution in a finalizer
actually execute the test in the finalizer.
LGTM=bradfitz
R=golang-codereviews, bradfitz
CC=aram.h, golang-codereviews, iant, khr
https://golang.org/cl/86550045
The Go HTTP server doesn't use Response.Write, but others do,
so make it correct. Add a bunch more tests.
This bug is almost a year old. :/
Fixes#5381
LGTM=adg
R=golang-codereviews, adg
CC=dsymonds, golang-codereviews, rsc
https://golang.org/cl/85740046
Go's had pretty decent HTTP Trailer support for a long time, but
the docs have been largely non-existent. Fix that.
In the process, re-learn the Trailer code, clean some stuff
up, add some error checks, remove some TODOs, fix a minor bug
or two, and add tests.
LGTM=adg
R=golang-codereviews, adg
CC=dsymonds, golang-codereviews, rsc
https://golang.org/cl/86660043
There is a race condition that causes spurious wakeup from Wait
in the following case:
G1: decrement wg.counter, observe the counter is now 0
(should unblock goroutines queued *at this moment*)
G2: increment wg.counter
G2: call Wait() to add itself to the wait queue
G1: acquire wg.m, unblock all waiting goroutines
In the last step G2 is spuriously woken up by G1.
Fixes#7734.
LGTM=rsc, dvyukov
R=dvyukov, 0xjnml, rsc
CC=golang-codereviews
https://golang.org/cl/85580043
In a typical HTTP request, the client writes the request, and
then the server replies. Go's HTTP client code (Transport) has
two goroutines per connection: one writing, and one reading. A
third goroutine (the one initiating the HTTP request)
coordinates with those two.
Because most HTTP requests are done when the server replies,
the Go code has always handled connection reuse purely in the
readLoop goroutine.
But if a client is writing a large request and the server
replies before it's consumed the entire request (e.g. it
replied with a 403 Forbidden and had no use for the body), it
was possible for Go to re-select that connection for a
subsequent request before we were done writing the first. That
wasn't actually a data race; the second HTTP request would
just get enqueued to write its request on the writeLoop. But
because the previous writeLoop didn't finish writing (and
might not ever), that connection is in a weird state. We
really just don't want to get into a state where we're
re-using a connection when the server spoke out of turn.
This CL changes the readLoop goroutine to verify that the
writeLoop finished before returning the connection.
In the process, it also fixes a potential goroutine leak where
a connection could close but the recycling logic could be
blocked forever waiting for the client to read to EOF or
error. Now it also selects on the persistConn's close channel,
and the closer of that is no longer the readLoop (which was
dead locking in some cases before). It's now closed at the
same place the underlying net.Conn is closed. This likely fixes
or helps Issue 7620.
Also addressed some small cosmetic things in the process.
Update #7620Fixes#7569
LGTM=adg
R=golang-codereviews, adg
CC=dsymonds, golang-codereviews, rsc
https://golang.org/cl/86290043
We originally decided to skip this test in short mode
to prevent the parallel runtime test to timeout on the
Plan 9 builder. This should no longer be required since
the issue was fixed in CL 86210043.
LGTM=dave, bradfitz
R=dvyukov, dave, bradfitz
CC=golang-codereviews, rsc
https://golang.org/cl/84790044
If you pass ns = 100,000 to this function, timediv will
return ms = 0. tsemacquire in /sys/src/9/port/sysproc.c
will return immediately when ms == 0 and the semaphore
cannot be acquired immediately - it doesn't sleep - so
notetsleep will spin, chewing cpu and repeatedly reading
the time, until the 100us have passed.
Thanks to the time reads it won't take too many iterations,
but whatever we are waiting for does not get a chance to
run. Eventually the notetsleep spin loop returns and we
end up in the stoptheworld spin loop - actually a sleep
loop but we're not doing a good job of sleeping.
After 100ms or so of this, the kernel says enough and
schedules a different thread. That thread manages to do
whatever we're waiting for, and the spinning in the other
thread stops. If tsemacquire had actually slept, this
would have happened much quicker.
Many thanks to Russ Cox for help debugging.
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/86210043
even when there are no *_test.go files present.
rsc suggested this change
Fixes#7108
LGTM=r, adg
R=golang-codereviews, r, adg
CC=golang-codereviews
https://golang.org/cl/84300043
The buffer length should be the size in bytes
instead of the number of structs.
Fixes#6588.
LGTM=mikioh.mikioh
R=golang-codereviews, mikioh.mikioh, adg
CC=golang-codereviews
https://golang.org/cl/84830043
Explain what its purpose is and give examples of good and bad use.
Fixes#7167.
LGTM=dvyukov, rsc
R=golang-codereviews, dvyukov, rsc
CC=golang-codereviews
https://golang.org/cl/85880044
On amd64p32 pointers are 32-bit-aligned and cannot be assumed to
have an offset multiple of widthreg. Instead check that they are
withptr-aligned.
Also change the threshold for region merging to 2*widthreg
instead of 2*widthptr because performance on amd64 and amd64p32
is expected to be the same.
Fixes#7712.
LGTM=khr
R=rsc, dave, khr, brad, bradfitz
CC=golang-codereviews
https://golang.org/cl/84690044
Cuts the number of calls from 6 to 2 in the non-debug case.
LGTM=iant
R=golang-codereviews, iant
CC=0intro, aram, golang-codereviews, khr
https://golang.org/cl/86040043
1) The code to catch an exception marked the template as escaped
when it was not yet, which caused subsequent executions of the
template to not escape properly.
2) ensurePipelineContains needs to handled Field as well as
Identifier nodes.
Fixes#7379.
LGTM=mikesamuel
R=mikesamuel
CC=golang-codereviews
https://golang.org/cl/85240043
Getenv() should not call malloc when called from
gotraceback(). Instead, we return a static buffer
in this case, with enough room to hold the longest
value.
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/85680043
On Plan 9 gotraceback calls getenv calls malloc, and we gotraceback
on every call to gentraceback, which happens during garbage collection.
Honestly I don't even know how this works on Plan 9.
I suspect it does not, and that we are getting by because
no one has tried to run with $GOTRACEBACK set at all.
This will speed up all the other systems by epsilon, since they
won't call getenv and atoi repeatedly.
LGTM=bradfitz
R=golang-codereviews, bradfitz, 0intro
CC=golang-codereviews
https://golang.org/cl/85430046
Add a $Context variable to the template so that the build.Context values
such as BuildTags can be accessed.
Fixes#6666.
LGTM=adg, rsc
R=golang-codereviews, gobot, adg, rsc
CC=golang-codereviews
https://golang.org/cl/72770043
Short circuit for calling values funcs by MakeFunc was placed
before variadic arg rearrangement code in reflect.call.
Fixes#7534.
LGTM=khr
R=golang-codereviews, bradfitz, khr, rsc
CC=golang-codereviews
https://golang.org/cl/75370043
Now that we have a constant-time P-256 implementation, it's worth
paying more attention elsewhere.
The inversion of k in (EC)DSA was using Euclid's algorithm which isn't
constant-time. This change switches to Fermat's algorithm, which is
much better. However, it's important to note that math/big itself isn't
constant time and is using a 4-bit window for exponentiation with
variable memory access patterns.
(Since math/big depends quite deeply on its values being in minimal (as
opposed to fixed-length) represetation, perhaps crypto/elliptic should
grow a constant-time implementation of exponentiation in the scalar
field.)
R=bradfitz
Fixes#7652.
LGTM=rsc
R=golang-codereviews, bradfitz, rsc
CC=golang-codereviews
https://golang.org/cl/82740043
Given
type Outer struct {
*Inner
...
}
the compiler generates the implementation of (*Outer).M dispatching to
the embedded Inner. The implementation is logically:
func (p *Outer) M() {
(p.Inner).M()
}
but since the only change here is the replacement of one pointer
receiver with another, the actual generated code overwrites the
original receiver with the p.Inner pointer and then jumps to the M
method expecting the *Inner receiver.
During reflect.Value.Call, we create an argument frame and the
associated data structures to describe it to the garbage collector,
populate the frame, call reflect.call to run a function call using
that frame, and then copy the results back out of the frame. The
reflect.call function does a memmove of the frame structure onto the
stack (to set up the inputs), runs the call, and the memmoves the
stack back to the frame structure (to preserve the outputs).
Originally reflect.call did not distinguish inputs from outputs: both
memmoves were for the full stack frame. However, in the case where the
called function was one of these wrappers, the rewritten receiver is
almost certainly a different type than the original receiver. This is
not a problem on the stack, where we use the program counter to
determine the type information and understand that during (*Outer).M
the receiver is an *Outer while during (*Inner).M the receiver in the
same memory word is now an *Inner. But in the statically typed
argument frame created by reflect, the receiver is always an *Outer.
Copying the modified receiver pointer off the stack into the frame
will store an *Inner there, and then if a garbage collection happens
to scan that argument frame before it is discarded, it will scan the
*Inner memory as if it were an *Outer. If the two have different
memory layouts, the collection will intepret the memory incorrectly.
Fix by only copying back the results.
Fixes#7725.
LGTM=khr
R=khr
CC=dave, golang-codereviews
https://golang.org/cl/85180043
It turns out there is a relatively common pattern that relies on
inverted channel semaphore:
gate := make(chan bool, N)
for ... {
// limit concurrency
gate <- true
go func() {
foo(...)
<-gate
}()
}
// join all goroutines
for i := 0; i < N; i++ {
gate <- true
}
So handle synchronization on inverted semaphores with cap>1.
Fixes#7718.
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/84880046
This code tests linkmode == LinkExternal but is only invoked
by the compiler/assembler, not the linker.
Update #7164
LGTM=rsc
R=rsc, dave
CC=golang-codereviews
https://golang.org/cl/85080043
Defers generated from cgo lie to us about their argument layout.
Mark those defers as not copyable.
CL 83820043 contains an additional test for this code and should be
checked in (and enabled) after this change is in.
Fixes bug 7695.
LGTM=rsc
R=golang-codereviews, rsc
CC=golang-codereviews
https://golang.org/cl/84740043
Iterate the right number of times in arrays and channels.
Handle channels with zero-sized objects in them.
Output longer type names if we have them.
Compute argument offset correctly.
LGTM=rsc
R=golang-codereviews, rsc
CC=golang-codereviews
https://golang.org/cl/82980043
Also makes ErrWriteToConnected more appropriate; it's used
not only UDPConn operations but UnixConn operations.
Update #4856
LGTM=iant
R=golang-codereviews, iant
CC=golang-codereviews
https://golang.org/cl/84800044
The prefix was not uniformly applied and is probably better
left off for using with OpError.
Update #4856
LGTM=iant
R=golang-codereviews, iant
CC=golang-codereviews
https://golang.org/cl/84660046
Takes advantage of CL 83740044, to optimize map[string] lookup
from []byte key.
Deletes code.
No conditional check for gccgo, since Ian plans to add this
to gccgo before GCC 4.10 (Go 1.3).
benchmark old ns/op new ns/op delta
BenchmarkReadMIMEHeader 6066 5086 -16.16%
benchmark old allocs new allocs delta
BenchmarkReadMIMEHeader 12 12 +0.00%
benchmark old bytes new bytes delta
BenchmarkReadMIMEHeader 1317 1317 +0.00%
Update #3512
LGTM=rsc
R=rsc, dave
CC=golang-codereviews, iant
https://golang.org/cl/84230043
Superficial inconsistencies that trigger warnings in
Plan 9. Small enough to be considered trivial and
seemingly benign outside of the Plan 9 environment.
LGTM=iant
R=golang-codereviews, 0intro, iant
CC=golang-codereviews
https://golang.org/cl/73460043
If an error happens on a connection, server goroutine can call b.Logf
after benchmark finishes.
So join both client and server goroutines.
Update #7718
LGTM=bradfitz
R=golang-codereviews, bradfitz
CC=golang-codereviews
https://golang.org/cl/84750047
I have no idea what this code is for, but it pretty
clearly needs to be uint64, not uint32.
LGTM=aram
R=0intro, aram
CC=golang-codereviews
https://golang.org/cl/84410043
Native Client forbids jumps/calls to arbitrary locations and
enforces a particular alignement, which makes the Duff's device
ineffective.
LGTM=khr
R=rsc, dave, khr
CC=golang-codereviews
https://golang.org/cl/84400043
Trying to make GODEBUG=gcdead=1 work with liveness
and in particular ambiguously live variables.
1. In the liveness computation, mark all ambiguously live
variables as live for the entire function, except the entry.
They are zeroed directly after entry, and we need them not
to be poisoned thereafter.
2. In the liveness computation, compute liveness (and deadness)
for all parameters, not just pointer-containing parameters.
Otherwise gcdead poisons untracked scalar parameters and results.
3. Fix liveness debugging print for -live=2 to use correct bitmaps.
(Was not updated for compaction during compaction CL.)
4. Correct varkill during map literal initialization.
Was killing the map itself instead of the inserted value temp.
5. Disable aggressive varkill cleanup for call arguments if
the call appears in a defer or go statement.
6. In the garbage collector, avoid bug scanning empty
strings. An empty string is two zeros. The multiword
code only looked at the first zero and then interpreted
the next two bits in the bitmap as an ordinary word bitmap.
For a string the bits are 11 00, so if a live string was zero
length with a 0 base pointer, the poisoning code treated
the length as an ordinary word with code 00, meaning it
needed poisoning, turning the string into a poison-length
string with base pointer 0. By the same logic I believe that
a live nil slice (bits 11 01 00) will have its cap poisoned.
Always scan full multiword struct.
7. In the runtime, treat both poison words (PoisonGC and
PoisonStack) as invalid pointers that warrant crashes.
Manual testing as follows:
- Create a script called gcdead on your PATH containing:
#!/bin/bash
GODEBUG=gcdead=1 GOGC=10 GOTRACEBACK=2 exec "$@"
- Now you can build a test and then run 'gcdead ./foo.test'.
- More importantly, you can run 'go test -short -exec gcdead std'
to run all the tests.
Fixes#7676.
While here, enable the precise scanning of slices, since that was
disabled due to bugs like these. That now works, both with and
without gcdead.
Fixes#7549.
LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/83410044
runtime.Version() requires a trailing "+" when
tree had local modifications at time of build.
Fixes#7701
LGTM=iant
R=golang-codereviews, iant
CC=golang-codereviews
https://golang.org/cl/84040045
The garbage collector poison pointers
(0x6969696969696969 and 0x6868686868686868)
are malformed addresses on amd64.
That is, they are not 48-bit addresses sign extended
to 64 bits. This causes a different kind of hardware fault
than the usual 'unmapped page' when accessing such
an address, and OS X 10.9.2 sends the resulting SIGSEGV
incorrectly, making it look like it was user-generated
rather than kernel-generated and does not include the
faulting address. This means that in GODEBUG=gcdead=1
mode, if there is a bug and something tries to dereference
a poisoned pointer, the runtime delivers the SIGSEGV to
os/signal and returns to the faulting code, which faults
again, causing the process to hang instead of crashing.
Fix by rewriting "user-generated" SIGSEGV on OS X to
look like a kernel-generated SIGSEGV with fault address
0xb01dfacedebac1e.
I chose that address because (1) when printed in hex
during a crash, it is obviously spelling out English text,
(2) there are no current Google hits for that pointer,
which will make its origin easy to find once this CL
is indexed, and (3) it is not an altogether inaccurate
description of the situation.
Add a test. Maybe other systems will break too.
LGTM=khr
R=golang-codereviews, khr
CC=golang-codereviews, iant, ken
https://golang.org/cl/83270049
Delaying the runtime.throw until here will print more information.
In particular it will print the signal and code values, which means
it will show the fault address.
The canpanic checks were added recently, in CL 75320043.
They were just not added in exactly the right place.
LGTM=iant
R=dvyukov, iant
CC=golang-codereviews
https://golang.org/cl/83980043
Brad has been asking for this for a while.
I have resisted because I wanted to find a more general way to
do this, one that would keep the performance of code introducing
variables the same as the performance of code that did not.
(See golang.org/issue/3512#c20).
I have not found the more general way, and recent changes to
remove ambiguously live temporaries have blown away the
property I was trying to preserve, so that's no longer a reason
not to make the change.
Fixes#3512.
LGTM=iant
R=iant
CC=bradfitz, golang-codereviews, khr, r
https://golang.org/cl/83740044
Cuts hello world by 70kB, because we don't write
those names into the symbol table.
Update #6853
LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/80370045
This is just testing the status quo, so that any future attempt
to change it will make the test break and redirect the person
making the change to look at issue 6027.
Fixes#6027.
LGTM=bradfitz
R=golang-codereviews, bradfitz
CC=golang-codereviews
https://golang.org/cl/83930046
Supports all the current GNU tar sparse formats, including the
old GNU format and the GNU PAX format versions 0.0, 0.1, and 1.0.
Fixes#3864.
LGTM=rsc
R=golang-codereviews, dave, gobot, dsymonds, rsc
CC=golang-codereviews
https://golang.org/cl/64740043
The software floating point runs with m->locks++
to avoid being preempted; recognize this case in panic
and undo it so that m->locks is maintained correctly
when panicking.
Fixes#7553.
LGTM=dvyukov
R=golang-codereviews, dvyukov
CC=golang-codereviews
https://golang.org/cl/84030043
The old limit of 5 was chosen because we didn't actually know how
many bytes of arguments there were; 5 was a halfway point between
printing some useful information and looking ridiculous.
Now we know how many bytes of arguments there are, and we stop
the printing when we reach that point, so the "looking ridiculous" case
doesn't happen anymore: we only print actual argument words.
The cutoff now serves only to truncate very long (but real) argument lists.
In multiple debugging sessions recently (completely unrelated bugs)
I have been frustrated by not seeing more of the long argument lists:
5 words is only 2.5 interface values or strings, and not even 2 slices.
Double the max amount we'll show.
LGTM=bradfitz
R=golang-codereviews, bradfitz
CC=golang-codereviews, iant, r
https://golang.org/cl/83850043
The data field is the generic array that acts as a standin
for the keys and values arrays for the generic runtime code.
We want to substitute the keys and values arrays for the data
array, not just add keys and values in addition to it.
LGTM=iant
R=golang-codereviews, iant
CC=golang-codereviews
https://golang.org/cl/81160044
Darwin 10.6 (gcc 4.2) and some older versions of gcc default to C90 mode, not C99 mode. Silence the warning.
LGTM=aram, iant
R=golang-codereviews, aram, iant
CC=golang-codereviews
https://golang.org/cl/83090050
There is no way to call them from outside the net package.
They are used to implement UCPConn.ReadMsgUDP and similar.
LGTM=mikioh.mikioh
R=golang-codereviews, mikioh.mikioh
CC=golang-codereviews
https://golang.org/cl/83730044
Reduce footprint of liveness bitmaps by about 5x.
1. Mark all liveness bitmap symbols as 4-byte aligned
(they were aligned to a larger size by default).
2. The bitmap data is a bitmap count n followed by n bitmaps.
Each bitmap begins with its own count m giving the number
of bits. All the m's are the same for the n bitmaps.
Emit this bitmap length once instead of n times.
3. Many bitmaps within a function have the same bit values,
but each call site was given a distinct bitmap. Merge duplicate
bitmaps so that no bitmap is written more than once.
4. Many functions end up with the same aggregate bitmap data.
We used to name the bitmap data funcname.gcargs and funcname.gclocals.
Instead, name it gclocals.<md5 of data> and mark it dupok so
that the linker coalesces duplicate sets. This cut the bitmap
data remaining after step 3 by 40%; I was not expecting it to
be quite so dramatic.
Applied to "go build -ldflags -w code.google.com/p/go.tools/cmd/godoc":
bitmaps pclntab binary on disk
before this CL 1326600 1985854 12738268
4-byte align 1154288 (0.87x) 1985854 (1.00x) 12566236 (0.99x)
one bitmap len 782528 (0.54x) 1985854 (1.00x) 12193500 (0.96x)
dedup bitmap 414748 (0.31x) 1948478 (0.98x) 11787996 (0.93x)
dedup bitmap set 245580 (0.19x) 1948478 (0.98x) 11620060 (0.91x)
While here, remove various dead blocks of code from plive.c.
Fixes#6929.
Fixes#7568.
LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/83630044
warning: src/cmd/8g/ggen.c:35 non-interruptable temporary
warning: src/cmd/gc/walk.c:656 set and not used: l
warning: src/cmd/gc/walk.c:658 set and not used: l
LGTM=minux.ma
R=golang-codereviews, minux.ma
CC=golang-codereviews
https://golang.org/cl/83660043
1. Use n->alloc, not n->left, to hold the allocated temp being
passed from orderstmt/orderexpr to walk.
2. Treat method values the same as closures.
3. Use killed temporary for composite literal passed to
non-escaping function argument.
4. Clean temporaries promptly in if and for statements.
5. Clean temporaries promptly in select statements.
As part of this, move all the temporary-generating logic
out of select.c into order.c, so that the temporaries can
be reclaimed.
With the new temporaries, can re-enable the 1-entry
select optimization. Fixes issue 7672.
While we're here, fix a 1-line bug in select processing
turned up by the new liveness test (but unrelated; select.c:72).
Fixes#7686.
6. Clean temporaries (but not particularly promptly) in switch
and range statements.
7. Clean temporary used during convT2E/convT2I.
8. Clean temporaries promptly during && and || expressions.
---
CL 81940043 reduced the number of ambiguously live temps
in the godoc binary from 860 to 711.
CL 83090046 reduced the number from 711 to 121.
This CL reduces the number from 121 to 23.
15 the 23 that remain are in fact ambiguously live.
The final 8 could be fixed but are not trivial and
not common enough to warrant work at this point
in the release cycle.
These numbers only count ambiguously live temps,
not ambiguously live user-declared variables.
There are 18 such variables in the godoc binary after this CL,
so a total of 41 ambiguously live temps or user-declared
variables.
The net effect is that zeroing anything on entry to a function
should now be a rare event, whereas earlier it was the
common case.
This is good enough for Go 1.3, and probably good
enough for future releases too.
Fixes#7345.
LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/83000048
DragonFlyBSD, FreeBSD 9 and beyond, NetBSD 6 and beyond, and
Solaris (illumos) support AF_UNIX+SOCK_SEQPACKET socket.
LGTM=dave
R=golang-codereviews, dave
CC=golang-codereviews
https://golang.org/cl/83390043
This CL tries to fill the gap between Linux and other Unix-like systems
in the same way UDPConn already did.
Fixes#7677.
LGTM=iant
R=golang-codereviews, iant
CC=golang-codereviews
https://golang.org/cl/83330045
1. In functions with heap-allocated result variables or with
defer statements, the return sequence requires more than
just a single RET instruction. There is an optimization that
arranges for all returns to jump to a single copy of the return
epilogue in this case. Unfortunately, that optimization is
fundamentally incompatible with PC-based liveness information:
it takes PCs at many different points in the function and makes
them all land at one PC, making the combined liveness information
at that target PC a mess. Disable this optimization, so that each
return site gets its own copy of the 'call deferreturn' and the
copying of result variables back from the heap.
This removes quite a few spurious 'ambiguously live' variables.
2. Let orderexpr allocate temporaries that are passed by address
to a function call and then die on return, so that we can arrange
an appropriate VARKILL.
2a. Do this for ... slices.
2b. Do this for closure structs.
2c. Do this for runtime.concatstring, which is the implementation
of large string additions. Change representation of OADDSTR to
an explicit list in typecheck to avoid reconstructing list in both
walk and order.
3. Let orderexpr allocate the temporary variable copies used for
range loops, so that they can be killed when the loop is over.
Similarly, let it allocate the temporary holding the map iterator.
CL 81940043 reduced the number of ambiguously live temps
in the godoc binary from 860 to 711.
This CL reduces the number to 121. Still more to do, but another
good checkpoint.
Update #7345
LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/83090046
There's enough jitter in the scheduler on overloaded machines
that 25ms is not enough.
LGTM=dave
R=golang-codereviews, gobot, rsc, dave
CC=golang-codereviews
https://golang.org/cl/83300044
REP MOVSQ and REP STOSQ have a really high startup overhead.
Use a Duff's device to do the repetition instead.
benchmark old ns/op new ns/op delta
BenchmarkClearFat32 7.20 1.60 -77.78%
BenchmarkCopyFat32 6.88 2.38 -65.41%
BenchmarkClearFat64 7.15 3.20 -55.24%
BenchmarkCopyFat64 6.88 3.44 -50.00%
BenchmarkClearFat128 9.53 5.34 -43.97%
BenchmarkCopyFat128 9.27 5.56 -40.02%
BenchmarkClearFat256 13.8 9.53 -30.94%
BenchmarkCopyFat256 13.5 10.3 -23.70%
BenchmarkClearFat512 22.3 18.0 -19.28%
BenchmarkCopyFat512 22.0 19.7 -10.45%
BenchmarkCopyFat1024 36.5 38.4 +5.21%
BenchmarkClearFat1024 35.1 35.0 -0.28%
TODO: use for stack frame zeroing
TODO: REP prefixes are still used for "reverse" copying when src/dst
regions overlap. Might be worth fixing.
LGTM=rsc
R=golang-codereviews, rsc
CC=golang-codereviews, r
https://golang.org/cl/81370046
The old code was using the PC of the instruction after the CALL.
Variables live during the call but not live when it returns would
not be seen as live during the stack copy, which might lead to
corruption. The correct PC to use is the one just before the
return address. After this CL the lookup matches what mgc0.c does.
The only time this matters is if you have back to back CALL instructions:
CALL f1 // x live here
CALL f2 // x no longer live
If a stack copy occurs during the execution of f1, the old code will
use the liveness bitmap intended for the execution of f2 and will not
treat x as live.
The only way this situation can arise and cause a problem in a stack copy
is if x lives on the stack has had its address taken but the compiler knows
enough about the context to know that x is no longer needed once f1
returns. The compiler has never known that much, so using the f2 context
cannot currently cause incorrect execution. For the same reason, it is not
possible to write a test for this today.
CL 83090046 will make the compiler precise enough in some cases
that this distinction will start mattering. The existing stack growth tests
in package runtime will fail if that CL is submitted without this one.
While we're here, print the frame PC in debug mode and update the
bitmap interpretation strings.
LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/83250043