Put call stubs at the beginning (instead of the end). So the
trampoline pass knows the addresses of the stubs, and it can
insert trampolines when necessary.
Fixes#19425.
Change-Id: I1e06529ef837a6130df58917315610d45a6819ca
Reviewed-on: https://go-review.googlesource.com/38131
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
The line between ssa.Func and ssa.Config has blurred.
Concurrent compilation in the backend will require more precision.
This CL lays out an (aspirational) organization.
The implementation will come in follow-up CLs,
once the organization is settled.
ssa.Config holds basic compiler configuration,
mostly arch-specific information.
It is configured once, early on, and is readonly,
so it is safe for concurrent use.
ssa.Func is a single-shot object used for
compiling a single Func. It is not concurrency-safe
and not re-usable.
ssa.Cache is a multi-use object used to avoid
expensive allocations during compilation.
Each ssa.Func is given an ssa.Cache to use.
ssa.Cache is not concurrency-safe.
Change-Id: Id02809b6f3541541cac6c27bbb598834888ce1cc
Reviewed-on: https://go-review.googlesource.com/38160
Reviewed-by: Keith Randall <khr@golang.org>
Previously we always issued a spill right after the op
that was being spilled. This CL pushes spills father away
from the generator, hopefully pushing them into unlikely branches.
For example:
x = ...
if unlikely {
call ...
}
... use x ...
Used to compile to
x = ...
spill x
if unlikely {
call ...
restore x
}
It now compiles to
x = ...
if unlikely {
spill x
call ...
restore x
}
This is particularly useful for code which appends, as the only
call is an unlikely call to growslice. It also helps for the
spills needed around write barrier calls.
The basic algorithm is walk down the dominator tree following a
path where the block still dominates all of the restores. We're
looking for a block that:
1) dominates all restores
2) has the value being spilled in a register
3) has a loop depth no deeper than the value being spilled
The walking-down code is iterative. I was forced to limit it to
searching 100 blocks so it doesn't become O(n^2). Maybe one day
we'll find a better way.
I had to delete most of David's code which pushed spills out of loops.
I suspect this CL subsumes most of the cases that his code handled.
Generally positive performance improvements, but hard to tell for sure
with all the noise. (compilebench times are unchanged.)
name old time/op new time/op delta
BinaryTree17-12 2.91s ±15% 2.80s ±12% ~ (p=0.063 n=10+10)
Fannkuch11-12 3.47s ± 0% 3.30s ± 4% -4.91% (p=0.000 n=9+10)
FmtFprintfEmpty-12 48.0ns ± 1% 47.4ns ± 1% -1.32% (p=0.002 n=9+9)
FmtFprintfString-12 85.6ns ±11% 79.4ns ± 3% -7.27% (p=0.005 n=10+10)
FmtFprintfInt-12 91.8ns ±10% 85.9ns ± 4% ~ (p=0.203 n=10+9)
FmtFprintfIntInt-12 135ns ±13% 127ns ± 1% -5.72% (p=0.025 n=10+9)
FmtFprintfPrefixedInt-12 167ns ± 1% 168ns ± 2% ~ (p=0.580 n=9+10)
FmtFprintfFloat-12 249ns ±11% 230ns ± 1% -7.32% (p=0.000 n=10+10)
FmtManyArgs-12 504ns ± 7% 506ns ± 1% ~ (p=0.198 n=9+9)
GobDecode-12 6.95ms ± 1% 7.04ms ± 1% +1.37% (p=0.001 n=10+10)
GobEncode-12 6.32ms ±13% 6.04ms ± 1% ~ (p=0.063 n=10+10)
Gzip-12 233ms ± 1% 235ms ± 0% +1.01% (p=0.000 n=10+9)
Gunzip-12 40.1ms ± 1% 39.6ms ± 0% -1.12% (p=0.000 n=10+8)
HTTPClientServer-12 227µs ± 9% 221µs ± 5% ~ (p=0.114 n=9+8)
JSONEncode-12 16.1ms ± 2% 15.8ms ± 1% -2.09% (p=0.002 n=9+8)
JSONDecode-12 61.8ms ±11% 57.9ms ± 1% -6.30% (p=0.000 n=10+9)
Mandelbrot200-12 4.30ms ± 3% 4.28ms ± 1% ~ (p=0.203 n=10+8)
GoParse-12 3.18ms ± 2% 3.18ms ± 2% ~ (p=0.579 n=10+10)
RegexpMatchEasy0_32-12 76.7ns ± 1% 77.5ns ± 1% +0.92% (p=0.002 n=9+8)
RegexpMatchEasy0_1K-12 239ns ± 3% 239ns ± 1% ~ (p=0.204 n=10+10)
RegexpMatchEasy1_32-12 71.4ns ± 1% 70.6ns ± 0% -1.15% (p=0.000 n=10+9)
RegexpMatchEasy1_1K-12 383ns ± 2% 390ns ±10% ~ (p=0.181 n=8+9)
RegexpMatchMedium_32-12 114ns ± 0% 113ns ± 1% -0.88% (p=0.000 n=9+8)
RegexpMatchMedium_1K-12 36.3µs ± 1% 36.8µs ± 1% +1.59% (p=0.000 n=10+8)
RegexpMatchHard_32-12 1.90µs ± 1% 1.90µs ± 1% ~ (p=0.341 n=10+10)
RegexpMatchHard_1K-12 59.4µs ±11% 57.8µs ± 1% ~ (p=0.968 n=10+9)
Revcomp-12 461ms ± 1% 462ms ± 1% ~ (p=1.000 n=9+9)
Template-12 67.5ms ± 1% 66.3ms ± 1% -1.77% (p=0.000 n=10+8)
TimeParse-12 314ns ± 3% 309ns ± 0% -1.56% (p=0.000 n=9+8)
TimeFormat-12 340ns ± 2% 331ns ± 1% -2.79% (p=0.000 n=10+10)
The go binary is 0.2% larger. Not really sure why the size
would change.
Change-Id: Ia5116e53a3aeb025ef350ffc51c14ae5cc17871c
Reviewed-on: https://go-review.googlesource.com/34822
Reviewed-by: David Chase <drchase@google.com>
Interface wrapper functions now get compiled eagerly in some cases.
Consequently, they may be present in multiple translation units.
Mark them as DUPOK, just like closures.
Fixes#19548Fixes#19550
Change-Id: Ibe74adb5a62dbf6447db37fde22dcbb3479969ef
Reviewed-on: https://go-review.googlesource.com/38156
Reviewed-by: David Chase <drchase@google.com>
In the SSA CFG, TEXT, RET, and JMP instructions correspond to Blocks,
not Values. Rework liveness analysis so that progeffects only cares
about Progs that result from Values, and handle Blocks separately.
Passes toolstash-check -all.
Change-Id: Ic23719c75b0421fdb51382a08dac18c3ba042b32
Reviewed-on: https://go-review.googlesource.com/38085
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
POSIX Shell only supports = to compare variables inside '[' tests. But
this is Bash, where == is an alias for =. In practice they're the same,
but the current form is inconsisnent and breaks POSIX for no good
reason.
Change-Id: I38fa7a5a90658dc51acc2acd143049e510424ed8
Reviewed-on: https://go-review.googlesource.com/38031
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The fmtmode and fmtpkgpfx globals stand in the
way of making the compiler more concurrent (#15756).
This CL removes them.
The natural way to eliminate a global is to explicitly
thread it as a parameter through all function calls.
However, most of the functions in gc/fmt.go
get called indirectly, by way of fmt format strings,
so there's nowhere natural to add a parameter.
Since there are only a few fmtmode modes,
use named types to distinguish between modes.
For example, fmtNodeErr, fmtNodeDbg, and fmtNodeTypeId
are all gc.Node, but they print in different modes.
Varying the type allows us to thread mode through fmt.
Handle fmtpkgpfx by converting it to a printing mode,
FTypeIdName, and using the same type-based approach.
To avoid a loss of readability and danger of bugs
from introducing conversions at all call sites,
instead add a helper that systematically modifies the args.
The only remaining gc/fmt.go global is dumpdepth.
Since that is used for debugging only,
it that can be handled with a global mutex,
or some similarly basic, if inefficient, protection.
Passes toolstash -cmp. No compiler performance impact.
For future reference, other options for threading state
that were considered and rejected:
* Wrapping values in structs, such as:
type fmtNode struct {
n *Node
mode fmtMode
}
This reduces the proliferation of types, and supports
easily adding extra local parameters.
However, putting such a struct into an interface{} allocates.
This is unacceptable in this particular area of code.
* Passing state via precision, such as:
fmt.Fprintf("%*v", mode, n)
where mode is the state encoded as an integer.
This avoids extra allocations, but it is out of keeping
with the intended semantics of precision, and is less readable.
* Modify the fmt package to support setting/getting context
via fmt.State. Unavailable due to Go 1 compatibility,
and probably the wrong solution anyway.
* Give up on package fmt. This would be a huge readability
regression and cause high code churn.
* Attempt a de-novo rewrite that circumvents these problems.
Too high a risk of bugs, with insufficient reward for the effort,
particularly since long term plans call for elimination
of gc.Node.
Change-Id: Iea2440d5a34a938e64273707de27e3a897cb41d1
Reviewed-on: https://go-review.googlesource.com/38147
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
With this change, code like
h := sha1.New()
h.Write(buf)
sum := h.Sum()
gets compiled into static calls rather than
interface calls, because the compiler is able
to prove that 'h' is really a *sha1.digest.
The InterCall re-write rule hits a few dozen times
during make.bash, and hundreds of times during all.bash.
The most common pattern identified by the compiler
is a constructor like
func New() Interface { return &impl{...} }
where the constructor gets inlined into the caller,
and the result is used immediately. Examples include
{sha1,md5,crc32,crc64,...}.New, base64.NewEncoder,
base64.NewDecoder, errors.New, net.Pipe, and so on.
Some existing benchmarks that change on darwin/amd64:
Crc64/ISO4KB-8 2.67µs ± 1% 2.66µs ± 0% -0.36% (p=0.015 n=10+10)
Crc64/ISO1KB-8 694ns ± 0% 690ns ± 1% -0.59% (p=0.001 n=10+10)
Adler32KB-8 473ns ± 1% 471ns ± 0% -0.39% (p=0.010 n=10+9)
On architectures like amd64, the reduction in code size
appears to contribute more to benchmark improvements than just
removing the indirect call, since that branch gets predicted
accurately when called in a loop.
Updates #19361
Change-Id: I57d4dc21ef40a05ec0fbd55a9bb0eb74cdc67a3d
Reviewed-on: https://go-review.googlesource.com/38139
Run-TryBot: Philip Hofer <phofer@umich.edu>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Changes to ${GOARCH}Ops.go files were mechanically produced using
github.com/mdempsky/ssa-symops, a one-off tool that inserts
"SymEffect: X" elements by pattern matching against the Op names.
Change-Id: Ibf3e481ffd588647f2a31662d72114b740ccbfcf
Reviewed-on: https://go-review.googlesource.com/38084
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
To replace the progeffects tables for liveness analysis.
Change-Id: Idc4b990665cb0a9aa300d62cdf8ad12e51c5b991
Reviewed-on: https://go-review.googlesource.com/38083
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Bad pragmas should never make it to the backend.
I've confirmed manually that the error position is unchanged.
Updates #15756
Updates #19250
Change-Id: If14f7ce868334f809e337edc270a49680b26f48e
Reviewed-on: https://go-review.googlesource.com/38152
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Retry the test several times with increasingly long timeouts.
Fixes#19538 (hopefully)
Change-Id: Ia3bf2b63b4298a6ee1e4082e14d9bfd5922c293a
Reviewed-on: https://go-review.googlesource.com/38154
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
There were a surprising number of places
in the tree that used yyerror for failed internal
consistency checks. Switch them to Fatalf.
Updates #15756
Updates #19250
Change-Id: Ie4278148185795a28ff3c27dacffc211cda5bbdd
Reviewed-on: https://go-review.googlesource.com/38153
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
This reverts commit 4e0c7c3f61.
Reason for revert: The presence-of-optimization test program is fragile, breaks under noopt, and might break if the Go libraries are tweaked. It needs to be (re)written without reference to other packages.
Change-Id: I3aaf1ab006a1a255f961a978e9c984341740e3c7
Reviewed-on: https://go-review.googlesource.com/38097
Reviewed-by: Keith Randall <khr@golang.org>
This abstracts creation of ACALL Progs into package gc. The main
benefit of this today is we can refactor away a lot of common
boilerplate code.
Later, once liveness analysis happens on the SSA graph, this will also
provide an easy insertion point for emitting the PCDATA Progs
immediately before call instructions.
Passes toolstash-check -all.
Change-Id: Ia15108ace97201cd84314f1ca916dfeb4f09d61c
Reviewed-on: https://go-review.googlesource.com/38081
Reviewed-by: Keith Randall <khr@golang.org>
Tinkering with the gob package shows that is currently possible to
*completely destroy* Int slices encoding without triggering a single
test failure.
The various encInt{8,16,32,64}Slice methods are only called during the
execution of the GobMapInterfaceEncode test, which only encodes a few
slices of length exactly 1 and then just checks that the error
returned by Encode is nil (without trying to Decode back the data).
This patch adds a few tests for signed integer slices encoding.
Change-Id: Ifaaee2f32132873118b241f79aa8203e4ad31416
Reviewed-on: https://go-review.googlesource.com/38066
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Move the zeroing of results earlier. In particular, they need to
come before any move-to-heap operations, as those require allocation.
Those allocations are points at which the GC can see the uninitialized
result slots.
For the function:
func f() (x, y, z *int) {
defer(){}()
escape(&y)
return
}
We used to generate code like this:
x = nil
y = nil
&y = new(int)
z = nil
Now we will generate:
x = nil
y = nil
z = nil
&y = new(int)
Since the fix for #18860, the return slots are always live if there
is a defer, so the former ordering allowed the GC to see junk
in the z slot.
Fixes#19078
Change-Id: I71554ae437549725bb79e13b2c100b2911d47ed4
Reviewed-on: https://go-review.googlesource.com/38133
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Since https://go-review.googlesource.com/24040 we no longer pad functions
in asm6, so funcAlign is unused. Delete it.
Change-Id: Id710e545a76b1797398f2171fe7e0928811fcb31
Reviewed-on: https://go-review.googlesource.com/38134
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The existing implementation started by eliminating
nil checks for OpAddr, OpAddPtr, and OpPhis with
all non-nil args.
However, some OpPhis had all non-nil args,
but their args had not been processed yet.
Pull the OpPhi checks into their own loop,
and repeat until stabilization.
Eliminates a dozen additional nilchecks during make.bash.
Negligible compiler performance impact.
Change-Id: If7b803c3ad7582af7d9867d05ca13e03e109d864
Reviewed-on: https://go-review.googlesource.com/37999
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
With this change, code like
h := sha1.New()
h.Write(buf)
sum := h.Sum()
gets compiled into static calls rather than
interface calls, because the compiler is able
to prove that 'h' is really a *sha1.digest.
The InterCall re-write rule hits a few dozen times
during make.bash, and hundreds of times during all.bash.
The most common pattern identified by the compiler
is a constructor like
func New() Interface { return &impl{...} }
where the constructor gets inlined into the caller,
and the result is used immediately. Examples include
{sha1,md5,crc32,crc64,...}.New, base64.NewEncoder,
base64.NewDecoder, errors.New, net.Pipe, and so on.
Some existing benchmarks that change on darwin/amd64:
Crc64/ISO4KB-8 2.67µs ± 1% 2.66µs ± 0% -0.36% (p=0.015 n=10+10)
Crc64/ISO1KB-8 694ns ± 0% 690ns ± 1% -0.59% (p=0.001 n=10+10)
Adler32KB-8 473ns ± 1% 471ns ± 0% -0.39% (p=0.010 n=10+9)
On architectures like amd64, the reduction in code size
appears to contribute more to benchmark improvements than just
removing the indirect call, since that branch gets predicted
accurately when called in a loop.
Updates #19361
Change-Id: Ia9d30afdd5f6b4d38d38b14b88f308acae8ce7ed
Reviewed-on: https://go-review.googlesource.com/37751
Run-TryBot: Philip Hofer <phofer@umich.edu>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
The (original) section on "Operators and Delimiters" introduced
superfluous terminology ("delimiter", "special token") which
didn't matter and was used inconsistently.
Removed any mention of "delimiter" or "special token" and now
simply group the special character tokens into "operators"
(clearly defined via links), and "punctuation" (everything else).
Fixes#19450.
Change-Id: Ife31b24b95167ace096f93ed180b7eae41c66808
Reviewed-on: https://go-review.googlesource.com/38073
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Rob Pike <r@golang.org>
Fix last proxy in TestProxyFromEnvironment bleeds into other tests
Change ResetProxyEnv to use the newer os.Unsetenv, instead of hard
coding as ""
Change-Id: I67cf833dbcf4bec2e10ea73c354334160cf05f84
Reviewed-on: https://go-review.googlesource.com/38115
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
In Go 1.7 and earlier, gc.exportsize tracked the number of bytes
written through exportf. With the removal of the old exporter in Go 1.8
exportf is only used for printing the build id, and the header and
trailer of the binary export format. The size of the export data is
now returned directly from the exporter and exportsize is never
referenced. Remove it.
Change-Id: Id301144b3c26c9004c722d0c55c45b0e0801a88c
Reviewed-on: https://go-review.googlesource.com/38116
Run-TryBot: Dave Cheney <dave@cheney.net>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
selectTag has been hard coded to only understand the tag `go1` since
CL 6112060 which landed in 2012. The commit message asserted;
Right now (before go1.0.1) there is only one possible tag,
"go1", and I'd like to keep it that way.
Remove goTag and the unused matching code in selectTag.
Change-Id: I85f7c10f95704e22f8e8681266afd72bbcbe8fbd
Reviewed-on: https://go-review.googlesource.com/38112
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
CL 32219 added precomputed sizeclass tables.
Remove the unused sizeToClass method which was previously only
called from initSizes.
Change-Id: I907bf9ed78430ecfaabbec7fca77ef2375010081
Reviewed-on: https://go-review.googlesource.com/38113
Run-TryBot: Dave Cheney <dave@cheney.net>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Some of the changes in CL golang.org/cl/38071/ assumed that / and %
could always be combined to use only one DIV instruction. However,
this is not the case for 64bit operands on a 32bit platform which use
seperate runtime functions to calculate division and modulo.
This CL restores the original optimizations that help on 32bit platforms
with negligible impact on 64bit platforms.
386:
name old time/op new time/op delta
FormatInt-2 6.06µs ± 0% 6.02µs ± 0% -0.70% (p=0.000 n=20+20)
AppendInt-2 4.98µs ± 0% 4.98µs ± 0% ~ (p=0.747 n=18+18)
FormatUint-2 1.93µs ± 0% 1.85µs ± 0% -4.19% (p=0.000 n=20+20)
AppendUint-2 1.71µs ± 0% 1.64µs ± 0% -3.68% (p=0.000 n=20+20)
amd64:
name old time/op new time/op delta
FormatInt-2 2.41µs ± 0% 2.41µs ± 0% -0.09% (p=0.010 n=18+18)
AppendInt-2 1.77µs ± 0% 1.77µs ± 0% +0.08% (p=0.000 n=18+18)
FormatUint-2 653ns ± 1% 653ns ± 0% ~ (p=0.178 n=20+20)
AppendUint-2 514ns ± 0% 513ns ± 0% -0.13% (p=0.000 n=20+17)
Change-Id: I574a18e54fb41b25fbe51ce696e7a8765abc79a6
Reviewed-on: https://go-review.googlesource.com/38051
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
This appears to be leftover from when instruction selection happened
in the linker. Many of the morestackX functions listed don't even
exist anymore.
Now that we select instructions within the compiler and assembler,
normal deadcode elimination mechanisms should suffice for these
symbols.
Change-Id: I2cb1e435101392e7c983957c4acfbbcc87a5ca7d
Reviewed-on: https://go-review.googlesource.com/38077
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Determine int, uint and uintptr bit sizes from GOARCH environment
variable if it is set. Otherwise use host-specific sizes.
Fixes#19321
Change-Id: I494b8e4b49b59d32794f50ff2ce06ba040cb8460
Reviewed-on: https://go-review.googlesource.com/37950
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
This avoids a problem that occurs on FreeBSD 11, in which the clang
3.8 assembler issues a pointless warning when invoked with -g on a
file that contains an empty .note.GNU-stack section.
No test because there is no reasonable way to write one, but should
fix the build on FreeBSD 11.
Fixes#14705.
Change-Id: I8c49bbf79a2c715c0e75495da19045fc92280e81
Reviewed-on: https://go-review.googlesource.com/38072
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
nat.setUint64 is nicely generic.
By assuming 32- or 64-bit words, however,
we can write simpler code,
and eliminate some shifts
in dead code that vet complains about.
Generated code for 64 bit systems is unaltered.
Generated code for 32 bit systems is much better.
For 386, the routine length drops from 325
bytes of code to 271 bytes of code, with fewer loops.
Change-Id: I1bc14c06272dee37a7fcb48d33dd1e621eba945d
Reviewed-on: https://go-review.googlesource.com/38070
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
The compiler recognizes that in a sequence q = x/y; r = x%y only
one division is required. Remove prior work-arounds and write
more readable straight-line code (this also results in fewer
instructions, though it doesn't appear to affect the benchmarks
significantly).
name old time/op new time/op delta
FormatInt-8 2.95µs ± 1% 2.92µs ± 5% ~ (p=0.952 n=5+5)
AppendInt-8 1.91µs ± 1% 1.89µs ± 2% ~ (p=0.421 n=5+5)
FormatUint-8 795ns ± 2% 782ns ± 4% ~ (p=0.444 n=5+5)
AppendUint-8 557ns ± 1% 557ns ± 2% ~ (p=0.548 n=5+5)
https://perf.golang.org/search?q=upload:20170310.1
Also:
- use uint instead of uintptr where we want to guarantee single-
register operations
- remove some unnecessary conversions (before indexing)
- add more comments and fix some comments
Change-Id: I04858dc2d798a6495879d9c7cfec2fdc2957b704
Reviewed-on: https://go-review.googlesource.com/38071
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
In FreeBSD when run Go proc under a given sub-list of
processors(e.g. 'cpuset -l 0 ./a.out' in multi-core system),
runtime.NumCPU() still return all physical CPUs from sysctl
hw.ncpu instead of account from sub-list.
Fix by use syscall cpuset_getaffinity to account the number of sub-list.
Fixes#15206
Change-Id: If87c4b620e870486efa100685db5debbf1210a5b
Reviewed-on: https://go-review.googlesource.com/29341
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
We are vendoring pprof from github.com/google/pprof, which comes with
a main package. If we don't explicitly skip that main package, then
`go install cmd` will install the compiled program in $GOROOT/bin.
Fixes#19441.
Change-Id: Ib268ffd16d4be65f7d80e4f8d9dc6e71523a94de
Reviewed-on: https://go-review.googlesource.com/38007
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Raul Silvera <rsilvera@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>