Intrinsify these functions to match other platforms. Update the
sequence of instructions used in the assembly implementations to
match the intrinsics.
Also, add a micro benchmark so we can more easily measure the
performance of these two functions:
name old time/op new time/op delta
And8-8 5.33ns ± 7% 2.55ns ± 8% -52.12% (p=0.000 n=20+20)
And8Parallel-8 7.39ns ± 5% 3.74ns ± 4% -49.34% (p=0.000 n=20+20)
Or8-8 4.84ns ±15% 2.64ns ±11% -45.50% (p=0.000 n=20+20)
Or8Parallel-8 7.27ns ± 3% 3.84ns ± 4% -47.10% (p=0.000 n=19+20)
By using a 'rotate then xor selected bits' instruction combined with
either a 'load and and' or a 'load and or' instruction we can
implement And8 and Or8 with far fewer instructions. Replacing
'compare and swap' with atomic instructions may also improve
performance when there is contention.
Change-Id: I28bb8032052b73ae8ccdf6e4c612d2877085fa01
Reviewed-on: https://go-review.googlesource.com/c/go/+/204277
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
On some platforms (currently ARM and ARM64), when calling into
VDSO we store the G to the gsignal stack, if there is one, so if
we receive a signal during VDSO we can find the G.
If we receive a signal during VDSO, and within the signal handler
we call nanotime again (e.g. when handling profiling signal),
we'll save/clear the G slot on the gsignal stack again, which
clobbers the original saved G. If we receive a second signal
during the same VDSO execution, we will fetch a nil G, which will
lead to bad things such as deadlock.
Don't save G if we're calling VDSO code from the gsignal stack.
Saving G is not necessary as we won't receive a nested signal.
Fixes#35473.
Change-Id: Ibfd8587a3c70c2f1533908b056e81b94d75d65a5
Reviewed-on: https://go-review.googlesource.com/c/go/+/206397
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
Renames variables sizeof_Array and other array_* variables
that were actually intended for slices and not arrays.
Change-Id: I391b95880cc77cabb8472efe694b7dd19545f31a
Reviewed-on: https://go-review.googlesource.com/c/go/+/180919
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This reverts CL 129118 (commit aff3aaa47f)
Reason for revert: It was retracted by the author in a comment on the PR
but that doesn't get synced to Gerrit, and the Gerrit CL wasn't closed
when the PR was closed.
Change-Id: I5ad16e96f98a927972187dc5c9df3a0e9b9fafa8
Reviewed-on: https://go-review.googlesource.com/c/go/+/206377
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
This is intended to allow IDEs to note where the optimizer
was not able to improve users' code. There may be other
applications for this, for example in studying effectiveness
of optimizer changes more quickly than running benchmarks,
or in verifying that code changes did not accidentally disable
optimizations in performance-critical code.
Logging of nilcheck (bad) for amd64 is implemented as
proof-of-concept. In general, the intent is that optimizations
that didn't happen are what will be logged, because that is
believed to be what IDE users want.
Added flag -json=version,dest
Check that version=0. (Future compilers will support a
few recent versions, I hope that version is always <=3.)
Dest is expected to be one of:
/path (or \path in Windows)
will create directory /path and fill it w/ json files
file://path
will create directory path, intended either for
I:\dont\know\enough\about\windows\paths
trustme_I_know_what_I_am_doing_probably_testing
Not passing an absolute path name usually leads to
json splattered all over source directories,
or failure when those directories are not writeable.
If you want a foot-gun, you have to ask for it.
The JSON output is directed to subdirectories of dest,
where each subdirectory is net/url.PathEscape of the
package name, and each for each foo.go in the package,
net/url.PathEscape(foo).json is created. The first line
of foo.json contains version and context information,
and subsequent lines contains LSP-conforming JSON
describing the missing optimizations.
Change-Id: Ib83176a53a8c177ee9081aefc5ae05604ccad8a0
Reviewed-on: https://go-review.googlesource.com/c/go/+/204338
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
This change makes go env -w and -u check invalid GOOS and GOARCH values and abort if that's the case.
Fixes#34194
Change-Id: Idca8e93bb0b190fd273bf786c925be7993c24a2b
GitHub-Last-Rev: ee67f09d75
GitHub-Pull-Request: golang/go#34221
Reviewed-on: https://go-review.googlesource.com/c/go/+/194617
Run-TryBot: Bryan C. Mills <bcmills@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
The variable now implies that the next tick always
returns the current time which is not always the case.
Change it to next to clarify that it returns
the time of the next tick which is more appropriate.
Fixes#30271
Change-Id: Ie7719cb8c7180bc6345b436f9b3e950ee349d6e4
Reviewed-on: https://go-review.googlesource.com/c/go/+/206123
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
This change makes the test addresses start at 1 GiB instead of 2 GiB to
support mips and mipsle, which only have 31-bit address spaces.
It also changes some tests to use smaller offsets for the chunk index to
avoid jumping too far ahead in the address space to support 31-bit
address spaces. The tests don't require such large jumps for what
they're testing anyway.
Updates #35112.
Fixes#35440.
Change-Id: Ic68ff2b0a1f10ef37ac00d4bb5b910ddcdc76f2e
Reviewed-on: https://go-review.googlesource.com/c/go/+/205938
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
os.OpenFile was assuming that a failed syscall.Open means the file does
not exist and it tries to create it. However, syscall.Open may have
failed for some other reason, such as failing to lock a os.ModeExclusive
file. We change os.OpenFile to only create the file if the error
indicates that the file doesn't exist.
Remove skip of TestTransform test, which was failing because sometimes
syscall.Open would fail due to the file being locked, but the
syscall.Create would succeed because the file is no longer locked. The
create was truncating the file.
Fixes#35471
Change-Id: I06583b5f8ac33dc90a51cc4fb64f2d8d9c0c2113
Reviewed-on: https://go-review.googlesource.com/c/go/+/206299
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Setting InsecureSkipVerify and VerifyPeerCertificate is the recommended
way to customize and override certificate validation.
However, there is boilerplate involved and it usually requires first
reimplementing the default validation strategy to then customize it.
Provide an example that does the same thing as the default as a starting
point.
Examples of where we directed users to do something similar are in
issues #35467, #31791, #28754, #21971, and #24151.
Fixes#31792
Change-Id: Id033e9fa3cac9dff1f7be05c72dfb34b4f973fd4
Reviewed-on: https://go-review.googlesource.com/c/go/+/193620
Reviewed-by: Adam Langley <agl@golang.org>
When we have already assigned the semaphore ticket to a specific
waiter, we want to get the waiter running as fast as possible since
no other G waiting on the semaphore can acquire it optimistically.
The net effect is that, when a sync.Mutex is contended, the code in
the critical section guarded by the Mutex gets a priority boost.
Fixes#33747
The original work was done in CL 200577 by Carlo Alberto Ferraris. The
change was reverted in CL 205817 because it broke the linux-arm64-packet
and solaris-amd64-oraclerel builders.
Change-Id: I76d79b1d63fd206ed1c57fe6900cb7ae9e4d46cb
Reviewed-on: https://go-review.googlesource.com/c/go/+/206180
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
On MIPS, Linux returns whether the syscall had an error in a separate
register (R7), not using a negative return value as on other
architectures. Thus, skip TestSyscallNoError as there is no error case
for syscall.RawSyscall which it could test against.
Also reformat the error output so the expected and gotten values are
aligned so they're easier to compare.
Fixes#35422
Change-Id: Ibc88f7c5382bb7ee8faf15ad4589ca1f9f017a06
Reviewed-on: https://go-review.googlesource.com/c/go/+/205898
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
This restores intrinsic status to functions copied from math/bits
into runtime/internal/sys, as an aid to runtime performance.
Updates #35112.
Change-Id: I41a7d87cf00f1e64d82aa95c5b1000bc128de820
Reviewed-on: https://go-review.googlesource.com/c/go/+/206200
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Now that the runtime can send preemption signals, it is possible that
a channel that asks for all signals can see both SIGURG and SIGHUP
before reading either, in which case one of the signals will be dropped.
We have to use a larger buffer so that the test see the signal it expects.
Fixes#35466
Change-Id: I36271eae0661c421780c72292a5bcbd443ada987
Reviewed-on: https://go-review.googlesource.com/c/go/+/206257
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
CL 201765 activated calls from the runtime to functions in math/bits.
When coverage and race detection were simultaneously enabled,
this caused a crash when the covered+race-checked code in
math/bits was called from the runtime before there was even a P.
PS Win for gdlv in helping sort this out.
TODO - next CL intrinsifies the new functions in
runtime/internal/sys
TODO/Would-be-nice - Ctz64 and TrailingZeros64 are the same
function; 386.s is intrinsified; clean all that up.
Fixes#35461.
Updates #35112.
Change-Id: I750a54dba493130ad3e68a06530ede7687d41e1d
Reviewed-on: https://go-review.googlesource.com/c/go/+/206199
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Previously langSupported applied -lang as though it's a global
restriction, but it's actually a per-package restriction. This CL
fixes langSupported to take a *types.Pkg parameter to reflect this and
updates its callers accordingly.
This is relevant for signed shifts (added in Go 1.12), because they
can be inlined into a Go 1.11 package; and for overlapping interfaces
(added in Go 1.13), because they can be exported as part of the
package's API.
Today we require all Go packages to be compiled with the same
toolchain, and all uses of langSupported are for controlling
backwards-compatible features. So we can simply assume that since the
imported packages type-checked successfully, they must have been
compiled with an appropriate -lang setting.
In the future if we ever want to use langSupported to control
backwards-incompatible language changes, we might need to record the
-lang flag used for compiling a package in its export data.
Fixes#35437.
Fixes#35442.
Change-Id: Ifdf6a62ee80cd5fb4366cbf12933152506d1b36e
Reviewed-on: https://go-review.googlesource.com/c/go/+/205977
Reviewed-by: Bryan C. Mills <bcmills@google.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
Unlike function calls, when processing instructions that directly
fault we must not subtract 1 from the pc before looking up the
file/line information.
Since the file/line lookup unconditionally subtracts 1, add 1 to
the faulting instruction PCs to compensate.
Fixes#34123
Change-Id: Ie7361e3d2f84a0d4f48d97e5a9e74f6291ba7a8b
Reviewed-on: https://go-review.googlesource.com/c/go/+/196962
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
Changing the Import function to return a PackageNotInModuleError if no
package was found in a local module. This replacing the vague message
"missing dot in first path element" you get today with much more friendly
one - "module was found, but does not contain package".
Fixes#35273
Change-Id: I6d726c17e6412258274b10f58f76621617d26e0a
Reviewed-on: https://go-review.googlesource.com/c/go/+/203118
Reviewed-by: Bryan C. Mills <bcmills@google.com>
Run-TryBot: Bryan C. Mills <bcmills@google.com>
This adds pipe/pipe2 on Solaris as they exist on other Unix systems.
They were not added previously because Solaris does not need them
for netpollBreak. They are added now in preparation for using pipes
in TestSignalM.
Updates #35276
Change-Id: I53dfdf077430153155f0a79715af98b0972a841c
Reviewed-on: https://go-review.googlesource.com/c/go/+/206077
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Support "gzip" aka "x-gzip" as a transfer-encoding for
requests and responses as per RFC 7230 Section 3.3.1.
"gzip" and "x-gzip" are equivalents as requested by
RFC 7230 Section 4.2.3.
Transfer-Encoding is an on-fly property of the body
that can be applied by proxies, other servers and basically
any intermediary to transport the content e.g. across data centers
or backends/machine to machine that need compression.
For this change, "gzip" is both explicitly and implicitly combined
with transfer-encoding "chunked" in an ordering such as:
Transfer-Encoding: gzip, chunked
and NOT
Transfer-Encoding: chunked, gzip
Obviously the latter form is counter-intuitive for streaming.
Thus "chunked" is the last value to appear in that transfer-encoding header,
if explicitly included.
When parsing the response, the chunked body is concatenated as "chunked" does,
before finally being decompressed as "gzip".
A chunked and compressed body would typically look like this:
<LENGTH_1>\r\n<CHUNK_1_GZIPPED_BODY>\r\n<LENGTH_2>\r\n<CHUNK_2_GZIPPED_BODY>\0\r\n
which when being processed we would contentate
<FULL_BODY> := <CHUNK_1_GZIPPED_BODY> + <CHUNK_2_GZIPPED_BODY> + ...
and then finally gunzip it
<FINAL_BODY> := gunzip(<FULL_BODY>)
If a "chunked" transfer-encoding is NOT applied but "gzip" is applied,
we implicitly assume that they requested using "chunked" at the end.
This is as per the recommendation of RFC 3.3.1. which explicitly says
that for:
* Request:
" If any transfer coding
other than chunked is applied to a request payload body, the sender
MUST apply chunked as the final transfer coding to ensure that the
message is properly framed."
* Response:
" If any transfer coding other than
chunked is applied to a response payload body, the sender MUST either
apply chunked as the final transfer coding or terminate the message
by closing the connection."
RELNOTE=yes
Fixes#29162
Change-Id: Icb8b8b838cf4119705605b29725cabb1fe258491
Reviewed-on: https://go-review.googlesource.com/c/go/+/166517
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
pregrow result array to avoid small allocation.
Change-Id: Ife5f815efa4c163ecdbb3a4c16bfb60a484dfa11
Reviewed-on: https://go-review.googlesource.com/c/go/+/174706
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Use names to better communicate when a test case fails.
Change-Id: Id882783cb5e444b705443fbcdf612713f8a3b032
Reviewed-on: https://go-review.googlesource.com/c/go/+/187823
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Without this CL, one of the TestDebugCall tests would fail 1% to 2% of
the time on the android-amd64-emu gomote. With this CL, I ran the
tests for 1000 iterations with no failures.
Fixes#32985
Change-Id: I541268a2a0c10d0cd7604f0b2dbd15c1d18e5730
Reviewed-on: https://go-review.googlesource.com/c/go/+/205248
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
This change adds a per-p free page cache which the page allocator may
allocate out of without a lock. The change also introduces a completely
lockless page allocator fast path.
Although the cache contains at most 64 pages (and usually less), the
vast majority (85%+) of page allocations are exactly 1 page in size.
Updates #35112.
Change-Id: I170bf0a9375873e7e3230845eb1df7e5cf741b78
Reviewed-on: https://go-review.googlesource.com/c/go/+/195701
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Austin Clements <austin@google.com>
This change adds a page cache structure which owns a chunk of free pages
at a given base address. It also adds code to allocate to this cache
from the page allocator. Finally, it adds tests for both.
Notably this change does not yet integrate the code into the runtime,
just into runtime tests.
Updates #35112.
Change-Id: Ibe121498d5c3be40390fab58a3816295601670df
Reviewed-on: https://go-review.googlesource.com/c/go/+/196643
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Make binary.Read return an error when passed `data` argument is not
a pointer to a fixed-size value or a slice of fixed-size values.
Fixes#32927
Change-Id: I04f48be55fe9b0cc66c983d152407d0e42cbcd95
Reviewed-on: https://go-review.googlesource.com/c/go/+/184957
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This change adds a per-p mspan object cache similar to the sudog cache.
Unfortunately this cache can't quite operate like the sudog cache, since
it is used in contexts where write barriers are disallowed (i.e.
allocation codepaths), so rather than managing an array and a slice,
it's just an array and a length. A little bit more unsafe, but avoids
any write barriers.
The purpose of this change is to reduce the number of operations which
require the heap lock in allocation, paving the way for a lockless fast
path.
Updates #35112.
Change-Id: I32cfdcd8528fb7be985640e4f3a13cb98ffb7865
Reviewed-on: https://go-review.googlesource.com/c/go/+/196642
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
This change combines the functionality of allocSpanLocked, allocManual,
and alloc_m into a new method called allocSpan. While these methods'
abstraction boundaries are OK when the heap lock is held throughout,
they start to break down when we want finer-grained locking in the page
allocator.
allocSpan does just that, and only locks the heap when it absolutely has
to. Piggy-backing off of work in previous CLs to make more of span
initialization lockless, this change makes span initialization entirely
lockless as part of the reorganization.
Ultimately this change will enable us to add a lockless fast path to
allocSpan.
Updates #35112.
Change-Id: I99875939d75fb4e958a67ac99e4a7cda44f06864
Reviewed-on: https://go-review.googlesource.com/c/go/+/196641
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Currently gcSweepBuf guarantees that push operations may be performed
concurrently with each other and that block operations may be performed
concurrently with push operations as well.
Unfortunately, this isn't quite true. The existing code allows push
operations to happen concurrently with each other, but block operations
may return blocks with nil entries. The way this can happen is if two
concurrent pushers grab a slot to push to, and the first one (the one
with the earlier slot in the buffer) doesn't quite write a span value
when the block is called. The existing code in block only checks if the
very last value in the block is nil, when really an arbitrary number of
the last few values in the block may or may not be nil.
Today, this case can't actually happen because when push operations
happen concurrently during a GC (which is the only time block is
called), they only ever happen during an allocation with the heap lock
held, effectively serializing them. A block operation may happen
concurrently with one of these pushes, but its callers will never see a
nil mspan. Outside of a GC, this isn't a problem because although push
operations from allocations can run concurrently with push operations
from sweeping, block operations will never run.
In essence, the real concurrency guarantees provided by gcSweepBuf are
that block operations may happen concurrently with push operations, but
that push operations may not be concurrent with each other if there are
any block operations.
To fix this, and to prepare for push operations happening without the
heap lock held in a future CL, we update the documentation for block to
correctly state that there may be nil entries in the returned slice.
While we're here, make the mspan writes into the buffer atomic to avoid
a block user racing on a nil check, and document that the user should
load mspan values from the returned slice atomically. Finally, we make
all callers of block adhere to the new rules.
We choose to allow nil values rather than filter them out because the
only caller of block is markrootSpans, and if it catches a nil entry,
then there wasn't anything to mark in there anyway since the span is
just being created.
Updates #35112.
Change-Id: I6450aab15f51690d7a000ba5b3d529cf2ca5da1e
Reviewed-on: https://go-review.googlesource.com/c/go/+/203318
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
This change makes it so that allocation and free related page sweeper
metadata operations (e.g. pageInUse and pagesInUse) are atomic rather
than protected by the heap lock. This will help in reducing the length
of the critical path with the heap lock held in future changes.
Updates #35112.
Change-Id: Ie82bff024204dd17c4c671af63350a7a41add354
Reviewed-on: https://go-review.googlesource.com/c/go/+/196640
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
When the frame size is large, we generate
MOVD.P 0xf0(SP), LR
ADD $(framesize-0xf0), SP
This is problematic: after the first instruction, we have a
partial frame of size (framesize-0xf0). If we try to unwind the
stack at this point, we'll try to read the LR from the stack at
0(SP) (the new SP) as the frame size is not 0. But this slot does
not contain a valid LR.
Fix this by not changing SP in two instructions. Instead,
generate
MOVD (SP), LR
ADD $framesize, SP
This affects not only async preemption but also profiling. So we
change the generated instructions, instead of marking unsafe
point.
Change-Id: I4e78c62d50ffc4acff70ccfbfec16a5ccae17f24
Reviewed-on: https://go-review.googlesource.com/c/go/+/206057
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>