Using MASKEQZ instruction can save one instruction in calculation of
shift operations.
Reference: https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html
Change-Id: Ic5349c6f5ebd7af608c7d75a9b3a862305758275
Reviewed-on: https://go-review.googlesource.com/c/go/+/427396
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Wayne Zuo <wdvxdr@golangcn.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Dmitri Shuralyov <dmitshur@google.com>
The runtime-gdb.py script needs procid to be set in order to
map a goroutine ID with an OS thread. The Go runtime is not currently
setting that variable on Windows, so TestGdbPython (and friends) can't
succeed.
This CL initializes procid and unskips gdb tests on Windows.
Fixes#22687
Updates #21380
Updates #22021
Change-Id: Icd1d9fc1764669ed1bf04f53d17fadfd24ac3f30
Reviewed-on: https://go-review.googlesource.com/c/go/+/470596
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Run-TryBot: Quim Muntal <quimmuntal@gmail.com>
If -c is set while testing multiple packages, then allow
to build testing binary executables to the current directory
or to the directory that -o refers to.
$ go test -c -o /tmp ./pkg1 ./pkg2 ./pkg2
$ ls /tmp
pkg1.test pkg2.test pkg3.test
Fixes#15513.
Change-Id: I3aba01bebfa90e61e59276f2832d99c0d323b82e
Reviewed-on: https://go-review.googlesource.com/c/go/+/466397
Run-TryBot: Bryan Mills <bcmills@google.com>
Auto-Submit: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
This CL marks some solaris assembly functions as NOFRAME to avoid
relying on the implicit amd64 NOFRAME heuristic, where NOSPLIT functions
without stack were also marked as NOFRAME.
While here, I've reduced the stack usage of runtime·sigtramp by
16 bytes to compensate the additional 8 bytes from the stack-allocated
frame pointer. There were two unused 8-byte slots on the stack, one
at 24(SP) and the other at 80(SP).
Updates #58378
Change-Id: If9230e71a8b3c72681ffc82030ade6ceccf824db
Reviewed-on: https://go-review.googlesource.com/c/go/+/466456
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Quim Muntal <quimmuntal@gmail.com>
Discover while working on CL 471335.
Change-Id: I006077a5aa93cafb7be47813ab0c4714bb00d774
Reviewed-on: https://go-review.googlesource.com/c/go/+/471435
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
This CL marks some openbsd assembly functions as NOFRAME to avoid
relying on the implicit amd64 NOFRAME heuristic, where NOSPLIT functions
without stack were also marked as NOFRAME.
Updates #58378
Change-Id: I993549df41a93255fb714357443f8b24c3dfb0a4
Reviewed-on: https://go-review.googlesource.com/c/go/+/466455
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Quim Muntal <quimmuntal@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Joel Sing <joel@sing.id.au>
Reviewed-by: Cherry Mui <cherryyz@google.com>
One major reason to avoid binary.BigEndian is because
the binary package includes a transitive dependency on reflect.
See #54097.
Given that writer.go already depends on the binary package,
embrace use of it consistently where sensible.
We should either embrace use of binary or fully avoid it.
Change-Id: I5f2d27d0ed8cab5ac54be02362c7d33276dd4b9a
Reviewed-on: https://go-review.googlesource.com/c/go/+/452176
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Run-TryBot: Joseph Tsai <joetsai@digital-static.net>
Auto-Submit: Joseph Tsai <joetsai@digital-static.net>
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
The mutexes added by CL 235820 aren't sufficient to prevent a race when
an i/o deadline timer fires just as the deadline is being reset to zero.
Consider this possible sequence when goroutine S is clearing the
deadline and goroutine T has been started by the timer:
1. S locks the mutex
2. T blocks on the mutex
3. S sets the timedout flag to false
4. S calls Stop on the timer (and fails, because the timer has fired)
5. S unlocks the mutex
6. T locks the mutex
7. T sets the timedout flag to true
Now all subsequent I/O will timeout, although the deadline has been
cleared.
The fix is for the timeout goroutine to skip setting the timedout
flag if the timer pointer has been cleared, or reassigned by
another SetDeadline operation.
Fixes#57114
Change-Id: I4a45d19c3b4b66cdf151dcc3f70536deaa8216a7
Reviewed-on: https://go-review.googlesource.com/c/go/+/470215
Run-TryBot: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David du Colombier <0intro@gmail.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
This patch backs out CL 467715 (written to fix 58425), now that we
have a better fix for the "relocation doesn't fit" problem in the
trampoline generation phase (send in a previous CL).
Updates #58428.
Updates #58425.
Change-Id: Ib0d966fed00bd04db7ed85aa4e9132382b979a44
Reviewed-on: https://go-review.googlesource.com/c/go/+/471596
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
The folded name logic (despite all attempts to optimize it)
was fundamentally an O(n) operation where every field in a struct
needed to be linearly scanned in order to find a match.
This made unmashaling of unknown fields always O(n).
Instead of optimizing the comparison for each field,
make it such that we can look up a name in O(1).
We accomplish this by maintaining a map keyed by pre-folded names,
which we can pre-calculate when processing the struct type.
Using a stack-allocated buffer, we can fold the input name and
look up its presence in the map.
Also, instead of mapping from names to indexes,
map directly to a pointer to the field information.
The memory cost of this is the same and avoids an extra slice index.
The new logic is both simpler and faster.
Performance:
name old time/op new time/op delta
CodeDecoder 2.47ms ± 4% 2.42ms ± 2% -1.83% (p=0.022 n=10+9)
UnicodeDecoder 259ns ± 2% 248ns ± 1% -4.32% (p=0.000 n=10+10)
DecoderStream 150ns ± 1% 149ns ± 1% ~ (p=0.516 n=10+10)
CodeUnmarshal 3.13ms ± 2% 3.09ms ± 2% -1.37% (p=0.022 n=10+9)
CodeUnmarshalReuse 2.50ms ± 1% 2.45ms ± 1% -1.96% (p=0.001 n=8+9)
UnmarshalString 67.1ns ± 5% 64.5ns ± 5% -3.90% (p=0.005 n=10+10)
UnmarshalFloat64 60.1ns ± 4% 58.4ns ± 2% -2.89% (p=0.002 n=10+8)
UnmarshalInt64 51.0ns ± 4% 49.2ns ± 1% -3.53% (p=0.001 n=10+8)
Issue10335 80.7ns ± 2% 79.2ns ± 1% -1.82% (p=0.016 n=10+8)
Issue34127 28.6ns ± 3% 28.8ns ± 3% ~ (p=0.388 n=9+10)
Unmapped 177ns ± 2% 177ns ± 2% ~ (p=0.956 n=10+10)
Change-Id: I478b2b958f5a63a69c9a991a39cd5ffb43244a2a
Reviewed-on: https://go-review.googlesource.com/c/go/+/471196
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Run-TryBot: Joseph Tsai <joetsai@digital-static.net>
Auto-Submit: Joseph Tsai <joetsai@digital-static.net>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Johan Brandhorst-Satzkorn <johan.brandhorst@gmail.com>
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Use append for HTMLEscape similar to Indent and Compact.
Move it to indent.go alongside Compact, as it shares similar logic.
In a future CL, we will modify appendCompact to be written in terms
of appendHTMLEscape, but we need to first move the JSON decoder logic
out of the main loop of appendCompact.
Change-Id: I131c64cd53d5d2b4ca798b37349aeefe17b418c7
Reviewed-on: https://go-review.googlesource.com/c/go/+/471198
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Run-TryBot: Joseph Tsai <joetsai@digital-static.net>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Johan Brandhorst-Satzkorn <johan.brandhorst@gmail.com>
Auto-Submit: Joseph Tsai <joetsai@digital-static.net>
This patch provides a fix for a problem linking large arm32 binaries
with external linking, specifically R_CALLARM relocations against
runtime.duff* routines being flagged by the external linker as not
reaching.
What appears to be happening in the bug in question is that the Go
linker and the external linker are using slightly different recipes to
decide whether a given R_CALLARM relocation will "fit" (e.g. will not
require a trampoline). The Go linker is taking into account the addend
on the call reloc (which for calls to runtime.duffcopy or
runtime.duffzero is nonzero), whereas the external linker appears to
be ignoring the addend.
Example to illustrate:
Addr Size Func
----- ----- -----
...
XYZ 1024 runtime.duffcopy
...
ABC ... mypackge.MyFunc
+ R0: R_CALLARM o=8 a=848 tgt=runtime.duffcopy<0>
Let's say that the distance between ABC (start address of
runtime.duffcopy) and XYZ (start of MyFunc) is just over the
architected 24-bit maximum displacement for an R_CALLARM (let's say
that ABC-XYZ is just over the architected limit by some small value,
say 36). Because we're calling into runtime.duffcopy at offset 848,
however, the relocation does in fact fit, but if the external linker
isn't taking into account the addend (assuming that all calls target
the first instruction of the called routine), then we'll get a
"doesn't fit" error from the linker.
To work around this problem, revise the ARM trampoline generation code
in the Go linker that computes the trampoline threshold to ignore the
addend on R_CALLARM relocations, so as to harmonize the two linkers.
Updates #58428.
Updates #58425.
Change-Id: I56e580c05b7b47bbe8edf5532a1770bbd700fbe5
Reviewed-on: https://go-review.googlesource.com/c/go/+/469275
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Than McIntosh <thanm@google.com>
v.SetZero() is faster than v.Set(reflect.Zero(v.Type()))
and was recently added in Go 1.20.
Benchmark numbers are largely unchanged since this mainly
affects the unmarshaling of large numbers of JSON nulls,
which our benchmarks do not heavily exercise.
Change-Id: I464f60f63c9027e63a99fd5da92e7ab782018329
Reviewed-on: https://go-review.googlesource.com/c/go/+/471195
Reviewed-by: Johan Brandhorst-Satzkorn <johan.brandhorst@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
Run-TryBot: Joseph Tsai <joetsai@digital-static.net>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Joseph Tsai <joetsai@digital-static.net>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
With native support for fuzzing in the Go toolchain,
rely instead on the fuzz tests declared in fuzz_test.go.
Change-Id: I601842cd0bc7e64ea3bfdafbbbc3534df11acf59
Reviewed-on: https://go-review.googlesource.com/c/go/+/471197
Reviewed-by: Johan Brandhorst-Satzkorn <johan.brandhorst@gmail.com>
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Run-TryBot: Joseph Tsai <joetsai@digital-static.net>
Auto-Submit: Joseph Tsai <joetsai@digital-static.net>
TryBot-Result: Gopher Robot <gobot@golang.org>
Change-Id: Ie9c05210390dae43faf566907839bce953925735
Reviewed-on: https://go-review.googlesource.com/c/go/+/471258
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Joseph Tsai <joetsai@digital-static.net>
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
The external linker will need to support the new PC relative relocations
when they are generated by Go in a future patch. If it does not, many
unhelpful relocation errors will be generated by the external linker.
Use the -mcpu=power10 option as a surrogate for -mpcrel. It most cases,
it should indicate whether the underlying linker has support for
resolving PC relative relocations.
Change-Id: I84b151ce04512ccaeb17835aaf44105a5f6b515b
Reviewed-on: https://go-review.googlesource.com/c/go/+/469576
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Paul Murphy <murp@ibm.com>
Reviewed-by: Archana Ravindar <aravind5@in.ibm.com>
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
'RedirectEdges' may range over an out-edge slice under modification, leading to out-of-index
panic, and reuse an IREdge object by mistake if there are multiple inlining call-sites.
Fix by rewriting part of the redirecting operation.
Remove 'redirectEdges' as it's not used now and not working as expected in case of multiple
inlining call-sites.
Fixes#58437.
Change-Id: Ic344d4c262df548529acdc9380636cb50835ca51
Reviewed-on: https://go-review.googlesource.com/c/go/+/466915
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Pratt <mpratt@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
The function was added since go1.17, which is the minimum version for
bootstraping now.
Change-Id: I08b55c3639bb9ff042aabfcdcfbdf2993032ba6b
Reviewed-on: https://go-review.googlesource.com/c/go/+/471436
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
The Grow method is generally a more efficient way to grow a slice.
The older approach of using reflect.MakeSlice has to
waste effort zeroing the elements overwritten by the older slice
and has to allocate the slice header on the heap.
Performance:
name old time/op new time/op delta
CodeDecoder 2.41ms ± 2% 2.42ms ± 2% ~
CodeUnmarshal 3.12ms ± 3% 3.13ms ± 3% ~
CodeUnmarshalReuse 2.49ms ± 3% 2.52ms ± 3% ~
name old alloc/op new alloc/op delta
CodeDecoder 2.00MB ± 1% 1.99MB ± 1% ~
CodeUnmarshal 3.05MB ± 0% 2.92MB ± 0% -4.23%
CodeUnmarshalReuse 1.68MB ± 0% 1.68MB ± 0% -0.32%
name old allocs/op new allocs/op delta
CodeDecoder 77.1k ± 0% 77.0k ± 0% -0.09%
CodeUnmarshal 92.7k ± 0% 91.3k ± 0% -1.47%
CodeUnmarshalReuse 77.1k ± 0% 77.0k ± 0% -0.07%
The Code benchmarks (which are the only ones that uses slices)
are largely unaffected. There is a slight reduction in allocations.
A histogram of slice lengths from the Code testdata is as follows:
≤1: 392
≤2: 256
≤4: 252
≤8: 152
≤16: 126
≤32: 78
≤64: 62
≤128: 46
≤256: 18
≤512: 10
≤1024: 8
A bulk majority of slice lengths are 8 elements or under.
Use of reflect.Value.Grow performs better for larger slices since
it can avoid the zeroing of memory and has a faster growth rate.
However, Grow grows starting from 1 element,
with a 2x growth rate until some threshold (currently 512),
Starting from 1 ensures better utilization of the heap,
but at the cost of more frequent regrowth early on.
In comparison, the previous logic always started
with a minimum of 4 elements, which leads to a wasted capacity
of 75% for the highly frequent case of a single element slice.
The older code always had a growth rate of 1.5x,
and so wastes less memory for number of elements below 512.
All in all, there are too many factors that hurt or help performance.
Rergardless, the simplicity of favoring reflect.Value.Grow
over manually managing growth rates is a welcome simplification.
Change-Id: I62868a7f112ece3c2da3b4f6bdf74d397110243c
Reviewed-on: https://go-review.googlesource.com/c/go/+/471175
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Johan Brandhorst-Satzkorn <johan.brandhorst@gmail.com>
Auto-Submit: Joseph Tsai <joetsai@digital-static.net>
Run-TryBot: Joseph Tsai <joetsai@digital-static.net>
Fixes#58745
Change-Id: Id6666477b2c25f081d6f86047cea12bf8b3cb679
Reviewed-on: https://go-review.googlesource.com/c/go/+/471495
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Andy Pan <panjf2000@gmail.com>
CL 461275 uses testing.Short to skip this kind of tests. But it may lead
to false positive, because testing.Short may not always set. For
example, the normal workflow when testing changes in net package is
running:
go test -v net
in local machine, that will cause the test failed.
Using testenv.Builder is better, since when it's the standard way to
check whether the test is running on builder or local machine.
Change-Id: Ia5347eb76b4f0415dde8fa3d6c89bd0105f15aa7
Reviewed-on: https://go-review.googlesource.com/c/go/+/471437
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Fixes#58622
Change-Id: Ibb80296c39614478c75cb6bb04b6d0695cb990d3
Reviewed-on: https://go-review.googlesource.com/c/go/+/469795
Run-TryBot: Andy Pan <panjf2000@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Preparation for next CL.
Change-Id: I5ef170a04577d8aea10255e304357bdbea4935a7
Reviewed-on: https://go-review.googlesource.com/c/go/+/470919
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Robert Findley <rfindley@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
Run-TryBot: Robert Griesemer <gri@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
The original comment examples didn't pass the correct number
of function arguments. Rather than fixing that, use a simpler
example and adjust prose a bit.
Change-Id: I2806737a2b8f9c4b876911b214f3d9e28213fc27
Reviewed-on: https://go-review.googlesource.com/c/go/+/470918
Reviewed-by: Robert Griesemer <gri@google.com>
Run-TryBot: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Robert Findley <rfindley@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
We started emitting this segment in 2012 in CL 6326054 for #47.
It disabled three kinds of protection: mprotect, randexec, and emutramp.
The randexec protection was deprecated some time ago, replaced by PIE.
The emutramp and mprotect protection was because we used to rely on being
able to create writable executable memory to implement function closures,
but that is not true since https://go.dev/s/go11func was implemented.
Change-Id: I5e3a5279d76d642b0423d26195b891479a235763
Reviewed-on: https://go-review.googlesource.com/c/go/+/471199
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Unlike the rest of nistec, the P-256 assembly doesn't use complete
addition formulas, meaning that p256PointAdd[Affine]Asm won't return the
correct value if the two inputs are equal.
This was (undocumentedly) ignored in the scalar multiplication loops
because as long as the input point is not the identity and the scalar is
lower than the order of the group, the addition inputs can't be the same.
As part of the math/big rewrite, we went however from always reducing
the scalar to only checking its length, under the incorrect assumption
that the scalar multiplication loop didn't require reduction.
Added a reduction, and while at it added it in P256OrdInverse, too, to
enforce a universal reduction invariant on p256OrdElement values.
Note that if the input point is the infinity, the code currently still
relies on undefined behavior, but that's easily tested to behave
acceptably, and will be addressed in a future CL.
Fixes#58647
Fixes CVE-2023-24532
(Filed with the "safe APIs like complete addition formulas are good" dept.)
Change-Id: I7b2c75238440e6852be2710fad66ff1fdc4e2b24
Reviewed-on: https://go-review.googlesource.com/c/go/+/471255
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Roland Shoemaker <roland@golang.org>
Run-TryBot: Filippo Valsorda <filippo@golang.org>
Auto-Submit: Filippo Valsorda <filippo@golang.org>
Reviewed-by: Damien Neil <dneil@google.com>
As part of the effort to rely less on bytes.Buffer,
switch most operations to use more natural append-like operations.
This makes it easier to swap bytes.Buffer out with a buffer type
that only needs to support a minimal subset of operations.
As a simplification, we can remove the use of the scratch buffer
and use the available capacity of the buffer itself as the scratch.
Also, declare an inlineable mayAppendQuote function to conditionally
append a double-quote if necessary.
Performance:
name old time/op new time/op delta
CodeEncoder 405µs ± 2% 397µs ± 2% -1.94% (p=0.000 n=20+20)
CodeEncoderError 453µs ± 1% 444µs ± 4% -1.83% (p=0.000 n=19+19)
CodeMarshal 559µs ± 4% 548µs ± 2% -2.02% (p=0.001 n=19+17)
CodeMarshalError 724µs ± 3% 716µs ± 2% -1.13% (p=0.030 n=19+20)
EncodeMarshaler 24.9ns ±15% 22.9ns ± 5% ~ (p=0.086 n=20+17)
EncoderEncode 14.0ns ±27% 15.0ns ±20% ~ (p=0.365 n=20+20)
There is a slight performance gain across the board due to
the elimination of the scratch buffer. Appends are done directly
into the unused capacity of the underlying buffer,
avoiding an additional copy. See #53685
Updates #27735
Change-Id: Icf6d612a7f7a51ecd10097af092762dd1225d49e
Reviewed-on: https://go-review.googlesource.com/c/go/+/469558
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Auto-Submit: Joseph Tsai <joetsai@digital-static.net>
Reviewed-by: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Joseph Tsai <joetsai@digital-static.net>
This is part of the effort to reduce direct reliance on bytes.Buffer
so that we can use a buffer with better pooling characteristics.
Unify these two methods as a single version that uses generics
to reduce duplicated logic. Unfortunately, we lack a generic
version of utf8.DecodeRune (see #56948), so we cast []byte to string.
The []byte variant is slightly slower for multi-byte unicode since
casting results in a stack-allocated copy operation.
Fortunately, this code path is used only for TextMarshalers.
We can also delete TestStringBytes, which exists to ensure
that the two duplicate implementations remain in sync.
Performance:
name old time/op new time/op delta
CodeEncoder 399µs ± 2% 409µs ± 2% +2.59% (p=0.000 n=9+9)
CodeEncoderError 450µs ± 1% 451µs ± 2% ~ (p=0.684 n=10+10)
CodeMarshal 553µs ± 2% 562µs ± 3% ~ (p=0.075 n=10+10)
CodeMarshalError 733µs ± 3% 737µs ± 2% ~ (p=0.400 n=9+10)
EncodeMarshaler 24.9ns ±12% 24.1ns ±13% ~ (p=0.190 n=10+10)
EncoderEncode 12.3ns ± 3% 14.7ns ±20% ~ (p=0.315 n=8+10)
name old speed new speed delta
CodeEncoder 4.87GB/s ± 2% 4.74GB/s ± 2% -2.53% (p=0.000 n=9+9)
CodeEncoderError 4.31GB/s ± 1% 4.30GB/s ± 2% ~ (p=0.684 n=10+10)
CodeMarshal 3.51GB/s ± 2% 3.46GB/s ± 3% ~ (p=0.075 n=10+10)
CodeMarshalError 2.65GB/s ± 3% 2.63GB/s ± 2% ~ (p=0.400 n=9+10)
name old alloc/op new alloc/op delta
CodeEncoder 327B ±347% 447B ±232% +36.93% (p=0.034 n=9+10)
CodeEncoderError 142B ± 1% 143B ± 0% ~ (p=1.000 n=8+7)
CodeMarshal 1.96MB ± 2% 1.96MB ± 2% ~ (p=0.468 n=10+10)
CodeMarshalError 2.04MB ± 3% 2.03MB ± 1% ~ (p=0.971 n=10+10)
EncodeMarshaler 4.00B ± 0% 4.00B ± 0% ~ (all equal)
EncoderEncode 0.00B 0.00B ~ (all equal)
name old allocs/op new allocs/op delta
CodeEncoder 0.00 0.00 ~ (all equal)
CodeEncoderError 4.00 ± 0% 4.00 ± 0% ~ (all equal)
CodeMarshal 1.00 ± 0% 1.00 ± 0% ~ (all equal)
CodeMarshalError 6.00 ± 0% 6.00 ± 0% ~ (all equal)
EncodeMarshaler 1.00 ± 0% 1.00 ± 0% ~ (all equal)
EncoderEncode 0.00 0.00 ~ (all equal)
There is a very slight performance degradation for CodeEncoder
due to an increase in allocation sizes. However, the number of allocations
did not change. This is likely due to remote effects of the growth rate
differences between bytes.Buffer and the builtin append function.
We shouldn't overly rely on the growth rate of bytes.Buffer anyways
since that is subject to possibly change in #51462.
As the benchtime increases, the alloc/op goes down indicating
that the amortized memory cost is fixed.
Updates #27735
Change-Id: Ie35e480e292fe082d7986e0a4d81212c1d4202b3
Reviewed-on: https://go-review.googlesource.com/c/go/+/469556
Run-TryBot: Joseph Tsai <joetsai@digital-static.net>
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Auto-Submit: Joseph Tsai <joetsai@digital-static.net>
This is part of the effort to reduce direct reliance on bytes.Buffer
so that we can use a buffer with better pooling characteristics.
Avoid direct use of bytes.Buffer in Compact and Indent and
instead modify the logic to rely only on append.
This avoids reliance on the bytes.Buffer.Truncate method,
which makes switching to a custom buffer implementation easier.
Performance:
name old time/op new time/op delta
EncodeMarshaler 25.5ns ± 8% 25.7ns ± 9% ~ (p=0.724 n=10+10)
name old alloc/op new alloc/op delta
EncodeMarshaler 4.00B ± 0% 4.00B ± 0% ~ (all equal)
name old allocs/op new allocs/op delta
EncodeMarshaler 1.00 ± 0% 1.00 ± 0% ~ (all equal)
Updates #27735
Change-Id: I8cded03fab7651d43b5a238ee721f3472530868e
Reviewed-on: https://go-review.googlesource.com/c/go/+/469555
Run-TryBot: Joseph Tsai <joetsai@digital-static.net>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Joseph Tsai <joetsai@digital-static.net>
Reviewed-by: Bryan Mills <bcmills@google.com>
Factor out and simplify code that generates the addition of a 12 bit immediate
(the addition of a negative value is still handled via subtraction). This also
fixes the mishandling of the case where v is 0.
Change-Id: I6040f33d2fec87b772272531b3bf02390ae7f200
Reviewed-on: https://go-review.googlesource.com/c/go/+/461141
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Eric Fang <eric.fang@arm.com>
Reviewed-by: Than McIntosh <thanm@google.com>
Run-TryBot: Joel Sing <joel@sing.id.au>
Reviewed-by: Cherry Mui <cherryyz@google.com>
This CL marks some netbsd assembly functions as NOFRAME to avoid
relying on the implicit amd64 NOFRAME heuristic, where NOSPLIT functions
without stack were also marked as NOFRAME.
While here, and thanks to CL 466355, `asm_netbsd_amd64.s` can
be deleted in favor of `asm9_unix2_amd64.s`, which makes better
use of the frame pointer.
Updates #58378
Change-Id: Iff554b664ec25f2bb6ec198c0f684590b359c383
Reviewed-on: https://go-review.googlesource.com/c/go/+/466396
Reviewed-by: Than McIntosh <thanm@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Quim Muntal <quimmuntal@gmail.com>
Checking if v == nil is unnecessary, nil map always return the zero value of the value type.
Change-Id: I9c5499bc7db72c4c62e02013ba7f9a6ee4795c09
GitHub-Last-Rev: 03fc2330e2
GitHub-Pull-Request: golang/go#58662
Reviewed-on: https://go-review.googlesource.com/c/go/+/470736
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
This CL marks some freebsd assembly functions as NOFRAME to avoid
relying on the implicit amd64 NOFRAME heuristic, where NOSPLIT functions
without stack were also marked as NOFRAME.
Updates #58378
Change-Id: Ibd00748946f1137e165293df7da73278cb673bbd
Reviewed-on: https://go-review.googlesource.com/c/go/+/466395
Reviewed-by: Than McIntosh <thanm@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Quim Muntal <quimmuntal@gmail.com>
Have the write barrier call return a pointer to a buffer into which
the generated code records pointers that need write barrier treatment.
Change-Id: I7871764298e0aa1513de417010c8d46b296b199e
Reviewed-on: https://go-review.googlesource.com/c/go/+/447781
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Bypass: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
This signature uses the wrong type for the passed function, which
will be saved in the internal runtime map. Since the functions are
likely compatible (uint64 return versus int64), this may work but
should generally be fixed.
This is other instance of #58440.
Change-Id: Ied82e554745ef72eefeb5be540605809ffa06533
Reviewed-on: https://go-review.googlesource.com/c/go/+/470915
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Michael Pratt <mpratt@google.com>
This reduces unbounded shift latency by one cycle, and may
generate less instructions in some cases.
When there is a choice whether to use doubleword or word shifts, use
doubleword shifts. Doubleword shifts have fewer hardware scheduling
restrictions across P8/P9/P10.
Likewise, rework the shift sequence to allow the compare/shift/overshift
values to compute in parallel, then choose the correct value.
Some ANDCCconst rules also need reworked to ensure they simplify when
used for their flag value. This commonly occurs when prove fails to
identify a bounded shift (e.g foo32<<uint(x&31)).
Change-Id: Ifc6ff4a865d68675e57745056db414b0eb6f2d34
Reviewed-on: https://go-review.googlesource.com/c/go/+/442597
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Than McIntosh <thanm@google.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
insertInlCall mistakenly uses the absolute line number of the call
rather than the relative line number (adjusted by //line). Switch to the
correct line number.
The call filename was already correct.
Fixes#58648
Change-Id: Id8d1848895233e972d8cfe9c5789a88e62d06556
Reviewed-on: https://go-review.googlesource.com/c/go/+/470876
Reviewed-by: Than McIntosh <thanm@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
A later CL will add additional test cases for CallFile and CallLine with
a //line directive. The parameter/variable checks have nothing to do
with line numbers and will only serve to make the test more difficult to
follow, so split this single mega-test into two: one for testing
file/line and the other for testing parameters/variables.
There are a few additional minor changes:
1. A missing AttrName is now an error.
2. Check added for AttrCallLine, which was previously untested.
For #58648.
Change-Id: I35e75ead766bcf910c58eb20541769349841f2b5
Reviewed-on: https://go-review.googlesource.com/c/go/+/470875
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Than McIntosh <thanm@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
If you run make.bash on an arm system without GOARM set,
we sniff the local system to find the maximum default GOARM
that will actually work on that system. That's fine, and we can
keep doing that.
But the story for cross-compiling is weirder.
If we build a windows/amd64 toolchain and then use it to
cross-compile linux/arm binaries, we get GOARM=7 binaries.
Do the same on a linux/amd64 system and you get GOARM=5 binaries.
This clearly makes no sense, and worse it makes the builds
non-reproducible in a subtle way.
This CL simplifies the logic and improves reproducibility by
defaulting to GOARM=7 any time we wouldn't sniff the local system.
On go.dev/dl we serve a linux-armv6l distribution with a default GOARM=6.
That is built by setting GOARM=6 during make.bash, so it is unaffected
by this CL and will continue to be GOARM=6.
For #24904.
Change-Id: I4331709876d5948fd33ec6e4a7b18b3cef12f240
Reviewed-on: https://go-review.googlesource.com/c/go/+/470695
Reviewed-by: Bryan Mills <bcmills@google.com>
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>