A Directive (like <!ENTITY xxx []>) can't have other nodes nested inside
it (in our data structure representation), so there is no way to
preserve comments. The previous behavior was to just elide them, which
however might change the semantic meaning of the surrounding markup.
Instead, replace them with a space which hopefully has the same semantic
effect of the comment.
Directives are not actually a node type in the XML spec, which instead
specifies each of them separately (<!ENTITY, <!DOCTYPE, etc.), each with
its own grammar. The rules for where and when the comments are allowed
are not straightforward, and can't be implemented without implementing
custom logic for each of the directives.
Simply preserving the comments in the body of the directive would be
problematic, as there can be unmatched quotes inside the comment.
Whether those quotes are considered meaningful semantically or not,
other parsers might disagree and interpret the output differently.
This issue was reported by Juho Nurminen of Mattermost as it leads to
round-trip mismatches. See #43168. It's not being fixed in a security
release because round-trip stability is not a currently supported
security property of encoding/xml, and we don't believe these fixes
would be sufficient to reliably guarantee it in the future.
Fixes CVE-2020-29510
Updates #43168
Change-Id: Icd86c75beff3e1e0689543efebdad10ed5178ce3
Reviewed-on: https://go-review.googlesource.com/c/go/+/277893
Run-TryBot: Filippo Valsorda <filippo@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Filippo Valsorda <filippo@golang.org>
Reviewed-by: Katie Hockman <katie@golang.org>
Before this change, <:name> would parse as <name>, which could cause
issues in applications that rely on the parse-encode cycle to
round-trip. Similarly, <x name:=""> would parse as expected but then
have the attribute dropped when serializing because its name was empty.
Finally, <a🅱️c> would parse and get serialized incorrectly. All these
values are invalid XML, but to minimize the impact of this change, we
parse them whole into Name.Local.
This issue was reported by Juho Nurminen of Mattermost as it leads to
round-trip mismatches. See #43168. It's not being fixed in a security
release because round-trip stability is not a currently supported
security property of encoding/xml, and we don't believe these fixes
would be sufficient to reliably guarantee it in the future.
Fixes CVE-2020-29509
Fixes CVE-2020-29511
Updates #43168
Change-Id: I68321c4d867305046f664347192948a889af3c7f
Reviewed-on: https://go-review.googlesource.com/c/go/+/277892
Run-TryBot: Filippo Valsorda <filippo@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Filippo Valsorda <filippo@golang.org>
Reviewed-by: Katie Hockman <katie@golang.org>
Prior to this CL, pasting the example from the website causes a
compilation error for some programs because it was shadowing the
"json" package.
Change-Id: I39b68a66ca99468547f2027a7655cf1387b61e95
Reviewed-on: https://go-review.googlesource.com/c/go/+/301492
Reviewed-by: Joe Tsai <joetsai@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Trust: Joe Tsai <joetsai@google.com>
Run-TryBot: Joe Tsai <joetsai@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Previously re-using a decoder with a new stream resulted in a confusing
"extra data in buffer" error message.
Change-Id: Ia4c4c3a2d4b63c59e37e53faa61a500d5ff6e5f1
Reviewed-on: https://go-review.googlesource.com/c/go/+/297949
Reviewed-by: Rob Pike <r@golang.org>
Run-TryBot: Rob Pike <r@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Emmanuel Odeke <emmanuel@orijtech.com>
This change properly handles a TokenReader which
returns an EOF in the middle of an open XML
element.
Thanks to Sam Whited for reporting this.
Fixes CVE-2021-27918
Fixes#44913
Change-Id: Id02a3f3def4a1b415fa2d9a8e3b373eb6cb0f433
Reviewed-on: https://team-review.git.corp.google.com/c/golang/go-private/+/1004594
Reviewed-by: Russ Cox <rsc@google.com>
Reviewed-by: Roland Shoemaker <bracewell@google.com>
Reviewed-by: Filippo Valsorda <valsorda@google.com>
Reviewed-on: https://go-review.googlesource.com/c/go/+/300391
Trust: Katie Hockman <katie@golang.org>
Run-TryBot: Katie Hockman <katie@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Alexander Rakoczy <alex@golang.org>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
Limits the number of bytes that can be consumed by Uvarint
to MaxVarintLen64 (10) to avoid wasted computations.
With this change, if Uvarint reads more than MaxVarintLen64
bytes, it'll return the erroring byte count of n=-(MaxVarintLen64+1)
which is -11, as per the function signature.
Updated some tests to reflect the new change in expectations of n
when the number of bytes to be read exceeds the limits..
Fixes#41185
Change-Id: Ie346457b1ddb0214b60c72e81128e24d604d083d
Reviewed-on: https://go-review.googlesource.com/c/go/+/299531
Run-TryBot: Emmanuel Odeke <emmanuel@orijtech.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Trust: Emmanuel Odeke <emmanuel@orijtech.com>
Map serialization using reflect.Value.MapIndex cannot retrieve
map keys that contain a NaN, resulting in a panic.
Switch the implementation to use the reflect.Value.MapRange method
instead, which iterates over all map entries regardless of whether
they are directly retrievable.
Note that according to RFC 8259, section 4, a JSON object should
have unique names, but does not forbid the occurrence of duplicate names.
Fixes#43207
Change-Id: If4bc55229b1f64b8ca4b0fed37549725efdace39
Reviewed-on: https://go-review.googlesource.com/c/go/+/278632
Trust: Meng Zhuo <mzh@golangcn.org>
Trust: Joe Tsai <thebrokentoaster@gmail.com>
Trust: Bryan C. Mills <bcmills@google.com>
Run-TryBot: Meng Zhuo <mzh@golangcn.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
The IsExported method is a more intuitive helper for checking whether
the method or field is exported than checking whether PkgPath is empty.
In the same CL, modify the standard library to make use of this helper.
Fixes#41563
Change-Id: Iaacfb3b74449501f98e2707aa32095a32bd3c3c1
Reviewed-on: https://go-review.googlesource.com/c/go/+/266197
Trust: Joe Tsai <joetsai@digital-static.net>
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
The decodeState type is a large part of the allocated space during Unmarshal.
The errorContext field is infrequently used, and only on error.
Extract it into a pointer and allocate it separate when necessary.
name old time/op new time/op delta
UnmarshalString-8 115ns ± 5% 114ns ± 3% ~ (p=0.170 n=15+15)
UnmarshalFloat64-8 113ns ± 2% 106ns ± 1% -6.42% (p=0.000 n=15+14)
UnmarshalInt64-8 93.3ns ± 1% 86.9ns ± 4% -6.89% (p=0.000 n=14+15)
name old alloc/op new alloc/op delta
UnmarshalString-8 192B ± 0% 160B ± 0% -16.67% (p=0.000 n=15+15)
UnmarshalFloat64-8 180B ± 0% 148B ± 0% -17.78% (p=0.000 n=15+15)
UnmarshalInt64-8 176B ± 0% 144B ± 0% -18.18% (p=0.000 n=15+15)
name old allocs/op new allocs/op delta
UnmarshalString-8 2.00 ± 0% 2.00 ± 0% ~ (all equal)
UnmarshalFloat64-8 2.00 ± 0% 2.00 ± 0% ~ (all equal)
UnmarshalInt64-8 1.00 ± 0% 1.00 ± 0% ~ (all equal)
Change-Id: I53f3f468e6c65f77a12e5138a2626455b197012d
Reviewed-on: https://go-review.googlesource.com/c/go/+/271338
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Trust: Daniel Martí <mvdan@mvdan.cc>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Make all our package sources use Go 1.17 gofmt format
(adding //go:build lines).
Part of //go:build change (#41184).
See https://golang.org/design/draft-gobuild
Change-Id: Ia0534360e4957e58cd9a18429c39d0e32a6addb4
Reviewed-on: https://go-review.googlesource.com/c/go/+/294430
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
This reverts commit 6af088bfc6.
Reason for revert: Broke many tests inside Google which implies many
tests were broken outside of Google as well. The tests may be brittle
but still would require work to change and it's not clear it's worth
the benefit.
Updates #36221Fixes#42675
Change-Id: Id3a14eb37e7119f5abe50e80dfbf120fdc44db72
Reviewed-on: https://go-review.googlesource.com/c/go/+/273747
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Trust: Joe Tsai <thebrokentoaster@gmail.com>
Fixes the check for the reserved namespace prefix
"xml" to be case insensitive, so as to match all variants of:
(('X'|'x')('M'|'m')('L'|'l'))
as mandated by Section 2.3 of https://www.w3.org/TR/REC-xml/
This is a roll forward of CL 203417, which was rolled back by CL 240179.
We've decided that the roll back was incorrect, and any broken tests
should be fixed.
The original CL 203417 was by Tamás Gulácsi.
Fixes#35151
For #39876
Change-Id: I2e6daa7aeb252531fba0b8a56086613e13059528
Reviewed-on: https://go-review.googlesource.com/c/go/+/264024
Trust: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Damien Neil <dneil@google.com>
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
The other named errors - UnmarshalTypeError, etc - in this package do
the same, so we should prepend the package prefix to error messages
for consistency.
Add a note to the release docs in case this is interpreted as
a breaking change.
Fixes#36221.
Change-Id: Ie24b532bbf9812e108c259fa377e2a6b64319ed4
Reviewed-on: https://go-review.googlesource.com/c/go/+/263619
Run-TryBot: Kevin Burke <kev@inburke.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Kevin Burke <kev@inburke.com>
Trust: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
The old ioutil references are still valid, but update our code
to reflect best practices and get used to the new locations.
Code compiled with the bootstrap toolchain
(cmd/asm, cmd/dist, cmd/compile, debug/elf)
must remain Go 1.4-compatible and is excluded.
Also excluded vendored code.
For #41190.
Change-Id: I6d86f2bf7bc37a9d904b6cee3fe0c7af6d94d5b1
Reviewed-on: https://go-review.googlesource.com/c/go/+/263142
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
This change clarifies the usage of the SET type name suffix. Previously
the documentation was somewhat confusing about where the suffix should
be used, and when used what it applied to. For instance the previous
language could be interpreted such that []exampleSET would be parsed as
a SEQUENCE OF SET, which is incorrect as the SET suffix only applies to
slice types, such as type exampleSET []struct{} which is parsed as a
SET OF SEQUENCE.
Change-Id: I74201d9969f931f69391c236559f66cb460569ec
GitHub-Last-Rev: d0d2ddc587
GitHub-Pull-Request: golang/go#38543
Reviewed-on: https://go-review.googlesource.com/c/go/+/229078
Trust: Roland Shoemaker <roland@golang.org>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
Changes Unmarshal to return an error, instead of
panicking when its value is nil or not a pointer.
This change matches the behavior of other encoding
packages like json.
Fixes#41509.
Change-Id: I92c3af3a960144566e4c2b55d00c3a6fe477c8d5
GitHub-Last-Rev: c668b6e4ad
GitHub-Pull-Request: golang/go#41485
Reviewed-on: https://go-review.googlesource.com/c/go/+/255881
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
Trust: Emmanuel Odeke <emm.odeke@gmail.com>
Now reports an error if cyclic maps and slices are to be encoded
instead of an infinite recursion. This case wasn't handled in CL 187920.
Fixes#40745.
Change-Id: Ia34b014ecbb71fd2663bb065ba5355a307dbcc15
GitHub-Last-Rev: 6f874944f4
GitHub-Pull-Request: golang/go#40756
Reviewed-on: https://go-review.googlesource.com/c/go/+/248358
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Trust: Daniel Martí <mvdan@mvdan.cc>
Trust: Bryan C. Mills <bcmills@google.com>
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Go Bot <gobot@golang.org>
Allow ';' as a valid character for json field keys and struct tags.
Fixes#39189
Change-Id: I4b602a1b0674ff028db07623682f0d1e8e9fd6c9
Reviewed-on: https://go-review.googlesource.com/c/go/+/234818
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Giovanni Bajo <rasky@develer.com>
Trust: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
This reverts CL 253037.
Reason for revert: The recommended way to check for a type of error is errors.As. API changes should also start with a proposal.
Change-Id: I62896717aa47ed491c2c4775d2b05d80e5e9cde3
Reviewed-on: https://go-review.googlesource.com/c/go/+/254837
Trust: Damien Neil <dneil@google.com>
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
This reverts CL 254537.
Reason for revert: Reason for revert: The recommended way to check for a type of error is errors.As. API changes should also start with a proposal.
Change-Id: I07c37428575e99c80b17525833a61831d10963bb
Reviewed-on: https://go-review.googlesource.com/c/go/+/254857
Trust: Damien Neil <dneil@google.com>
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
Allows users to check:
errors.Is(err, &UnmarshalTypeError{})
errors.Is(err, &UnmarshalFieldError{})
errors.Is(err, &InvalidUnmarshalError{})
errors.Is(err, &UnsupportedValueError{})
errors.Is(err, &MarshalerError{})
which is the recommended way of checking for kinds of errors.
SyntaxError.Is was implemented in CL 253037.
As and Unwrap relevant methods will be added in future CLs.
Change-Id: I1f8a503b8fdc0f3afdfe9669a91f3af8d960e028
GitHub-Last-Rev: 930cda5384
GitHub-Pull-Request: golang/go#41360
Reviewed-on: https://go-review.googlesource.com/c/go/+/254537
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
Trust: Emmanuel Odeke <emm.odeke@gmail.com>
Allows users to check:
errors.Is(err, &json.SyntaxError{})
which is the recommended way of checking for kinds of errors.
Change-Id: I20dc805f20212765e9936a82d9cb7822e73ec4ef
GitHub-Last-Rev: e2627ccf8e
GitHub-Pull-Request: golang/go#41210
Reviewed-on: https://go-review.googlesource.com/c/go/+/253037
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Replace strings.Split by strings.IndexByte and explicit
slicing to avoid the allocation of the return slice
of strings.Split.
name old time/op new time/op delta
Marshal 43.3µs ± 1% 36.7µs ± 1% -15.23% (p=0.000 n=9+9)
name old alloc/op new alloc/op delta
Marshal 10.7kB ± 0% 9.2kB ± 0% -13.96% (p=0.000 n=10+10)
name old allocs/op new allocs/op delta
Marshal 444 ± 0% 366 ± 0% -17.57% (p=0.000 n=10+10)
Change-Id: I9e727defa23f7e5fc684f246de0136fe28cf8d25
Reviewed-on: https://go-review.googlesource.com/c/go/+/231738
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This CL ensures that ReadUvarint consumes only a limited
amount of input (instead of an unbounded amount).
On some inputs, ReadUvarint could read an arbitrary number
of bytes before deciding to return an overflow error.
After this CL, ReadUvarint returns that same overflow
error sooner, after reading at most MaxVarintLen64 bytes.
Fix authored by Robert Griesemer and Filippo Valsorda.
Thanks to Diederik Loerakker, Jonny Rhea, Raúl Kripalani,
and Preston Van Loon for reporting this.
Fixes#40618
Fixes CVE-2020-16845
Change-Id: Ie0cb15972f14c38b7cf7af84c45c4ce54909bb8f
Reviewed-on: https://team-review.git.corp.google.com/c/golang/go-private/+/812099
Reviewed-by: Filippo Valsorda <valsorda@google.com>
Reviewed-on: https://go-review.googlesource.com/c/go/+/247120
Run-TryBot: Katie Hockman <katie@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Alexander Rakoczy <alex@golang.org>
This reverts https://golang.org/cl/191783.
Reason for revert: Broke too many programs which depended on the previous
behavior, even when it was the opposite of what the documentation said.
We can attempt to fix the original issue again for 1.16, while keeping
those programs in mind.
Fixes#39427.
Change-Id: I7a7f24b2a594c597ef625aeff04fff29aaa88fc6
Reviewed-on: https://go-review.googlesource.com/c/go/+/240657
Run-TryBot: Dmitri Shuralyov <dmitshur@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
This reverts CL 203417.
Reason for revert: This change changes uses of tags like "XMLSchema-instance" without any recourse.
For #35151Fixes#39876
Change-Id: I4c85c8267a46b3748664b5078794dafffb42aa26
Reviewed-on: https://go-review.googlesource.com/c/go/+/240179
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
Reviewed-by: Andrew Bonventre <andybons@golang.org>
This reverts golang.org/cl/190659 and golang.org/cl/226218, minus the
regression tests in the latter.
The original work happened in golang.org/cl/151157, which was reverted
in golang.org/cl/190909 due to a crash found by fuzzing.
We tried a second time in golang.org/cl/190659, which shipped with Go
1.14. A bug was found, where strings would be mangled in certain edge
cases. The fix for that was golang.org/cl/226218, which was backported
into Go 1.14.4.
Unfortunately, a second regression was just reported in #39555, which is
a similar case of strings getting mangled when decoding under certain
conditions. It would be possible to come up with another small patch to
fix that edge case, but instead, let's just revert the entire
optimization, as it has proved to do more harm than good. Moreover, it's
hard to argue or prove that there will be no more such regressions.
However, all the work wasn't for nothing. First, we learned that the way
the decoder unquotes tokenized strings isn't simple; initially, we had
wrongly assumed that each string was unquoted exactly once and in order.
Second, we have gained a number of regression tests which will be useful
to prevent the same mistakes in the future, including the test cases we
add in this CL.
Fixes#39555.
Change-Id: I66a6919c2dd6d9789232482ba6cf3814eaa70f61
Reviewed-on: https://go-review.googlesource.com/c/go/+/237838
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Andrew Bonventre <andybons@golang.org>
fieldInfo.value used to initialize nil anonymous struct fields if they
were encountered. This behavior is wanted when decoding, but not when
encoding. When encoding, the value should never be modified, and these
nil fields should be skipped entirely.
To fix the bug, add a bool argument to the function which tells the
code whether we are encoding or decoding.
Finally, add a couple of tests to cover the edge cases pointed out in
the original issue.
Fixes#27240.
Change-Id: Ic97ae4bfe5f2062c8518e03d1dec07c3875e18f6
Reviewed-on: https://go-review.googlesource.com/c/go/+/196809
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
This reverts golang.org/cl/179337.
Reason for revert: broke a few too many reasonably valid Go programs.
The previous behavior was perhaps less consistent, but the docs were
never very clear about when the decoder merges with existing values,
versus replacing existing values altogether.
Fixes#39149.
Change-Id: I1c1d857709b8398969fe421aa962f6b62f91763a
Reviewed-on: https://go-review.googlesource.com/c/go/+/234559
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Andrew Bonventre <andybons@golang.org>
Specifically, this change documents the behavior of Unmarshal when a
SEQUENCE contains trailing elements.
For context Unmarshal treats trailing elements of a SEQUENCE that do not
have matching struct fields as valid, as this is how ASN.1 structures
are typically extended. This can be somewhat confusing as you might
expect those elements to be appended to rest, but rest is really only
for trailing data unrelated to the structure being parsed (i.e. if you
append a second sequence to b, it would be returned in rest).
Fixes#35680
Change-Id: Ia2c68b2f7d8674d09e859b4b7f9aff327da26fa0
Reviewed-on: https://go-review.googlesource.com/c/go/+/233537
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Katie Hockman <katie@golang.org>
When we decode into a struct, each input key-value may be decoded into
one of the struct's fields. Particularly, existing data isn't dropped,
so that some sub-fields can be decoded into without zeroing all other
data.
However, decoding into a map behaved in the opposite way. Whenever a
key-value was decoded, it completely replaced the previous map element.
If the map contained any non-zero data in that key, it's dropped.
Instead, try to reuse the existing element value if possible. If the map
element type is a pointer, and the value is non-nil, we can decode
directly into it. If it's not a pointer, make a copy and decode into
that copy, as map element values aren't addressable.
This means we have to parse and convert the map element key before the
value, to be able to obtain the existing element value. This is fine,
though. Moreover, reporting errors on the key before the value follows
the input order more closely.
Finally, add a test to explore the four combinations, involving pointer
and non-pointer, and non-zero and zero values. A table-driven test
wasn't used, as each case required different checks, such as checking
that the non-nil pointer case doesn't end up with a different pointer.
Fixes#31924.
Change-Id: I5ca40c9963a98aaf92f26f0b35843c021028dfca
Reviewed-on: https://go-review.googlesource.com/c/go/+/179337
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
The added comment contains some context. The original optimization
assumed that each call to unquoteBytes (or unquote) followed its
corresponding call to rescanLiteral. Otherwise, unquoting a literal
might use d.safeUnquote from another re-scanned literal.
Unfortunately, this assumption is wrong. When decoding {"foo": "bar"}
into a map[T]string where T implements TextUnmarshaler, the sequence of
calls would be as follows:
1) rescanLiteral "foo"
2) unquoteBytes "foo"
3) rescanLiteral "bar"
4) unquoteBytes "foo" (for UnmarshalText)
5) unquoteBytes "bar"
Note that the call to UnmarshalText happens in literalStore, which
repeats the work to unquote the input string literal. But, since that
happens after we've re-scanned "bar", we're using the wrong safeUnquote
field value.
In the added test case, the second string had a non-zero number of safe
bytes, and the first string had none since it was all non-ASCII. Thus,
"safely" unquoting a number of the first string's bytes could cut a rune
in half, and thus mangle the runes.
A rather simple fix, without a full revert, is to only allow one use of
safeUnquote per call to unquoteBytes. Each call to rescanLiteral when
we have a string is soon followed by a call to unquoteBytes, so it's no
longer possible for us to use the wrong index.
Also add a test case from #38126, which is the same underlying bug, but
affecting the ",string" option.
Before the fix, the test would fail, just like in the original two issues:
--- FAIL: TestUnmarshalRescanLiteralMangledUnquote (0.00s)
decode_test.go:2443: Key "开源" does not exist in map: map[开���:12345开源]
decode_test.go:2458: Unmarshal unexpected error: json: invalid use of ,string struct tag, trying to unmarshal "\"aaa\tbbb\"" into string
Fixes#38105.
For #38126.
Change-Id: I761e54924e9a971a4f9eaa70bbf72014bb1476e6
Reviewed-on: https://go-review.googlesource.com/c/go/+/226218
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
golang.org/cl/193604 fixed one bug when one encodes a string with the
",string" option: if SetEscapeHTML(false) is used, we should not be
using HTML escaping for the inner string encoding. The CL correctly
fixed that.
The CL also tried to speed up this edge case. By avoiding an entire new
call to Marshal, the new Issue34127 benchmark reduced its time/op by
45%, and lowered the allocs/op from 3 to 2.
However, that last optimization wasn't correct:
Since Go 1.2 every string can be marshaled to JSON without error
even if it contains invalid UTF-8 byte sequences. Therefore
there is no need to use Marshal again for the only reason of
enclosing the string in double quotes.
JSON string encoding isn't just about adding quotes and taking care of
invalid UTF-8. We also need to escape some characters, like tabs and
newlines.
The new code failed to do that. The bug resulted in the added test case
failing to roundtrip properly; before our fix here, we'd see an error:
invalid use of ,string struct tag, trying to unmarshal "\"\b\f\n\r\t\"\\\"" into string
If you pay close attention, you'll notice that the special characters
like tab and newline are only encoded once, not twice. When decoding
with the ",string" option, the outer string decode works, but the inner
string decode fails, as we are now decoding a JSON string with unescaped
special characters.
The fix we apply here isn't to go back to Marshal, as that would
re-introduce the bug with SetEscapeHTML(false). Instead, we can use a
new encode state from the pool - it results in minimal performance
impact, and even reduces allocs/op further. The performance impact seems
fair, given that we need to check the entire string for characters that
need to be escaped.
name old time/op new time/op delta
Issue34127-8 89.7ns ± 2% 100.8ns ± 1% +12.27% (p=0.000 n=8+8)
name old alloc/op new alloc/op delta
Issue34127-8 40.0B ± 0% 32.0B ± 0% -20.00% (p=0.000 n=8+8)
name old allocs/op new allocs/op delta
Issue34127-8 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=8+8)
Instead of adding another standalone test, we convert an existing
"string tag" test to be table-based, and add another test case there.
One test case from the original CL also had to be amended, due to the
same problem - when escaping '<' due to SetEscapeHTML(true), we need to
end up with double escaping, since we're using ",string".
Fixes#38173.
Change-Id: I2b0df9e4f1d3452fff74fe910e189c930dde4b5b
Reviewed-on: https://go-review.googlesource.com/c/go/+/226498
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Per X690 Section 11.6 sort the order of SET of components when generating
DER. This CL makes no changes to Unmarshal, meaning unordered components
will still be accepted, and won't be re-ordered during parsing.
In order to sort the components a new encoder, setEncoder, which is similar
to multiEncoder is added. The functional difference is that setEncoder
encodes each component to a [][]byte, sorts the slice using a sort.Sort
interface, and then writes it out to the destination slice. The ordering
matches the output of OpenSSL.
Fixes#24254
Change-Id: Iff4560f0b8c2dce5aae616ba30226f39c10b972e
GitHub-Last-Rev: e52fc43658
GitHub-Pull-Request: golang/go#38228
Reviewed-on: https://go-review.googlesource.com/c/go/+/226984
Reviewed-by: Filippo Valsorda <filippo@golang.org>
Run-TryBot: Filippo Valsorda <filippo@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reject base 128 encoded integers that aren't using minimal encoding,
specifically if the leading octet of an encoded integer is 0x80. This
only affects parsing of tags and OIDs, both of which expect this
encoding (see X.690 8.1.2.4.2 and 8.19.2).
Fixes#36881
Change-Id: I969cf48ac1fba7e56bac334672806a0784d3e123
GitHub-Last-Rev: fefc03d202
GitHub-Pull-Request: golang/go#38281
Reviewed-on: https://go-review.googlesource.com/c/go/+/227320
Reviewed-by: Filippo Valsorda <filippo@golang.org>
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The previous behavior directly contradicted the docs that have been in
place for years:
To unmarshal a JSON array into a slice, Unmarshal resets the
slice length to zero and then appends each element to the slice.
We could use reflect.New to create a new element and reflect.Append to
then append it to the destination slice, but benchmarks have shown that
reflect.Append is very slow compared to the code that manually grows a
slice in this file.
Instead, if we're decoding into an element that came from the original
backing array, zero it before decoding into it. We're going to be using
the CodeDecoder benchmark, as it has a slice of struct pointers that's
decoded very often.
Note that we still reuse existing values from arrays being decoded into,
as the documentation agrees with the existing implementation in that
case:
To unmarshal a JSON array into a Go array, Unmarshal decodes
JSON array elements into corresponding Go array elements.
The numbers with the benchmark as-is might seem catastrophic, but that's
only because the benchmark is decoding into the same variable over and
over again. Since the old decoder was happy to reuse slice elements, it
would save a lot of allocations by not having to zero and re-allocate
said elements:
name old time/op new time/op delta
CodeDecoder-8 10.4ms ± 1% 10.9ms ± 1% +4.41% (p=0.000 n=10+10)
name old speed new speed delta
CodeDecoder-8 186MB/s ± 1% 178MB/s ± 1% -4.23% (p=0.000 n=10+10)
name old alloc/op new alloc/op delta
CodeDecoder-8 2.19MB ± 0% 3.59MB ± 0% +64.09% (p=0.000 n=10+10)
name old allocs/op new allocs/op delta
CodeDecoder-8 76.8k ± 0% 92.7k ± 0% +20.71% (p=0.000 n=10+10)
We can prove this by moving 'var r codeResponse' into the loop, so that
the benchmark no longer reuses the destination pointer. And sure enough,
we no longer see the slow-down caused by the extra allocations:
name old time/op new time/op delta
CodeDecoder-8 10.9ms ± 0% 10.9ms ± 1% -0.37% (p=0.043 n=10+10)
name old speed new speed delta
CodeDecoder-8 177MB/s ± 0% 178MB/s ± 1% +0.37% (p=0.041 n=10+10)
name old alloc/op new alloc/op delta
CodeDecoder-8 3.59MB ± 0% 3.59MB ± 0% ~ (p=0.780 n=10+10)
name old allocs/op new allocs/op delta
CodeDecoder-8 92.7k ± 0% 92.7k ± 0% ~ (all equal)
I believe that it's useful to leave the benchmarks as they are now,
because the decoder does reuse memory in some cases. For example,
existing map elements are reused. However, subtle changes like this one
need to be benchmarked carefully.
Finally, add a couple of tests involving both a slice and an array of
structs.
Fixes#21092.
Change-Id: I8b1194f25e723a31abd146fbfe9428ac10c1389d
Reviewed-on: https://go-review.googlesource.com/c/go/+/191783
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Fixes the check for the reserved namespace prefix
"xml" to be case insensitive, so as to match all variants of:
(('X'|'x')('M'|'m')('L'|'l'))
as mandated by Section 2.3 of https://www.w3.org/TR/REC-xml/Fixes#35151.
Change-Id: Id5a98e5f9d69d3741dc16f567c4320f1ad0b3c70
Reviewed-on: https://go-review.googlesource.com/c/go/+/203417
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Change-Id: I1fd47e5eab27346cec488098d4f6102a0749bd28
Reviewed-on: https://go-review.googlesource.com/c/go/+/221788
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
This CL changes some unit test functions, making sure that these tests (and goroutines spawned during test) won't block.
Since they are just test functions, I use one CL to fix them all. I hope this won't cause trouble to reviewers and can save time for us.
There are three main categories of incorrect logic fixed by this CL:
1. Use testing.Fatal()/Fatalf() in spawned goroutines, which is forbidden by Go's document.
2. Channels are used in such a way that, when errors or timeout happen, the test will be blocked and never return.
3. Channels are used in such a way that, when errors or timeout happen, the test can return but some spawned goroutines will be leaked, occupying resource until all other tests return and the process is killed.
Change-Id: I3df931ec380794a0cf1404e632c1dd57c65d63e8
Reviewed-on: https://go-review.googlesource.com/c/go/+/219380
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Instead use string(r) where r has type rune.
This is in preparation for a vet warning for string(i).
Updates #32479
Change-Id: Ic205269bba1bd41723950219ecfb67ce17a7aa79
Reviewed-on: https://go-review.googlesource.com/c/go/+/220844
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Akhil Indurti <aindurti@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Toshihiro Shiino <shiino.toshihiro@gmail.com>