It is slightly wasteful to check that r is not a surrogate
code point, because RuneError is never a surrogate code point.
LGTM=iant
R=golang-codereviews, iant
CC=golang-codereviews
https://golang.org/cl/79230043
These test cases are redundant because TestSimpleFold tests
all possible rotations of test data, so no need to add
rotated strings.
Also updated the comment as it's guaranteed that SimpleFold
returns values in increasing order.
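
As a minimal sketch of the ordering guarantee mentioned above, the following walks the case-folding orbit of a rune by calling unicode.SimpleFold repeatedly until it wraps around. The helper name foldOrbit is hypothetical and is not part of the package or of this CL.

```go
package main

import (
	"fmt"
	"unicode"
)

// foldOrbit (hypothetical helper) returns every rune in r's case-folding
// orbit, starting at r and following unicode.SimpleFold until the orbit
// wraps back around to r.
func foldOrbit(r rune) []rune {
	orbit := []rune{r}
	for next := unicode.SimpleFold(r); next != r; next = unicode.SimpleFold(next) {
		orbit = append(orbit, next)
	}
	return orbit
}

func main() {
	// 'K' folds to 'k', then to U+212A KELVIN SIGN, then wraps back to 'K'.
	for _, r := range foldOrbit('K') {
		fmt.Printf("%U\n", r)
	}
}
```

Apart from the final wrap-around, each step moves to a larger code point, which is why testing every rotation of one orbit makes separate rotated test strings unnecessary.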
LGTM=bradfitz
R=golang-codereviews, bradfitz
CC=golang-codereviews
https://golang.org/cl/77730043
This is a relatively minor change.
This does not result in changes to go.text/unicode/norm. The go.text
packages will therefore be relatively unaffected. It does pave the
way for an upgrade to CLDR 24, though.
The tests of all.bash pass, as well as the tests in go.text after
this update.
LGTM=r
R=r
CC=golang-codereviews
https://golang.org/cl/65400044
If a LowerUpper ever happens, maketables will complain.
Fixes #7002.
LGTM=dave
R=golang-codereviews, dave
CC=golang-codereviews
https://golang.org/cl/59210044
The EncodeRune test exercises DecodeRune, but only for runes that it can encode. Add an explicit test for invalid utf16 surrogate pairs.
Bonus: coverage is now 100%
unicode/utf16/utf16.go: IsSurrogate 100.0%
unicode/utf16/utf16.go: DecodeRune 100.0%
unicode/utf16/utf16.go: EncodeRune 100.0%
unicode/utf16/utf16.go: Encode 100.0%
unicode/utf16/utf16.go: Decode 100.0%
total: (statements) 100.0%
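
For illustration only, here is a small sketch of the behavior such a test would exercise: utf16.DecodeRune returns U+FFFD when the two values do not form a valid high/low surrogate pair. The helper decodePair is hypothetical and is not the test added by this CL.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf16"
)

// decodePair (hypothetical helper) decodes a UTF-16 surrogate pair and
// reports whether the pair was valid. A valid pair always decodes to a
// rune at or above U+10000, so U+FFFD can only mean failure.
func decodePair(r1, r2 rune) (rune, bool) {
	r := utf16.DecodeRune(r1, r2)
	return r, r != unicode.ReplacementChar
}

func main() {
	pairs := [][2]rune{
		{0xD83D, 0xDE00}, // high then low: valid, U+1F600
		{0xDE00, 0xD83D}, // low then high: invalid
		{0xD83D, 'a'},    // high followed by a non-surrogate: invalid
		{'a', 'b'},       // no surrogates at all: invalid
	}
	for _, p := range pairs {
		r, ok := decodePair(p[0], p[1])
		fmt.Printf("%#04x %#04x -> %U valid=%t\n", p[0], p[1], r, ok)
	}
}
```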
R=golang-dev, r
CC=golang-dev
https://golang.org/cl/39150044
The existing function, IsOneOf, is hard to use. Since the slice comes
before the rune, in parallelism with the other Is functions, the slice
is clumsy to build. This CL adds a nicer-signatured In function of
equivalent functionality (its implementation is identical) that's much
easier to use. Compare:
	unicode.IsOneOf([]*unicode.RangeTable{unicode.Letter, unicode.Number}, r)
	unicode.In(r, unicode.Letter, unicode.Number)
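
A runnable sketch of the comparison above, wrapping each call in a small function so the two can be checked against each other. The wrapper names are hypothetical; only unicode.In and unicode.IsOneOf come from the package.

```go
package main

import (
	"fmt"
	"unicode"
)

// isLetterOrNumber uses the variadic In function.
func isLetterOrNumber(r rune) bool {
	return unicode.In(r, unicode.Letter, unicode.Number)
}

// isLetterOrNumberOld uses IsOneOf, which needs an explicit slice
// of range tables built at the call site.
func isLetterOrNumberOld(r rune) bool {
	return unicode.IsOneOf([]*unicode.RangeTable{unicode.Letter, unicode.Number}, r)
}

func main() {
	for _, r := range "x7 é!" {
		fmt.Printf("%q In=%t IsOneOf=%t\n", r, isLetterOrNumber(r), isLetterOrNumberOld(r))
	}
}
```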
R=golang-dev, adg
CC=golang-dev
https://golang.org/cl/11672044
*** There is an API change here: the introduction of the
LatinOffset int in the RangeTable struct. ***
* Avoid checking Latin range multiple times for non-Latin runes.
* Use linear search when it is faster than binary search.
go test -calibrate runs the calibration for where the linear/binary
crossover should be.
benchmark              old MB/s     new MB/s  speedup
BenchmarkFields           36.27        41.43    1.14x
BenchmarkFieldsFunc       36.23        41.38    1.14x
The speedup here is evenly split between the linear scans
and the LatinOffset change. Both are about 1.07x.
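
The linear/binary choice can be sketched roughly as follows. This is an illustration under stated assumptions, not the package's code: range16 is a simplified stand-in for unicode.Range16 (stride always 1), and the linearMax value is assumed for the example rather than taken from the calibration run. The LatinOffset part of the change is not shown.

```go
package main

import "fmt"

// range16 is a hypothetical, simplified stand-in for unicode.Range16:
// an inclusive range of code points with an implied stride of 1.
type range16 struct{ lo, hi uint16 }

// linearMax is an assumed crossover point for this sketch; the real
// value is whatever the calibration finds to be fastest.
const linearMax = 18

// contains reports whether r falls in one of the sorted,
// non-overlapping ranges.
func contains(ranges []range16, r uint16) bool {
	if len(ranges) <= linearMax {
		// Short table: a linear scan avoids the overhead of bisection.
		for _, rg := range ranges {
			if r < rg.lo {
				return false // ranges are sorted, so r cannot appear later
			}
			if r <= rg.hi {
				return true
			}
		}
		return false
	}
	// Long table: binary search over the ranges.
	lo, hi := 0, len(ranges)
	for lo < hi {
		m := lo + (hi-lo)/2
		switch rg := ranges[m]; {
		case r < rg.lo:
			hi = m
		case r > rg.hi:
			lo = m + 1
		default:
			return true
		}
	}
	return false
}

func main() {
	digitsAndUpper := []range16{{'0', '9'}, {'A', 'Z'}}
	fmt.Println(contains(digitsAndUpper, 'Q'), contains(digitsAndUpper, 'q'))
}
```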
R=r
CC=bradfitz, gobot, golang-dev
https://golang.org/cl/6526048
This is required by the spec to produce the replacement char.
The fix lies in lib9's rune code.
R=golang-dev, nigeltao, rsc
CC=golang-dev
https://golang.org/cl/6443109
Surrogate halves are part of UTF-16 and should never appear in UTF-8.
(The rune that two combined halves represent in UTF-16 should
be encoded directly.)
Encoding: encode as RuneError.
Decoding: convert to RuneError, consume one byte.
This requires changing:
	package unicode/utf8
	runtime for range over string
Also added utf8.ValidRune and fixed a bug in utf8.RuneLen.
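
A short sketch of the encoding and decoding rules described above, as observable through package unicode/utf8. The helpers encode and decodeFirst are hypothetical wrappers added for the example.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// encode (hypothetical helper) returns the bytes utf8.EncodeRune writes for r.
func encode(r rune) []byte {
	buf := make([]byte, utf8.UTFMax)
	n := utf8.EncodeRune(buf, r)
	return buf[:n]
}

// decodeFirst (hypothetical helper) decodes the first rune in b and
// returns it with the number of bytes consumed.
func decodeFirst(b []byte) (rune, int) {
	return utf8.DecodeRune(b)
}

func main() {
	const surrogate = 0xD800 // a UTF-16 high surrogate half

	// A surrogate half is not a valid rune to encode.
	fmt.Println(utf8.ValidRune(surrogate)) // false

	// Encoding: the surrogate is written as RuneError (U+FFFD).
	fmt.Printf("% x\n", encode(surrogate)) // ef bf bd

	// Decoding: the three bytes that would spell U+D800 are rejected,
	// yielding RuneError and consuming a single byte.
	r, size := decodeFirst([]byte{0xED, 0xA0, 0x80})
	fmt.Printf("%U %d\n", r, size) // U+FFFD 1
}
```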
Fixes #3927.
R=golang-dev, rsc, bsiegert
CC=golang-dev
https://golang.org/cl/6458099
Surrogates are still admitted, but I have sent mail to golang-dev on that topic.
Fixes #3785.
R=golang-dev, rogpeppe, iant
CC=golang-dev
https://golang.org/cl/6398049
In both the web and command line tool,
the comment is shown after the declaration.
But in the code the comment is obviously before.
Make the text not refer to a specific order.
R=r, dsymonds
CC=golang-dev
https://golang.org/cl/6206094
In the test, verify the copied constants are correct.
Also put the test into package utf16 rather than utf16_test;
the old location was probably due to creating the test from
the utf8 one, but the separation is not needed here.
R=golang-dev, bradfitz, rsc, rsc, r
CC=golang-dev
https://golang.org/cl/5752047
The dependency was there only to pull in two constants.
Now we define them locally and verify equality in the test.
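
The pattern can be sketched as below. Which two constants are involved is an assumption here (the message does not name them); the sketch uses the replacement character and the maximum rune from package unicode, and the local names and the constantsMatch helper are hypothetical.

```go
package main

import (
	"fmt"
	"unicode"
)

// Local copies, so that the non-test code does not need to import
// package unicode. The choice of these two constants is an assumption
// made for this sketch.
const (
	replacementChar = '\uFFFD'
	maxRune         = '\U0010FFFF'
)

// constantsMatch is the kind of check a test would perform: the local
// copies must stay equal to the values exported by package unicode.
func constantsMatch() bool {
	return replacementChar == unicode.ReplacementChar && maxRune == unicode.MaxRune
}

func main() {
	fmt.Println(constantsMatch())
}
```

In a real package only the test file would import unicode, which keeps the dependency out of the package itself.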
R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/5754046
* add -work option to save temporary files (Fixes issue 2980)
* fix go test -i to work with cgo packages (Fixes issue 2936)
* do not overwrite/remove empty directories or non-object
files during build (Fixes issue 2829)
* remove package main vs package non-main heuristic:
a directory must contain only one package (Fixes issue 2864)
* to make last item workable, ignore +build tags for files
named on command line: go build x.go builds x.go even
if it says // +build ignore.
* add // +build ignore tags to helper programs
R=golang-dev, r, r
CC=golang-dev
https://golang.org/cl/5674043
The comment on IsOneOf regarding Latin-1 was an implementation detail:
when the function is called internally, that condition is true. It used to matter,
but now the comment is a dreg. The function works fine if the character is
Latin-1, so we just delete the comment.
Fixes #2966.
R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/5655047
As a convenience to people working on the tools,
leave Makefiles that invoke the go dist tool appropriately.
They are not used during the build.
R=golang-dev, bradfitz, n13m3y3r, gustavo
CC=golang-dev
https://golang.org/cl/5636050
Consequently, remove many package Makefiles,
and shorten the few that remain.
gomake becomes 'go tool make'.
Turn off test phases of run.bash that do not work,
flagged with $BROKEN. Future CLs will restore these,
but this seemed like a big enough CL already.
R=golang-dev, r
CC=golang-dev
https://golang.org/cl/5601057
All but 3 cases (in gcimporter.go and hixie.go)
are automatic conversions using gofix.
No attempt is made to use the new Append functions
even though there are definitely opportunities.
R=golang-dev, gri
CC=golang-dev
https://golang.org/cl/5447069
This contains the files that required handiwork, mostly
Makefiles with updated TARGs, plus the two packages
with modified package names.
html/template/doc.go needs a separate edit pass.
test/fixedbugs/bug358.go is not legal Go, so gofix fails on it.
R=rsc
CC=golang-dev
https://golang.org/cl/5340050
This is Go 1 package renaming CL #4.
This one merely moves the source; the import strings will be
changed after the next weekly release.
This one moves pieces into os, text, and unicode.
exec -> os/exec
scanner -> text/scanner
tabwriter -> text/tabwriter
template -> text/template
template/parse -> text/template/parse
utf16 -> unicode/utf16
utf8 -> unicode/utf8
This should be the last of the source-rearranging CLs.
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/5331066
Makes tables.go output consistent across maketables runs.
(It was already inconsistent across architectures; the new
map iteration order just makes it inconsistent across runs.)
R=r
CC=golang-dev
https://golang.org/cl/5303046
Hurray!
Also fix the mystical U+0345 COMBINING GREEK YPOGEGRAMMENI,
so everyone is satisfied.
Also add a -local flag to use local files for faster turnaround
when debugging.
R=rsc
CC=golang-dev
https://golang.org/cl/4825054