the need to decompose characters for the majority of cases. This considerably
speeds up collation while increasing the table size minimally.
To detect non-normalized strings, rather than relying on exp/norm, the table
now includes CCC information. The inclusion of this information does not
increase table size.
DETAILS
- Raw collation elements are now a struct that includes the CCC, rather
than a slice of ints.
- Builder now ensures that NFD and NFC counterparts are included in the table.
This also fixes a bug for Korean which is responsible for most of the growth
of the table size.
- As there is no more normalization step, code should now handle both strings
and byte slices as input. Introduced source type to facilitate this.
NOTES
- This change does not handle normalization correctly entirely for contractions.
This causes a few failures with the regtest. table_test.go contains a few
uncommented tests that can be enabled once this is fixed. The easiest is to
fix this once we have the new norm.Iter.
- Removed a test cases in table_test that covers cases that are now guaranteed
to not exist.
R=rsc, mpvl
CC=golang-dev
https://golang.org/cl/6971044
compare incrementally. Also modified collation API to be more high-level
by removing the need for an explicit buffer to be passed as an argument.
This considerably speeds up Compare and CompareString. This change also eliminates
the need to reinitialize the normalization buffer for each use of an iter. This
also significantly improves performance for Key and KeyString.
R=r, rsc
CC=golang-dev
https://golang.org/cl/6842050
key and simple comparisson. Search is not yet implemented in this CL.
Changed some of the types of table_test.go to allow reuse in the new test.
Also reduced number of primary values for illegal runes to 1 (both map to
the same).
R=r
CC=golang-dev
https://golang.org/cl/6202062