- Changed Check signature to take function parameters for
more flexibility: Now a client can interrupt type checking
early (via panic in one the upcalls) once the desired
type information or number of errors is reached. Default
use is still simple.
- Cleaned up main typechecking loops. Now does not neglect
_ declarations anymore.
- Various other cleanups.
R=golang-dev, r, rsc
CC=golang-dev
https://golang.org/cl/6612049
This code has been reviewed before. The most significant
change is to check_test which now can handle more than
one error at the same error position (due to spurious
errors - should not happen in praxis once error handling
has been fine-tuned). This change makes check_test easier
to use during development.
R=rsc
CC=golang-dev
https://golang.org/cl/6584057
This code relies on some functions that are not yet in staging,
but it get's harder to keep all this in sync in a piece-meal
fashion.
R=rsc
CC=golang-dev
https://golang.org/cl/6492124
More pieces of the typechecker code:
- Operands are temporary objects representing an expressions's
type and value (for constants). An operand is the equivalent of
an "attribute" in attribute grammars except that it's not stored
but only passed around during type checking.
- Constant operations are implemented in const.go. Constants are
represented as bool (booleans), int64 and *big.Int (integers),
*big.Rat (floats), complex (complex numbers), and string (strings).
- Error reporting is consolidated in errors.go. Only the first
dozen of lines is new code, the rest of the file contains the
exprString and typeString functions formerly in two separate
files (which have been removed).
This is a replacement CL for 6492101 (which was created without
proper use of hg).
R=rsc, r
CC=golang-dev
https://golang.org/cl/6500114
various implementation of collation. The tool provides commands for soring,
regressing one implementation against another, and benchmarking.
Currently it includes collation implementations for the Go collator, ICU,
and one using Darwin's CoreFoundation framework.
To avoid building this tool in the default build, the colcmp tag has been
added to all files. This allows other tools/colcmp in this directory (e.g. it may make
sense to move maketables here) to be put in this directory as well.
R=r, rsc, mpvl
CC=golang-dev
https://golang.org/cl/6496118
instead of variables. Several reasons:
- Encourage users of the API to minimize the number of creations and reuse Collate objects.
- Don't rule out the possibility of using initialization code for collators. For some locales
it will be possible to have very compact representations that can be quickly expanded
into a proper table on demand.
Other changes:
- Change name of root* vars to main*, as the tables are shared between locales.
- Added Locales() method to get a list of supported locales.
R=r
CC=golang-dev
https://golang.org/cl/6498107
This fixes a problem with ELF tools thinking they know the
format of the symbol table, as we do not use any of the
standard formats for that table.
This change will probably annoy the Plan 9 users, but I
believe there are other incompatibilities already that mean
they have to use a Go-specific nm.
Fixes#3473.
R=golang-dev, iant
CC=golang-dev
https://golang.org/cl/6500117
First set of type checker files for review.
The primary concern here is the typechecker
API (types.go).
R=rsc, adonovan, r, rogpeppe
CC=golang-dev
https://golang.org/cl/6490089
Refactored build + buildTrie into build + buildOrdering.
Note that since the tailoring code is not checked in yet, all tailorings are identical
to root. The table therefore should not and does not grow at this point.
R=r
CC=golang-dev
https://golang.org/cl/6500087
for both locale-specific exemplar characters and tailorings to
the collation table.
Some specifices:
- Moved stringSet to the beginning of the file and added some functionality
to parse command line files.
- openReader now modifies the input URL for localFiles to guarantee that
any http source listed in the generated file is indeed this source.
- Note that the implementation of the Tailoring API used by maketables.go
is not yet checked in. So for now adding tailorings are simply no-ops.
- The generated file of exemplar characters will be used somewhere else.
Here is a snippet of how the body of the generated file looks like:
type exemplarType int
const (
exCharacters exemplarType = iota
exContractions
exPunctuation
exAuxiliary
exCurrency
exIndex
exN
)
var exemplarCharacters = map[string][exN]string{
"af": [exN]string{
0: "a á â b c d e é è ê ë f g h i î ï j k l m n o ô ö p q r s t u û v w x y z",
3: "á à â ä ã æ ç é è ê ë í ì î ï ó ò ô ö ú ù û ü ý",
4: "a b c d e f g h i j k l m n o p q r s t u v w x y z",
},
...
}
R=r
CC=golang-dev
https://golang.org/cl/6501070
- Elements in the array are now sorted as a linked list. This makes it easier to
apply tailorings.
- Added code to sort entries by collation elements.
- Added logical reset points. This is used for tailoring relative to certain
properties, rather than characters.
NOTE: all code for type entry should now be in order.go. To keep the diffs for
this CL reasonable, though, the existing code is left in builder.go. I'll move
this in a separate CL.
R=r
CC=golang-dev
https://golang.org/cl/6493063
Also rename Node.{Add,Remove} to Node.{AppendChild,RemoveChild} to
be consistent with the DOM.
benchmark old ns/op new ns/op delta
BenchmarkParser 4042040 3749618 -7.23%
benchmark old MB/s new MB/s speedup
BenchmarkParser 19.34 20.85 1.08x
BenchmarkParser mallocs per iteration is also:
10495 before / 7992 after
R=andybalholm, r, adg
CC=golang-dev
https://golang.org/cl/6495061
In the regtest data, surrogates are assigned primary weights based on
the surrogate code point value. Go now converts surrogates to FFFD, however,
meaning that the primary weight is based on this code point instead.
This change drops tests with surrogates and lets the tests pass.
R=r
CC=golang-dev
https://golang.org/cl/6461100
with table changes.
NOTE: there is no test for this, but 1) the code has now the same
control flow as scan in exp/locale/collate/contract.go, which is
tested and 2) Builder verifies the generated table so bugs in this
code are quickly and easily found (which is how this bug was discovered).
R=r
CC=golang-dev
https://golang.org/cl/6461082
The main table will need to get a slightly different collation table as the one
used by regtest, as the regtest is based on the standard UCA DUCET, while
the locale-specific tables are all based on a CLDR root table.
This change allows changing the table without affecting the regression test.
R=r
CC=golang-dev
https://golang.org/cl/6453089
Now that the parser passes all tests in the test suite,
it is no longer necessary to keep track of which tests
pass and which don't. So remove the testlogs directory
and the code that uses it.
R=nigeltao
CC=golang-dev
https://golang.org/cl/6453124
All of the remaining tests that had as status of PARSE rather than PASS had
good reasons for not passing the render-and-reparse step: the correct parse tree is
badly formed, so when it is rendered out as HTML, the result doesn't parse into the
same tree. So add them to the list of tests where that step is skipped.
Also, I discovered that it is possible to end up with HTML elements (not just text)
inside a raw text element through reparenting. So change the rendering routines to
handle that situation as sensibly as possible (which still isn't very sensible, but
this is HTML5).
R=nigeltao
CC=golang-dev
https://golang.org/cl/6446137
When generating replacement elements for an <isindex> tag, the old
addSyntheticElement method was producing the wrong nesting. Replace
it with parseImpliedToken.
Pass the one remaining test in the test suite.
R=nigeltao
CC=golang-dev
https://golang.org/cl/6453114
If a tag doesn't have a closing '>', it isn't considered a tag;
it is just ignored and EOF is returned instead.
Pass one additional test in the test suite.
Change tokenizer tests to match correct behavior.
R=nigeltao
CC=golang-dev
https://golang.org/cl/6454131
In HTML content, having a self-closing tag is a parse error unless
the tag would be self-closing anyway (like <img>). The only place a
self-closing tag actually makes a difference is in XML-based foreign
content.
Pass 1 additional test.
R=nigeltao
CC=golang-dev
https://golang.org/cl/6450109
Also simplified parsing of interface
types since they can only contain
methods (and no embedded interfaces)
in the export data.
R=rsc
CC=golang-dev
https://golang.org/cl/6446084
If a table contained whitespace, text nodes would not get foster parented
correctly.
Pass 1 additional test.
R=nigeltao
CC=golang-dev
https://golang.org/cl/6459054
The <title> element was getting removed from the stack of open elements,
when its parent, the <head> element should have been removed instead.
Pass 2 additional tests.
R=nigeltao
CC=golang-dev
https://golang.org/cl/6449101
When an element (like <nobr> or <p>) was implicitly closed by another
start tag, it would keep foster parenting from working because
the check for what was on top of the stack of open elements was
in the wrong place.
Move the check to addChild.
Pass 2 additional tests.
R=nigeltao
CC=golang-dev
https://golang.org/cl/6460045
Since NUL usually terminates strings in underlying syscalls, allowing
it when converting string arguments is a security risk, especially
when dealing with filenames. For example, a program might reason that
filename like "/root/..\x00/" is a subdirectory or "/root/" and allow
access to it, while underlying syscall will treat "\x00" as an end of
that string and the actual filename will be "/root/..", which might
be unexpected. Returning EINVAL when string arguments have NUL in
them makes sure this attack vector is unusable.
R=golang-dev, r, bradfitz, fullung, rsc, minux.ma
CC=golang-dev
https://golang.org/cl/6458050
The content of an HTML <title> element is RCDATA, but the content of an SVG
<title> element is parsed as tags. Now the parser doesn't go into RCDATA
mode in foreign content.
Pass 4 additional tests.
R=nigeltao
CC=golang-dev
https://golang.org/cl/6448111
for dealing with CLDR files:
- Add now taxes a list of indexes of colelems that are variables. Checking and
handling is now done by the Builder. VariableTop is now also properly generated
using the Build method.
- Introduced separate Builder, called Tailoring, for creating tailorings of root
table. This clearly separates the functionality for building a table based on
weights (the allkeys* files) versus tables based on LDML XML files.
- Tailorings are now added by two calls instead of one: SetAnchor and Insert.
This more closely reflects the structure of LDML side and simplifies the
implementation of both the client and library side. It also preserves
some information that is otherwise hard to recover for the Builder.
- Allow the LDML XML element extend to be passed to Insert. This simplifies
both client and library implementation.
R=r
CC=golang-dev
https://golang.org/cl/6454061
Process a package's object in a reproducible
order (rather then in map order) so that we
get error messages in reproducible order.
R=r
CC=golang-dev
https://golang.org/cl/6449076
The text inside <script> tags is not ordinary raw text; there are all sorts
of other complications. This CL implements those complications.
Pass 76 additional tests.
R=nigeltao
CC=golang-dev
https://golang.org/cl/6443070
This is more in sync with the rest of the package;
for instance, we have functions (not methods) to
deref or find the underlying type of a Type.
In the process use a single bytes.Buffer to create
the string representation for a type rather than
the (occasional) string concatenation.
R=r
CC=golang-dev
https://golang.org/cl/6458057