Everybody either gets confused and thinks this is
TrimLeft/TrimRight or does this by hand which gets
repetitive looking.
R=rsc, kevlar
CC=golang-dev
https://golang.org/cl/7239044
ICU and collate package: ICU requires strings to be in FCD form.
Not all NFC strings are in this form, leading to incorrect results.
Change to NFD instead.
R=rsc
CC=golang-dev
https://golang.org/cl/7201043
We explicitly spill all parameters to the frame during initial
SSA construction. (Later passes will remove spills.)
We now properly handle local Allocs escaping via Captures.
Also: allocate BasicBlock.Succs inline.
R=iant, iant
CC=golang-dev
https://golang.org/cl/7231050
Add 'math/big' to blacklist of packages that use shift
operations as yet unsupported by go/types.
(The failure was masked due to local bugfixes in my client.)
R=rsc, bradfitz, bradfitz
CC=golang-dev
https://golang.org/cl/7220057
This CL includes the implementation of Literal, all the
Value.String and Instruction.String methods, the sanity
checker, and other misc utilities.
R=gri, iant, iant
CC=golang-dev
https://golang.org/cl/7199052
It is now possible to run "go test -cpu=1,2,4 std"
successfully.
Fixes#3185.
R=golang-dev, dave, minux.ma, bradfitz
CC=golang-dev
https://golang.org/cl/7196052
further to how (I believe) it will end up being.
It is nicer to separate search from sorting functionality. Collation needs tables that
are not needed by search and vice-versa. The common functionality is separated out
in the Weigher interface. As this interface is very low-level, it will be moved to
a sub package (colltab) in a next CL.
The types that will move to this package are Weigher, Elem, and Level. The addition
of Elem allows for removing some of the duplicate code between collate and collate/build.
This CL also introduces some stubs for a higher-level API for options. The default
proposed options are quite complex and require the user to have a decent understanding
of Unicode collation. The new options hide a lot of the complexity.
R=rsc
CC=golang-dev
https://golang.org/cl/7058051
Calling it will show memory allocation statistics for that
single benchmark (if -test.benchmem is not provided)
R=golang-dev, rsc, kevlar, bradfitz
CC=golang-dev
https://golang.org/cl/7027046
I think that the parser is complete enough to take that warning out.
It passes the test suite.
There may be incompatible API changes, but being in the exp directory
is warning enough for that.
R=nigeltao
CC=golang-dev
https://golang.org/cl/7131050
Completely removed *ast.Objects from being exposed by the
types API. *ast.Objects are still required internally for
resolution, but now the door is open for an internal-only
rewrite of identifier resolution entirely at type-check
time. Once that is done, ASTs can be type-checked whether
they have been created via the go/parser or otherwise,
and type-checking does not require *ast.Object or scope
invariants to be maintained externally.
R=adonovan
CC=golang-dev
https://golang.org/cl/7096048
The existing type checker was relying on augmenting ast.Object
fields (empty interfaces) for its purposes. While this worked
for some time now, it has become increasingly brittle. Also,
the need for package information for Fields and Methods would
have required a new field in each ast.Object. Rather than making
them bigger and the code even more subtle, in this CL we are moving
away from ast.Objects.
The types packge now defines its own objects for different
language entities (Const, Var, TypeName, Func), and they
implement the types.Object interface. Imported packages
create a Package object which holds the exported entities
in a types.Scope of types.Objects.
For type-checking, the current package is still using ast.Objects
to make this transition manageable. In a next step, the type-
checker will also use types.Objects instead, which opens the door
door to resolving ASTs entirely by the type checker. As a result,
the AST and type checker become less entangled, and ASTs can be
manipulated "by hand" or programmatically w/o having to worry
about scope and object invariants that are very hard to maintain.
(As a consequence, a future parser can do less work, and a
future AST will not need to define objects and scopes anymore.
Also, object resolution which is now split across the parser,
the ast, (ast.NewPackage), and even the type checker (for composite
literal keys) can be done in a single place which will be simpler
and more efficient.)
Change details:
- Check now takes a []*ast.File instead of a map[string]*ast.File.
It's easier to handle (I deleted code at all use sites) and does
not suffer from undefined order (which is a pain for testing).
- ast.Object.Data is now a *types.Package rather then an *ast.Scope
if the object is a package (obj.Kind == ast.Pkg). Eventually this
will go away altogether.
- Instead of an ast.Importer, Check now uses a types.Importer
(which returns a *types.Package).
- types.NamedType has two object fields (Obj Object and obj *ast.Object);
eventually there will be only Obj. The *ast.Object is needed during
this transition since a NamedType may refer to either an imported
(using types.Object) or locally defined (using *ast.Object) type.
- ast.NewPackage is not used anymore - there's a local copy for
package-level resolution of imports.
- struct fields now take the package origin into account.
- The GcImporter is now returning a *types.Package. It cannot be
used with ast.NewPackage anymore. If that functionality is still
used, a copy of the old GcImporter should be made locally (note
that GcImporter was part of exp/types and it's API was not frozen).
- dot-imports are not handled for the time being (this will come back).
R=adonovan
CC=golang-dev
https://golang.org/cl/7058060
bytes.Equal is simpler to read and should also be faster because
of short-circuiting and assembly implementations.
Change generated automatically using:
gofmt -r 'bytes.Compare(a, b) == 0 -> bytes.Equal(a, b)'
gofmt -r 'bytes.Compare(a, b) != 0 -> !bytes.Equal(a, b)'
R=golang-dev, dave, adg, rsc
CC=golang-dev
https://golang.org/cl/7038051
This is a just a file move with no other changes
besides the manual import path adjustments in these
two files:
src/pkg/exp/gotype/gotype.go
src/pkg/exp/gotype/gotype_test.go
Note: The go/types API continues to be subject to
possibly significant changes until Go 1.1. Do not
rely on it being stable at this point.
R=adonovan
CC=golang-dev
https://golang.org/cl/7013049
The parser/resolver cannot accurately resolve
composite literal keys that are identifiers;
it needs type information.
Instead, try to resolve them but leave final
judgement to the type checker.
R=adonovan
CC=golang-dev
https://golang.org/cl/6994047
- added Context type for configuration of type checker
- type check all function and method bodies
- (partial) fixes to shift hinting (still not complete)
- revamped test harness - does not rely on specific position
representation anymore, just a standard (compiler) error
message
- lots of bug fixes
R=adonovan, rsc
CC=golang-dev
https://golang.org/cl/6948071
Motivations:
- Simpler UI. Previous API proved a bit awkward for practical purposes.
- Iter is often used in cases where one want to be able to bail out early.
The old implementaton had too much look-ahead to be efficient.
Disadvantages:
- ASCII performance is bad. This is unavoidable for tiny iterations.
Example is included to show how to work around this.
Description:
Iter now iterates per boundary/segment. It returns a slice of bytes that
either points to the input bytes, the internal decomposition strings,
or the small internal buffer that each iterator has. In many cases, copying
bytes is avoided.
The method Seek was added to support jumping around the input without
having to reinitialize.
Details:
- Table adjustments: some decompositions exist of multiple segments.
Decompositions that are of this type are now marked so that Iter can
handle them separately.
- The old iterator had a different next function for different normal forms
that was assigned to a function pointer called by Next.
The new iterator uses this mechanism to switch between different modes
for handling different type of input as well. This greatly improves
performance for Hangul and ASCII. It is also used for multi-segment
decompositions.
- input is now a struct of sting and []byte, instead of an interface.
This simplifies optimizing the ASCII case.
R=rsc
CC=golang-dev
https://golang.org/cl/6873072
the need to decompose characters for the majority of cases. This considerably
speeds up collation while increasing the table size minimally.
To detect non-normalized strings, rather than relying on exp/norm, the table
now includes CCC information. The inclusion of this information does not
increase table size.
DETAILS
- Raw collation elements are now a struct that includes the CCC, rather
than a slice of ints.
- Builder now ensures that NFD and NFC counterparts are included in the table.
This also fixes a bug for Korean which is responsible for most of the growth
of the table size.
- As there is no more normalization step, code should now handle both strings
and byte slices as input. Introduced source type to facilitate this.
NOTES
- This change does not handle normalization correctly entirely for contractions.
This causes a few failures with the regtest. table_test.go contains a few
uncommented tests that can be enabled once this is fixed. The easiest is to
fix this once we have the new norm.Iter.
- Removed a test cases in table_test that covers cases that are now guaranteed
to not exist.
R=rsc, mpvl
CC=golang-dev
https://golang.org/cl/6971044
Details:
- fixed variadic parameter passing and calls of the form f(g())
- fixed implementation of ^x for unsigned constants x
- fixed assignability of untyped booleans
- resolved a few TODOs, various minor fixes
- enabled many more tests (only 6 std packages don't typecheck)
R=rsc
CC=golang-dev
https://golang.org/cl/6930053
gotype can now handle much of the standard library.
- marked packages which have type checker issues
- this CL depends on CL 6846131
R=rsc
CC=golang-dev
https://golang.org/cl/6850130
Also:
- better handling of type assertions
- implemented built-in error type
- first cut at handling variadic function signatures
- several bug fixes
R=rsc, rogpeppe
CC=golang-dev
https://golang.org/cl/6846131
This CL defines the API. Implementation will come in follow-up CLs.
Update #1960.
R=bradfitz, dr.volker.dobler, rsc
CC=golang-dev
https://golang.org/cl/6849092
also:
- composite literal checking close to complete
- cleaned up parameter, method, field checking
- don't let panics escape type checker
- more TODOs eliminated
R=rsc
CC=golang-dev
https://golang.org/cl/6816083
The exp/types packages does not support the gccgo export data
format. At some point it should, but not yet.
R=gri, bradfitz, r, iant, dsymonds
CC=golang-dev
https://golang.org/cl/6854068
compare incrementally. Also modified collation API to be more high-level
by removing the need for an explicit buffer to be passed as an argument.
This considerably speeds up Compare and CompareString. This change also eliminates
the need to reinitialize the normalization buffer for each use of an iter. This
also significantly improves performance for Key and KeyString.
R=r, rsc
CC=golang-dev
https://golang.org/cl/6842050
There was an init race between
check_test.go:init
universe.go:def
use of Universe
and
universe.go:init
creation of Universe
The order in which init funcs are executed in a package is unspecified.
The test is not currently broken in the golang.org environment
because the go tool compiles the test with non-test sources before test sources,
but other environments may, say, sort the source files before compiling,
and thus trigger this race, causing a nil pointer panic.
R=gri
CC=golang-dev
https://golang.org/cl/6827076
Avoids problems with local declarations shadowing other names.
We write a more explicit form than the incoming program, so there
may be additional type annotations. For example:
int := "hello"
j := 2
would normally turn into
var int string = "hello"
var j int = 2
but the int variable shadows the int type in the second line.
This CL marks all local variables with a per-function sequence number,
so that this would instead be:
var int·1 string = "hello"
var j·2 int = 2
Fixes#4326.
R=ken2
CC=golang-dev
https://golang.org/cl/6816100
The only code change is in exp/gotype/gotype.go.
The latest reviewed version of exp/types is now
exp/types/staging.
First step toward replacing exp/types with
exp/types/staging.
R=iant
CC=golang-dev
https://golang.org/cl/6819071
- simplified assignment checking by removing duplicate code
- implemented field lookup (methods, structs, embedded fields)
- importing methods (not just parsing them)
- type-checking functions and methods
- typechecking more statements (inc/dec, select, return)
- tracing support for easier debugging
- handling nil more correctly (comparisons)
- initial support for [...]T{} arrays
- initial support for method expressions
- lots of bug fixes
All packages under pkg/go as well as pkg/exp/types typecheck
now with pkg/exp/gotype applied to them; i.e., a significant
amount of typechecking works now (several statements are not
implemented yet, but handling statements is almost trivial in
comparison with typechecking expressions).
R=rsc
CC=golang-dev
https://golang.org/cl/6768063
Tailorings are represented by removing and reinserting entries from a linked list.
After all tailorings are done, missing weights are computed and verified.
This implementation assumes that entries that are used in expansions are not
reinserted at a later point. This considerably simplifies the implementation.
R=r
CC=golang-dev
https://golang.org/cl/6739052
incremental comparisons. Instead, processing is now done directly on colElems.
As a result, the size of the weights array is now reduced by 75%.
Details:
- Primary value of type 1 colElem is shifted by 1 bit so that primaries
of all types can be compared without shifting.
- Quaternary values are now stored in the colElem itself. This is possible
as quaternary values other than 0 or maxQuaternary are only needed when other
values are ignored.
- Simplified processWeights by removing cases that are needed for ICU but not
for us (our CJK primary values fit in a single value).
R=r
CC=golang-dev
https://golang.org/cl/6817054
- Allow secondary values below the default value in second form. This is
to support before tags for secondary values, as used by Chinese.
- Eliminate collation elements that are guaranteed to be immaterial
after a weight increment.
R=r
CC=golang-dev
https://golang.org/cl/6739051
Since this patch changes the way complex literals are written
in export data, there are a few other glitches.
Fixes#4159.
R=golang-dev, rsc
CC=golang-dev, remy
https://golang.org/cl/6674047
Preparation for forthcoming CL 6624051: Will make it
easier to see if/what changes are incurred by it.
The alignment changes in this CL are due to CL 6610051
(fix to alignment heuristic) where it appears that an
old version of gofmt was run (and thus the correct
alignment updates were not done).
R=r
CC=golang-dev
https://golang.org/cl/6639044
- Changed Check signature to take function parameters for
more flexibility: Now a client can interrupt type checking
early (via panic in one the upcalls) once the desired
type information or number of errors is reached. Default
use is still simple.
- Cleaned up main typechecking loops. Now does not neglect
_ declarations anymore.
- Various other cleanups.
R=golang-dev, r, rsc
CC=golang-dev
https://golang.org/cl/6612049
This code has been reviewed before. The most significant
change is to check_test which now can handle more than
one error at the same error position (due to spurious
errors - should not happen in praxis once error handling
has been fine-tuned). This change makes check_test easier
to use during development.
R=rsc
CC=golang-dev
https://golang.org/cl/6584057
This code relies on some functions that are not yet in staging,
but it get's harder to keep all this in sync in a piece-meal
fashion.
R=rsc
CC=golang-dev
https://golang.org/cl/6492124
More pieces of the typechecker code:
- Operands are temporary objects representing an expressions's
type and value (for constants). An operand is the equivalent of
an "attribute" in attribute grammars except that it's not stored
but only passed around during type checking.
- Constant operations are implemented in const.go. Constants are
represented as bool (booleans), int64 and *big.Int (integers),
*big.Rat (floats), complex (complex numbers), and string (strings).
- Error reporting is consolidated in errors.go. Only the first
dozen of lines is new code, the rest of the file contains the
exprString and typeString functions formerly in two separate
files (which have been removed).
This is a replacement CL for 6492101 (which was created without
proper use of hg).
R=rsc, r
CC=golang-dev
https://golang.org/cl/6500114
various implementation of collation. The tool provides commands for soring,
regressing one implementation against another, and benchmarking.
Currently it includes collation implementations for the Go collator, ICU,
and one using Darwin's CoreFoundation framework.
To avoid building this tool in the default build, the colcmp tag has been
added to all files. This allows other tools/colcmp in this directory (e.g. it may make
sense to move maketables here) to be put in this directory as well.
R=r, rsc, mpvl
CC=golang-dev
https://golang.org/cl/6496118
instead of variables. Several reasons:
- Encourage users of the API to minimize the number of creations and reuse Collate objects.
- Don't rule out the possibility of using initialization code for collators. For some locales
it will be possible to have very compact representations that can be quickly expanded
into a proper table on demand.
Other changes:
- Change name of root* vars to main*, as the tables are shared between locales.
- Added Locales() method to get a list of supported locales.
R=r
CC=golang-dev
https://golang.org/cl/6498107
This fixes a problem with ELF tools thinking they know the
format of the symbol table, as we do not use any of the
standard formats for that table.
This change will probably annoy the Plan 9 users, but I
believe there are other incompatibilities already that mean
they have to use a Go-specific nm.
Fixes#3473.
R=golang-dev, iant
CC=golang-dev
https://golang.org/cl/6500117
First set of type checker files for review.
The primary concern here is the typechecker
API (types.go).
R=rsc, adonovan, r, rogpeppe
CC=golang-dev
https://golang.org/cl/6490089
Refactored build + buildTrie into build + buildOrdering.
Note that since the tailoring code is not checked in yet, all tailorings are identical
to root. The table therefore should not and does not grow at this point.
R=r
CC=golang-dev
https://golang.org/cl/6500087
for both locale-specific exemplar characters and tailorings to
the collation table.
Some specifices:
- Moved stringSet to the beginning of the file and added some functionality
to parse command line files.
- openReader now modifies the input URL for localFiles to guarantee that
any http source listed in the generated file is indeed this source.
- Note that the implementation of the Tailoring API used by maketables.go
is not yet checked in. So for now adding tailorings are simply no-ops.
- The generated file of exemplar characters will be used somewhere else.
Here is a snippet of how the body of the generated file looks like:
type exemplarType int
const (
exCharacters exemplarType = iota
exContractions
exPunctuation
exAuxiliary
exCurrency
exIndex
exN
)
var exemplarCharacters = map[string][exN]string{
"af": [exN]string{
0: "a á â b c d e é è ê ë f g h i î ï j k l m n o ô ö p q r s t u û v w x y z",
3: "á à â ä ã æ ç é è ê ë í ì î ï ó ò ô ö ú ù û ü ý",
4: "a b c d e f g h i j k l m n o p q r s t u v w x y z",
},
...
}
R=r
CC=golang-dev
https://golang.org/cl/6501070
- Elements in the array are now sorted as a linked list. This makes it easier to
apply tailorings.
- Added code to sort entries by collation elements.
- Added logical reset points. This is used for tailoring relative to certain
properties, rather than characters.
NOTE: all code for type entry should now be in order.go. To keep the diffs for
this CL reasonable, though, the existing code is left in builder.go. I'll move
this in a separate CL.
R=r
CC=golang-dev
https://golang.org/cl/6493063
Also rename Node.{Add,Remove} to Node.{AppendChild,RemoveChild} to
be consistent with the DOM.
benchmark old ns/op new ns/op delta
BenchmarkParser 4042040 3749618 -7.23%
benchmark old MB/s new MB/s speedup
BenchmarkParser 19.34 20.85 1.08x
BenchmarkParser mallocs per iteration is also:
10495 before / 7992 after
R=andybalholm, r, adg
CC=golang-dev
https://golang.org/cl/6495061
In the regtest data, surrogates are assigned primary weights based on
the surrogate code point value. Go now converts surrogates to FFFD, however,
meaning that the primary weight is based on this code point instead.
This change drops tests with surrogates and lets the tests pass.
R=r
CC=golang-dev
https://golang.org/cl/6461100
with table changes.
NOTE: there is no test for this, but 1) the code has now the same
control flow as scan in exp/locale/collate/contract.go, which is
tested and 2) Builder verifies the generated table so bugs in this
code are quickly and easily found (which is how this bug was discovered).
R=r
CC=golang-dev
https://golang.org/cl/6461082
The main table will need to get a slightly different collation table as the one
used by regtest, as the regtest is based on the standard UCA DUCET, while
the locale-specific tables are all based on a CLDR root table.
This change allows changing the table without affecting the regression test.
R=r
CC=golang-dev
https://golang.org/cl/6453089
Now that the parser passes all tests in the test suite,
it is no longer necessary to keep track of which tests
pass and which don't. So remove the testlogs directory
and the code that uses it.
R=nigeltao
CC=golang-dev
https://golang.org/cl/6453124
All of the remaining tests that had as status of PARSE rather than PASS had
good reasons for not passing the render-and-reparse step: the correct parse tree is
badly formed, so when it is rendered out as HTML, the result doesn't parse into the
same tree. So add them to the list of tests where that step is skipped.
Also, I discovered that it is possible to end up with HTML elements (not just text)
inside a raw text element through reparenting. So change the rendering routines to
handle that situation as sensibly as possible (which still isn't very sensible, but
this is HTML5).
R=nigeltao
CC=golang-dev
https://golang.org/cl/6446137
When generating replacement elements for an <isindex> tag, the old
addSyntheticElement method was producing the wrong nesting. Replace
it with parseImpliedToken.
Pass the one remaining test in the test suite.
R=nigeltao
CC=golang-dev
https://golang.org/cl/6453114
If a tag doesn't have a closing '>', it isn't considered a tag;
it is just ignored and EOF is returned instead.
Pass one additional test in the test suite.
Change tokenizer tests to match correct behavior.
R=nigeltao
CC=golang-dev
https://golang.org/cl/6454131
In HTML content, having a self-closing tag is a parse error unless
the tag would be self-closing anyway (like <img>). The only place a
self-closing tag actually makes a difference is in XML-based foreign
content.
Pass 1 additional test.
R=nigeltao
CC=golang-dev
https://golang.org/cl/6450109
Also simplified parsing of interface
types since they can only contain
methods (and no embedded interfaces)
in the export data.
R=rsc
CC=golang-dev
https://golang.org/cl/6446084
If a table contained whitespace, text nodes would not get foster parented
correctly.
Pass 1 additional test.
R=nigeltao
CC=golang-dev
https://golang.org/cl/6459054
The <title> element was getting removed from the stack of open elements,
when its parent, the <head> element should have been removed instead.
Pass 2 additional tests.
R=nigeltao
CC=golang-dev
https://golang.org/cl/6449101