Command set:
- what: an extremely fast query that parses a single
file and returns the AST stack, package name and the
set of query modes that apply to the current selection.
Intended for GUI tools that need to grey out UI elements.
- definition: shows the definition of an identifier.
- pointsto: the PTA features of 'describe' have been split
out into their own command.
- describe: with PTA stripped out, the cost is now bounded by
type checking.
Performance:
- The importer.Config.TypeCheckFuncBodies predicate supports
setting the 'IgnoreFuncBodies' typechecker flag on a
per-package basis. This means we can load dependencies from
source more quickly if we only need exported types.
(We avoid gcimport data because it may be absent or stale.)
This also means we can run type-based queries on packages
that aren't part of the pointer analysis scope. (Yay.)
- Modes that require only type analysis of the query package
run a "what" query first, and restrict their analysis scope
to just that package and its dependencies (sans func
bodies), making them much faster.
- We call newOracle not oracle.New in Query, so that the
'needs' bitset isn't ignored (oops!). This makes the
non-PTA queries faster.
Also:
- removed vestigial timers junk.
- pos.go: existing position utilties split out into own file.
Added parsePosFlag utility.
- numerous cosmetic tweaks.
+ very basic tests.
To do in follow-ups:
- sophisticated editor integration of "what".
- better tests.
- refactoring of control flow as described in comment.
- changes to "implements", "describe" commands.
- update design doc + user manual.
R=crawshaw, dominik.honnef
CC=golang-dev, gri
https://golang.org/cl/40630043
This avoids leaking nodeids into client code before the
analysis has had a chance to run the (forthcoming) constraint
optimizer, which renumbers them.
R=crawshaw
CC=golang-dev
https://golang.org/cl/39410043
(Elminate premature abstraction.)
The test probes used Pointer!=nil for the "is pointerlike"
predicate. Now that Pointer is a struct, they check the type
of the expression, which is more accurate. Two probes on
non-pointerlike values have beem removed.
R=crawshaw
CC=golang-dev
https://golang.org/cl/38420043
- fixed a couple of TODOs
- various cleanups along the way
- adjusted clients
Once submitted, clients of go/types that don't explicitly
specify Config.Import will need to add the extra import:
import _ "code.google.com/p/go.tools/go/gcimporter"
to install the default (gc) importer in go/types.
R=adonovan, gri
CC=golang-dev
https://golang.org/cl/26390043
The implementation follows the basic pattern of an indirect
function call (genDynamicCall).
We use the same trick as SetFinalizer so that direct calls to
(r.V).Call, which are overwhelmingly the norm, are inlined.
Bug fix (and simplification): calling untag() to unbox a
reflect.Value is wrong for reflect.Values containing interfaces
(rare). Now, we call untag for concrete types and typeFilter
for interface types, and we can use this pattern in all cases.
It corresponds to the ssa.TypeAssert operator, so we call
it typeAssert. Added tests to cover this.
We also specialize reflect.{In,Out} when the operand is an int
literal.
+ Tests.
Also:
- make taggedValue() panic, not return nil, eliminating many checks.
We call isTaggedValue for the one place that cares.
- pointer_test: recover from panics in Analyze() and dump the log.
R=crawshaw
CC=golang-dev
https://golang.org/cl/14426050
reflect.Values may point to tagged objects with
interface type, e.g. x := reflect.ValueOf(new(interface{})).Elem().
We failed to consider this when implementing Elem.
Also, (reflect.Value).Interface() must do one "unboxing"
when it encounters such tagged objects.
i.e., x.Elem().Interface() and x.Interface() are equivalent
in that case.
Also:
- add example of tagged object with interface type.
- untabify (Label).String docstring.
- added tests.
R=crawshaw
CC=golang-dev
https://golang.org/cl/18020044
This allows us to run/analyze multiple tests.
Also it causes the production code packages to be properly initialized.
Also:
- cmd/ssadump: improved usage message (add example;
incorporate LoadInitialPackages usage; explain how -run
finds main).
- pointer, oracle, ssa/interp: use CreateTestMainPackage.
- ssa/builder.go: remove 'rundefers' instruction from package init,
which no longer uses 'defer'.
R=gri
CC=golang-dev
https://golang.org/cl/15920047
Motivation:
Previously, we assumed that the set of types for which a
complete method set (containing all synthesized wrapper
functions) is required at runtime was the set of types
used as operands to some *ssa.MakeInterface instruction.
In fact, this is an underapproximation because types can
be derived from other ones via reflection, and some of
these may need methods. The reflect.Type API allows *T to
be derived from T, and these may have different method
sets. Reflection also allows almost any subcomponent of a
type to be accessed (with one exception: given T, defined
'type T struct{S}', you can reach S but not struct{S}).
As a result, the pointer analysis was unable to generate
all necessary constraints before running the solver,
causing a crash when reflection derives types whose
methods are unavailable. (A similar problem would afflict
an ahead-of-time compiler based on ssa. The ssa/interp
interpreter was immune only because it does not require
all wrapper methods to be created before execution
begins.)
Description:
This change causes the SSA builder to record, for each
package, the set of all types with non-empty method sets that
are referenced within that package. This set is accessed via
Packages.TypesWithMethodSets(). Program.TypesWithMethodSets()
returns its union across all packages.
The set of references that matter are:
- types of operands to some MakeInterface instruction (as before)
- types of all exported package members
- all subcomponents of the above, recursively.
This is a conservative approximation to the set of types
whose methods may be called dynamically.
We define the owning package of a type as follows:
- the owner of a named type is the package in which it is defined;
- the owner of a pointer-to-named type is the owner of that named type;
- the owner of all other types is nil.
A package must include the method sets for all types that it
owns, and all subcomponents of that type that are not owned by
another package, recursively. Types with an owner appear in
exactly one package; types with no owner (such as struct{T})
may appear within multiple packages.
(A typical Go compiler would emit multiple copies of these
methods as weak symbols; a typical linker would eliminate
duplicates.)
Also:
- go/types/typemap: implement hash function for *Tuple.
- pointer: generate nodes/constraints for all of
ssa.Program.TypesWithMethodSets().
Add rtti.go regression test.
- Add API test of Package.TypesWithMethodSets().
- Set Function.Pkg to nil (again) for wrapper functions,
since these may be shared by many packages.
- Remove a redundant logging statement.
- Document that ssa CREATE phase is in fact sequential.
Fixesgolang/go#6605
R=gri
CC=golang-dev
https://golang.org/cl/14920056
Support for:
(*reflect.rtype).Field
(*reflect.rtype).FieldByName
reflect.MakeSlice
runtime.SetFinalizer
Details:
- analysis locates ssa.Functions for (reflect.Value).Call
and runtime.SetFinalizer during startup to that it can
special-case them during genCall. ('Call' is forthcoming.)
- The callsite.targets mechanism is only used for dynamic
calls now. For static calls we call callEdge during constraint
generation; this is a minor optimisation.
- Static calls to SetFinalizer are inlined so that the call
appears to go direct to the finalizer. (We'll use the same
trick for (reflect.Value).Call.)
- runtime.FuncForPC: treat as a no-op.
- Fixed pointer_test to properly deal with expectations
that are multi-sets.
- Inlined rtypeMethodByNameConstraint.addMethod.
- More tests.
R=crawshaw
CC=golang-dev
https://golang.org/cl/14682045
Various reflect operations permit assignability conversions,
i.e. their internals behave unlike y=x.(T) which unpacks only
those interface values in x that are identical to T.
We split typeAssertConstraint y=x.(T) into two constraints:
1) typeFilter, for when T is an interface type and no
representation change occurs.
2) unpack, for when T is a concrete type and the payload of the
tagged object is extracted. This constraint has an 'exact'
parameter indicating whether to use the predicate
IsIdentical (for type assertions) or
IsAssignable (for reflect operators).
+ Tests.
R=crawshaw
CC=golang-dev
https://golang.org/cl/14547043
(reflect.Value).Bytes
(reflect.Value).Elem
(reflect.Value).Index
(reflect.Value).SetBytes
(reflect.Value).Slice
reflect.PtrTo
reflect.SliceOf
+ Tests.
Also: comment out an 'info-'level print statement in the test; it was distracting.
R=crawshaw
CC=golang-dev
https://golang.org/cl/14454055
The new method is functionally identical to typeCheck, and
obviates the LoadMainPackage method.
Updated all clients.
Fixes bug 6561.
R=gri
CC=golang-dev
https://golang.org/cl/14494051
This information can be used to specialize such calls, e.g.
- report location of unsound calls (done for reflect.NewAt)
- exploit argument information (done for constant 'dir' parameter to reflect.ChanOf)
+ tests.
R=crawshaw
CC=golang-dev
https://golang.org/cl/14517046
Since the Go runtime treats it specially, so must the pointer analysis.
Details:
- Combine object.{val,typ} fields into 'data interface{}'.
It may now hold a string, describing an instrinsically
allocated object such as the command-line args.
- extend Label accordingly; add Label.ReflectType() accessor.
Also: document pointer analysis algorithm classification.
R=crawshaw
CC=golang-dev
https://golang.org/cl/14156043
Details:
- Warnings are reported as values in Result, not a callback in Config.
- remove TODO to eliminate Print callback. It's better than the alternative.
- remove unused Config.root field.
- hang Result off analysis object (impl. detail)
- reword TODO.
R=crawshaw
CC=golang-dev
https://golang.org/cl/14128043
Motivation: simple constraints---copy and addr---are more
amenable to pre-solver optimizations (forthcoming) than
complex constraints: load, store, and all others.
In code such as the following:
t0 = new struct { x, y int }
t1 = &t0.y
t2 = *t1
there's no need for the full generality of a (complex)
load constraint for t2=*t1 since t1 can only point to t0.y.
All we need is a (simple) copy constraint t2 = (t0.y)
where (t0.y) is the object node label for that field.
For all "addressable" SSA instructions, we tabulate
whether their points-to set is necessarily a singleton. For
some (e.g. Alloc, MakeSlice, etc) this is always true by
design. For others (e.g. FieldAddr) it depends on their
operands.
We exploit this information when generating constraints:
all load-form and store-form constraints are reduced to copy
constraints if the pointer's PTS is a singleton.
Similarly all FieldAddr (y=&x.f) and IndexAddr (y=&x[0])
constraints are reduced to offset addition, for singleton
operands.
Here's the constraint mix when running on the oracle itself.
The total number of constraints is unchanged but the fraction
that are complex has gone down to 21% from 53%.
before after
--simple--
addr 20682 46949
copy 61454 91211
--complex--
offsetAddr 41621 15325
load 18769 12925
store 30758 6908
invoke 758 760
typeAssert 1688 1689
total 175832 175869
Also:
- Add Pointer.Context() for local variables,
since we now plumb cgnodes throughout. Nice.
- Refactor all load-form (load, receive, lookup) and
store-form (Store, send, MapUpdate) constraints to use
genLoad and genStore.
- Log counts of constraints by type.
- valNodes split into localval and globalval maps;
localval is purged after each function.
- analogous maps localobj[v] and globalobj[v] hold sole label
for pts(v), if singleton.
- fnObj map subsumed by globalobj.
- make{Function/Global/Constant} inlined into objectValue.
Much cleaner.
R=crawshaw
CC=golang-dev
https://golang.org/cl/13979043
The previous CL made the assumption that Root is the first
node, which is false for programs that import "reflect".
Reverted to the previous way: an explicit root field.
Added regression test (callgraph2) via oracle.
R=crawshaw
TBR=crawshaw
CC=golang-dev
https://golang.org/cl/13967043
Also: pointer.Analyze now returns a pointer.Result object,
containing the callgraph and the results of ssa.Value queries.
The oracle has been updated to use the new call and pointer APIs.
R=crawshaw, gri
CC=golang-dev
https://golang.org/cl/13915043
(reflect.Value).Send
(reflect.Value).TrySend
(reflect.Value).Recv
(reflect.Value).TryRecv
(reflect.Type).ChanOf
(reflect.Type).In
(reflect.Type).Out
reflect.Indirect
reflect.MakeChan
Also:
- specialize genInvoke when the receiver is a reflect.Type under the
assumption that there's only one possible concrete type. This
makes all reflect.Type operations context-sensitive since the calls
are no longer dynamic.
- Rename all variables to match the actual parameter names used in
the reflect API.
- Add pointer.Config.Reflection flag
(exposed in oracle as --reflect, default false) to enable reflection.
It currently adds about 20% running time. I'll make it true after
the presolver is implemented.
- Simplified worklist datatype and solver main loop slightly
(~10% speed improvement).
- Use addLabel() utility to add a label to a PTS.
(Working on my 3 yr old 2x2GHz+4GB Mac vs 8x4GHz+24GB workstation,
one really notices the cost of pointer analysis.
Note to self: time to implement presolver.)
R=crawshaw
CC=golang-dev
https://golang.org/cl/13242062
Core:
reflect.TypeOf
reflect.ValueOf
reflect.Zero
reflect.Value.Interface
Maps:
(reflect.Value).MapIndex
(reflect.Value).MapKeys
(reflect.Value).SetMapIndex
(*reflect.rtype).Elem
(*reflect.rtype).Key
+ tests:
pointer/testdata/mapreflect.go.
oracle/testdata/src/main/reflection.go.
Interface objects (T, V...) have been renamed "tagged objects".
Abstraction: we model reflect.Value similar to
interface{}---as a pointer that points only to tagged
objects---but a reflect.Value may also point to an "indirect
tagged object", one in which the payload V is of type *T not T.
These are required because reflect.Values can hold lvalues,
e.g. when derived via Field() or Elem(), though we won't use
them till we get to structs and pointers.
Solving: each reflection intrinsic defines a new constraint
and resolution rule. Because of the nature of reflection,
generalizing across types, the resolution rules dynamically
create additional complex constraints during solving, where
previously only simple (copy) constraints were created.
This requires some solver changes:
The work done before the main solver loop (to attach new
constraints to the graph) is now done before each iteration,
in processNewConstraints.
Its loop over constraints is broken into two passes:
the first handles base (addr-of) constraints,
the second handles simple and complex constraints.
constraint.init() has been inlined. The only behaviour that
varies across constraints is ptr()
Sadly this will pessimize presolver optimisations, when we get
there; such is the price of reflection.
Objects: reflection intrinsics create objects (i.e. cause
memory allocations) with no SSA operation. We will represent
them as the cgnode of the instrinsic (e.g. reflect.New), so we
extend Labels and node.data to represent objects as a product
(not sum) of ssa.Value and cgnode and pull this out into its
own type, struct object. This simplifies a number of
invariants and saves space. The ntObject flag is now
represented by obj!=nil; the other flags are moved into
object.
cgnodes are now always recorded in objects/Labels for which it
is appropriate (all but those for globals, constants and the
shared contours for functions).
Also:
- Prepopulate the flattenMemo cache to consider reflect.Value
a fake pointer, not a struct.
- Improve accessors and documentation on type Label.
- @conctypes assertions renamed @types (since dyn. types needn't be concrete).
- add oracle 'describe' test on an interface (missing, an oversight).
R=crawshaw
CC=golang-dev
https://golang.org/cl/13418048
This is a short-term usability measure.
Longer term, we need to audit each conversion to decide
whether it should be ignored or modelled by an analytic
summary.
R=crawshaw
CC=golang-dev
https://golang.org/cl/13263050
All have been audited to ensure that they have NoEffect on
aliasing. Also: clarify the requirements for NoEffect to
explicitly disclaim trivial loads/stores.
R=crawshaw
CC=golang-dev
https://golang.org/cl/13314045
Background: some ssa.Values represent lvalues, e.g.
var g = new(string)
the *ssa.Global g is a **string, the address of what users
think of as the global g.
Querying pts(g) returns a singleton containing the object g, a
*string. What users really want to see is what that in turn
points to, i.e. the label for the call to new().
This change now lets users make "indirect" pointer queries,
i.e. for pts(*v) where v is an ssa.Value. The oracle makes an
indirect query if the type of the ssa.Value differs from the
source expression type by a pointer, i.e. it's an lvalue.
In other words, we're hiding the fact that compilers (e.g. ssa) internally represent globals by their address.
+ Tests.
This serendipitously fixed an outstanding bug mentioned in the
describe.go
R=crawshaw
CC=golang-dev
https://golang.org/cl/13532043
Motivation: pointer analysis tools (like the oracle) want the
user to specify a set of initial packages, like 'go test'.
This change enables the user to specify a set of packages on
the command line using importer.LoadInitialPackages(args).
Each argument is interpreted as either:
- a comma-separated list of *.go source files together
comprising one non-importable ad-hoc package.
e.g. "src/pkg/net/http/triv.go" gives us [main].
- an import path, denoting both the imported package
and its non-importable external test package, if any.
e.g. "fmt" gives us [fmt, fmt_test].
Current type-checker limitations mean that only the first
import path may contribute tests: multiple packages augmented
by *_test.go files could create import cycles, which 'go test'
avoids by building a separate executable for each one.
That approach is less attractive for static analysis.
Details: (many files touched, but importer.go is the crux)
importer:
- PackageInfo.Importable boolean indicates whether
package is importable.
- un-expose Importer.Packages; expose AllPackages() instead.
- CreatePackageFromArgs has become LoadInitialPackages.
- imports() moved to util.go, renamed importsOf().
- InitialPackagesUsage usage message exported to clients.
- the package name for ad-hoc packages now comes from the
'package' decl, not "main".
ssa.Program:
- added CreatePackages() method
- PackagesByPath un-exposed, renamed 'imported'.
- expose AllPackages and ImportedPackage accessors.
oracle:
- describe: explain and workaround a go/types bug.
Misc:
- Removed various unnecessary error.Error() calls in Printf args.
R=crawshaw
CC=golang-dev
https://golang.org/cl/13579043
1. ParseFiles (in util.go) parses each file in its own goroutine.
2. (*Importer).LoadPackage asynchronously prefetches the
import graph by scanning the imports of each loaded package
and calling LoadPackage on each one.
LoadPackage is now thread-safe and idempotent: it uses a
condition variable per package; the first goroutine to
request a package becomes responsible for loading it and
broadcasts to the others (waiting) when it becomes ready.
ssadump runs 34% faster when loading the oracle.
Also, refactorings:
- delete SourceLoader mechanism; just expose go/build.Context directly.
- CreateSourcePackage now also returns an error directly,
rather than via PackageInfo.Err, since every client wants that.
R=crawshaw
CC=golang-dev
https://golang.org/cl/13509045
See json.go for interface specification.
Example usage:
% oracle -format=json -mode=callgraph code.google.com/p/go.tools/cmd/oracle
+ Tests, based on (small) golden files.
Overview:
Each <query>Result structure has been "lowered" so that all
but the most trivial logic in each display() function has
been moved to the main query.
Each one now has a toJSON method that populates a json.Result
struct. Though the <query>Result structs are similar to the
correponding JSON protocol, they're not close enough to be
used directly; for example, the former contain richer
semantic entities (token.Pos, ast.Expr, ssa.Value,
pointer.Pointer, etc) whereas JSON contains only their
printed forms using Go basic types.
The choices of what levels of abstractions the two sets of
structs should have is somewhat arbitrary. We may want
richer information in the JSON output in future.
Details:
- oracle.Main has been split into oracle.Query() and the
printing of the oracle.Result.
- the display() method no longer needs an *oracle param, only
a print function.
- callees: sort the result for determinism.
- callees: compute the union across all contexts.
- callers: sort the results for determinism.
- describe(package): fixed a bug in the predicate for method
accessibility: an unexported method defined in pkg A may
belong to a type defined in package B (via
embedding/promotion) and may thus be accessible to A. New
accessibleMethods() utility fixes this.
- describe(type): filter methods by accessibility.
- added tests of 'callgraph'.
- pointer: eliminated the 'caller CallGraphNode' parameter from
pointer.Context.Call callback since it was redundant w.r.t
site.Caller().
- added warning if CGO_ENABLED is unset.
R=crawshaw
CC=golang-dev
https://golang.org/cl/13270045
+ Tests.
+ Emacs integration.
+ Emacs integration test.
+ very rudimentary Vim integration. Needs some love from a Vim user.
TODO (in follow-ups):
- More tests would be good.
We'll need to make the output order deterministic in more places.
- Documentation.
R=gri, crawshaw, dominik.honnef
CC=golang-dev
https://golang.org/cl/9502043
Previously, if the result was not wanted, the received
(value, ok) tuple had no type for 'value'.
Now it is always set to the channel's element type.
Also: set the position on such receive instructions to that of
the = or := token, and document it.
+ (indirect) test via pointer analysis.
R=crawshaw, gri
CC=golang-dev
https://golang.org/cl/12956052