Mirror of https://github.com/golang/go (synced 2024-11-21 17:54:39 -07:00)
Add description of how compiling and linking handle dependencies.
SVN=115807
parent 8cdb71017a
commit b806ba4d88
263  doc/candl.txt  Normal file
@@ -0,0 +1,263 @@
Compiling and Linking
----

Assume we have:

- one or more source files, *.go, perhaps in different directories
- a compiler, C, which takes one .go file and generates a .o file.
- a linker, L, which takes one or more .o files and generates a go.out (!) file.

There is a question around naming of the files. Let's avoid that
problem for now and state that if the input is X.go, the output of
the compiler is X.o, ignoring the package declaration in the file.
This is not current behavior and probably not correct behavior, but
it keeps the exposition simpler.

Let's also assume that the linker knows about the run time and we
don't have to specify bootstrap and runtime linkage explicitly.


Basics
----

Given a single file, main.go, with no dependencies, we do:

    C main.go    # compile
    L main.o     # link
    go.out       # run

Now let's say that main.go contains

    import "fmt"

and that fmt.go contains

    import "sys"

Then to build, we must compile in dependency order:

    C sys.go
    C fmt.go
    C main.go

and then link

    L main.o fmt.o sys.o

To the linker itself, the order of arguments is unimportant.

When we compile fmt.go, we need to know the details of the functions
(etc.) exported by sys.go and used by fmt.go. When we run

    C fmt.go

it discovers the import of sys, and must then read sys.o to discover
the details. We must therefore compile the exporting source file before we
can compile the importing source. Moreover, if there is a mismatch
between export and import, we can discover it during compilation
of the importing source.

To be explicit, then, what we say is, in effect

    C sys.go
    C fmt.go sys.o
    C main.go fmt.o sys.o
    L main.o fmt.o sys.o


The contents of .o files (I)
----

It's necessary to include in fmt.o the information for linking
against the functions etc. in sys.o. It's also possible to identify
sys.o explicitly inside fmt.o, so we need to say only

    L main.o fmt.o

with sys.o discovered automatically. Iterating again, it's easy
to reduce the link step to

    L main.o

with L discovering automatically the .o files it needs to process
to create the final go.out.
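
As an illustration only, the discovery step can be sketched as a walk
over the references recorded in each .o file. The object type, its deps
field, and the in-memory objects table below are assumptions made for
this sketch; they stand in for whatever the real .o format records and
do not describe the actual linker.

```go
package main

import "fmt"

// object is a hypothetical stand-in for a compiled .o file.
// deps lists the .o files the compiler recorded as needed for linking.
type object struct {
	deps []string
}

// objects simulates the directory of .o files from the running example.
var objects = map[string]object{
	"main.o": {deps: []string{"fmt.o"}},
	"fmt.o":  {deps: []string{"sys.o"}},
	"sys.o":  {},
}

// linkSet returns every .o file reachable from root, in discovery order.
// Each file is visited once, even if several objects refer to it.
func linkSet(root string) ([]string, error) {
	seen := map[string]bool{root: true}
	queue := []string{root}
	var order []string
	for len(queue) > 0 {
		name := queue[0]
		queue = queue[1:]
		obj, ok := objects[name]
		if !ok {
			return nil, fmt.Errorf("missing object file %s", name)
		}
		order = append(order, name)
		for _, dep := range obj.deps {
			if !seen[dep] {
				seen[dep] = true
				queue = append(queue, dep)
			}
		}
	}
	return order, nil
}

func main() {
	// The user names only main.o; the rest is discovered.
	files, err := linkSet("main.o")
	if err != nil {
		fmt.Println("link failed:", err)
		return
	}
	fmt.Println("L", files)
}
```

The seen map keeps a package that is imported from several places from
being processed twice, which matters once the import graph is no longer
a simple chain.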


Automation of dependencies (I)
----

It should be possible to automate discovery of the dependencies of
main.go and therefore the order necessary to compile. Since the
source files contain explicit import statements, it is possible,
given a source file, to discover the dependency tree automatically.
(This will require rules and/or conventions about where to find
things; for now assume everything is in the same directory.)

The program that does this might be a variant of the
compiler, since it must parse import statements at least, but for
clarity let's call it D for dependency. It can be a little like
make, but let's not call it make because that brings along properties
we don't want. In particular, it reads the sources to discover the
dependencies; it doesn't need a separate description such as a
Makefile.
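
To make "parse import statements at least" concrete, here is a minimal
sketch using the go/parser package from today's Go standard library,
which did not exist when this note was written. The helper name
importsOf and the sample source text are assumptions for illustration;
this is not how D is specified to work.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"strconv"
)

// importsOf returns the import paths declared in the given source text.
// parser.ImportsOnly stops parsing after the import declarations,
// so the rest of the file is never examined.
func importsOf(filename, src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, filename, src, parser.ImportsOnly)
	if err != nil {
		return nil, err
	}
	var paths []string
	for _, spec := range file.Imports {
		// Path.Value is the quoted literal, e.g. "\"fmt\"".
		path, err := strconv.Unquote(spec.Path.Value)
		if err != nil {
			return nil, err
		}
		paths = append(paths, path)
	}
	return paths, nil
}

func main() {
	// Hypothetical contents of main.go.
	src := "package main\n\nimport \"fmt\"\n\nfunc main() { fmt.Println(1) }\n"
	paths, err := importsOf("main.go", src)
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	fmt.Println("main.go imports:", paths)
}
```

Parsing only the import section keeps dependency discovery cheap
compared with a full compile, which is the property D relies on.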

In a directory with the source files above, including main.go, but
with no .o files, we say:

    D main.go

D reads main.go, finds the import for fmt, and in effect descends,
automatically running

    D fmt.go

which in turn invokes

    D sys.go

The file sys.go has no dependencies, so it can be compiled; D
therefore says in effect

    "compile sys.go"

and returns; then we have what we need for fmt.go since the exports
in sys.go are known (or at least the recipe to discover them is
known). So the next level says

    "compile fmt.go"

and pops up, whereupon the top D says

    "compile main.go"

The output of D could therefore be described as a script to run to
compile the source.
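
The descent just described is a post-order walk of the import graph.
The sketch below shows one way it might look; the imports table stands
in for what D would learn by reading each source file, and the cycle
check is an addition of this sketch that the text does not discuss.

```go
package main

import "fmt"

// imports is a hypothetical table of direct imports, standing in for
// what D would discover by reading each source file.
var imports = map[string][]string{
	"main": {"fmt"},
	"fmt":  {"sys"},
	"sys":  nil,
}

// script appends the compile step for pkg after the steps for everything
// pkg imports. done records packages already scheduled, so a shared
// dependency is compiled once. active records the current descent path,
// so an import cycle is reported instead of recursing forever.
func script(pkg string, done, active map[string]bool, out []string) ([]string, error) {
	if done[pkg] {
		return out, nil
	}
	if active[pkg] {
		return nil, fmt.Errorf("import cycle through %s", pkg)
	}
	active[pkg] = true
	for _, dep := range imports[pkg] {
		var err error
		if out, err = script(dep, done, active, out); err != nil {
			return nil, err
		}
	}
	active[pkg] = false
	done[pkg] = true
	return append(out, "C "+pkg+".go"), nil
}

func main() {
	steps, err := script("main", map[string]bool{}, map[string]bool{}, nil)
	if err != nil {
		fmt.Println("D failed:", err)
		return
	}
	for _, step := range steps {
		fmt.Println(step)
	}
}
```

Because each compile step is emitted only after the recursive calls
return, a package with no imports always comes first in the script.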

We could imagine that instead, D actually runs the compiler.
(Conversely, we could imagine that C uses D to make sure the
dependencies are built, but that has the danger of causing unnecessary
dependency checking and compilation; more on that later.)

To build, therefore, all we need to say is:

    D -c main.go    # -c means 'run the compiler'
    L main.o

Obviously, D at this stage could just run L. Therefore, we can
simplify further by having it do so, whereupon

    D -c main.go

can automate the complete compilation and linking process.

Automation of dependencies (II)
----

Let's say we now edit main.go without changing its imports. To
recompile, we have two options. First, we could be explicit:

    C main.go

Or we could use D to automate running the compiler, as described
in the previous section:

    D -c main.go

The D command will discover the import of fmt, but can see that fmt.o
already exists. Assuming its existence implies its currency, it need
go no further; it can invoke C to compile main.go and link as usual.
Whether it should make this assumption might be controlled by a flag.
For the purpose of discussion, let's say it makes the assumption if
the -c flag is set.

There are two implications to this scheme. First, running D when D
is going to turn around and run C anyway implies we could just run
C directly and save one command invocation. (We could decide
independently whether C should automatically invoke the linker.)

The other implication is more interesting. If we stop traversing
the dependency hierarchy as soon as we discover a .o file, then we
may not realize that fmt.o is out of date, and so link against a stale
binary. To fix this problem, we need to stat() or checksum the .o
and .go files to see if they need recompilation. Doing this every
time is expensive and gets us back into the make-like approach.

The great majority of compilations do not require this full check,
however; this is especially true in the compile-debug-edit
cycle. We therefore propose splitting the model into two scenarios.

Scenario 1: General

In this scenario, we ask D to update the full dependency tree by
stat()-ing or checksumming files to check currency. The generated
go.out will always be up to date but incremental compilation will
be slower. Typically, this will be necessary only after a major
operation like syncing or checking out code, or if there are known
changes being made to the dependencies.

Scenario 2: Fast

In this scenario, we explicitly tell D -c what has changed and have
it compile only what is required. Typically, this will mean compiling
only the single active file or maybe a few files. If an IDE is
present or there is some watcher tool, it's easy to avoid the common
mistake of forgetting to compile a changed file.

If an edit has caused skew between export and import, this will be
caught by the compiler, so the result should be type-safe at least. If D is
running the compilation, it might be possible to arrange that C tells
it there is a dependency problem and have D then try to resolve it
by reevaluation.


The contents of .o files (II)
----

For scenario 2, we can make things even faster if the .o files
identify not just the files that must be imported to satisfy the
imports, but details about the imports themselves. Let's say main.go
uses only one function from fmt.go, called F. If the compiled main.o
says, in effect

    from package fmt get F

then the linker will not need to read all of fmt.o to link main.o;
instead it can extract only the necessary function.

Even better, if fmt is a package made of many files, it may be
possible to store in main.o specific information about the exact
files needed:

    from file fmtF.o get F

The linker then need not even open the other .o files that
form package fmt.

The compiler should therefore be explicit and detailed within the .o
files it generates about what elements of a package are needed by
the program being compiled.
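
A rough sketch of how a linker might use such records follows. The need
type and its fields are hypothetical, as are the file names fmtG.o and
fmtH.o; only fmtF.o and F come from the text. The point is simply that
the set of files to open is derived from the records, not from the
package's full file list.

```go
package main

import (
	"fmt"
	"sort"
)

// need is a hypothetical record the compiler writes into a .o file,
// corresponding to "from file File get Symbol".
type need struct {
	File   string
	Symbol string
}

// filesToOpen returns the distinct .o files named by the records, sorted
// so the result does not depend on record order.
func filesToOpen(needs []need) []string {
	seen := map[string]bool{}
	var files []string
	for _, n := range needs {
		if !seen[n.File] {
			seen[n.File] = true
			files = append(files, n.File)
		}
	}
	sort.Strings(files)
	return files
}

func main() {
	// Package fmt is imagined to be split across three object files.
	pkgFmt := []string{"fmtF.o", "fmtG.o", "fmtH.o"}
	// main.o records that it uses only F, which lives in fmtF.o.
	mainNeeds := []need{{File: "fmtF.o", Symbol: "F"}}

	open := filesToOpen(mainNeeds)
	fmt.Printf("linker opens %d of %d files in package fmt: %v\n",
		len(open), len(pkgFmt), open)
}
```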

Earlier, we said that when we run

    C fmt.go

it discovers the import of sys, and must then read sys.o to discover
the details. Note that if we record the information as specified here,
when we then do

    C main.go

and it reads fmt.o, it does not in turn need to read sys.o; the necessary
information has already been pulled up into fmt.o by D.

Thus, once the dependency information is properly constructed, to
compile a program X.go we must read X.go plus N .o files, where N
is the number of packages explicitly imported by X.go. The transitive
closure need not be evaluated to compile a file, only the explicit
imports. By this result, we hope to dramatically reduce the amount
of I/O necessary to compile a Go source file.

To put this another way, if a package P imports packages Xi, the
existence of Xi.o files is all that is needed to compile P because the
Xi.o files contain the export information. This is what breaks the
transitive dependency closure.
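
The difference between "explicit imports" and "transitive closure" can
be shown with a small sketch over the running example. The imports table
and both helper functions are assumptions for illustration; they only
count files and do not model what export information actually contains.

```go
package main

import "fmt"

// imports is a hypothetical table of direct imports for the example.
var imports = map[string][]string{
	"main": {"fmt"},
	"fmt":  {"sys"},
	"sys":  nil,
}

// filesRead lists what C must read to compile pkg under the scheme in
// the text: the source file plus one .o file per explicit import.
func filesRead(pkg string) []string {
	files := []string{pkg + ".go"}
	for _, dep := range imports[pkg] {
		files = append(files, dep+".o")
	}
	return files
}

// closureSize counts every package reachable from pkg, which is how many
// .o files would be read if the transitive closure had to be evaluated.
func closureSize(pkg string, seen map[string]bool) int {
	n := 0
	for _, dep := range imports[pkg] {
		if !seen[dep] {
			seen[dep] = true
			n += 1 + closureSize(dep, seen)
		}
	}
	return n
}

func main() {
	direct := filesRead("main")
	fmt.Println("files read for main.go:", direct)
	fmt.Println(".o files via explicit imports:", len(direct)-1)
	fmt.Println(".o files via transitive closure:", closureSize("main", map[string]bool{}))
}
```

In this three-package chain the saving is a single file, but the gap
grows with the depth and fan-out of the import graph.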

@@ -123,7 +123,7 @@ func Test() {
 	for i := 0; i < v.Len(); i++ {
 		var x *I;
 		x = v.At(i);
-		print i, " ", x.val, "\n";  // BUG: can't use I(v.At(i))
+		print i, " ", x.val, "\n";
 	}
 }