mirror of
https://github.com/golang/go
synced 2024-11-25 11:17:56 -07:00
264 lines
8.3 KiB
Plaintext
Compiling and Linking
----

Assume we have:

- one or more source files, *.go, perhaps in different directories
- a compiler, C, which takes one .go file and generates a .o file
- a linker, L, which takes one or more .o files and generates a go.out (!) file

There is a question around naming of the files. Let's avoid that
problem for now and state that if the input is X.go, the output of
the compiler is X.o, ignoring the package declaration in the file.
This is not current behavior and probably not correct behavior, but
it keeps the exposition simpler.

Let's also assume that the linker knows about the runtime and we
don't have to specify bootstrap and runtime linkage explicitly.


Basics
----

Given a single file, main.go, with no dependencies, we do:

    C main.go # compile
    L main.o # link
    go.out # run

Now let's say that main.go contains

    import "fmt"

and that fmt.go contains

    import "sys"

Then to build, we must compile in dependency order:

    C sys.go
    C fmt.go
    C main.go

and then link

    L main.o fmt.o sys.o

To the linker itself, the order of arguments is unimportant.

When we compile fmt.go, we need to know the details of the functions
(etc.) exported by sys.go and used by fmt.go. When we run

    C fmt.go

it discovers the import of sys, and must then read sys.o to discover
the details. We must therefore compile the exporting source file before we
can compile the importing source. Moreover, if there is a mismatch
between export and import, we can discover it during compilation
of the importing source.

To be explicit, then, what we say is, in effect

    C sys.go
    C fmt.go sys.o
    C main.go fmt.o sys.o
    L main.o fmt.o sys.o
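
As a rough illustration of the ordering rule above, here is a minimal
Go sketch that derives those four command lines from an import graph.
The imports table and the helper names (deps, buildOrder, script) are
assumptions made for this example; they are not part of C or L.

```go
package main

import (
	"fmt"
	"strings"
)

// imports is a hypothetical import graph: each source file's base name
// maps to the names it imports directly.
var imports = map[string][]string{
	"main": {"fmt"},
	"fmt":  {"sys"},
	"sys":  nil,
}

// deps lists everything name depends on, directly or indirectly,
// nearest dependencies first, without duplicates.
func deps(name string) []string {
	var out []string
	seen := map[string]bool{name: true}
	var walk func(n string)
	walk = func(n string) {
		for _, d := range imports[n] {
			if !seen[d] {
				seen[d] = true
				out = append(out, d)
				walk(d)
			}
		}
	}
	walk(name)
	return out
}

// buildOrder returns root and its dependencies with every file listed
// after the files it imports (a depth-first post-order).
func buildOrder(root string) []string {
	var order []string
	seen := map[string]bool{}
	var visit func(n string)
	visit = func(n string) {
		if seen[n] {
			return
		}
		seen[n] = true
		for _, d := range imports[n] {
			visit(d)
		}
		order = append(order, n)
	}
	visit(root)
	return order
}

// script renders the explicit C and L command lines for root.
func script(root string) string {
	var b strings.Builder
	for _, n := range buildOrder(root) {
		b.WriteString("C " + n + ".go")
		for _, d := range deps(n) {
			b.WriteString(" " + d + ".o")
		}
		b.WriteString("\n")
	}
	b.WriteString("L " + root + ".o")
	for _, d := range deps(root) {
		b.WriteString(" " + d + ".o")
	}
	b.WriteString("\n")
	return b.String()
}

func main() {
	fmt.Print(script("main"))
}
```

For the three-file example, the sketch prints the same four lines as
the explicit listing above. It does not detect import cycles; it only
avoids looping on them.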


The contents of .o files (I)
----

It's necessary to include in fmt.o the information for linking
against the functions etc. in sys.o. It's also possible to identify
sys.o explicitly inside fmt.o, so we need to say only

    L main.o fmt.o

with sys.o discovered automatically. Iterating again, it's easy
to reduce the link step to

    L main.o

with L discovering automatically the .o files it needs to process
to create the final go.out.
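
The discovery step might look something like the following Go sketch.
It assumes, purely for illustration, that each .o file carries a list
of the other .o files it needs; the object type and the linkSet
function are hypothetical names, not a description of a real object
format.

```go
package main

import "fmt"

// object is a hypothetical stand-in for a .o file: besides its code, it
// records the names of the .o files it must be linked against.
type object struct {
	name  string
	needs []string
}

// linkSet returns the .o files the linker must process when given only
// root, following the references recorded inside each object.
func linkSet(root string, objs map[string]object) ([]string, error) {
	var out []string
	seen := map[string]bool{}
	queue := []string{root}
	for len(queue) > 0 {
		name := queue[0]
		queue = queue[1:]
		if seen[name] {
			continue
		}
		seen[name] = true
		o, ok := objs[name]
		if !ok {
			return nil, fmt.Errorf("missing object file %s", name)
		}
		out = append(out, name)
		queue = append(queue, o.needs...)
	}
	return out, nil
}

func main() {
	objs := map[string]object{
		"main.o": {name: "main.o", needs: []string{"fmt.o"}},
		"fmt.o":  {name: "fmt.o", needs: []string{"sys.o"}},
		"sys.o":  {name: "sys.o"},
	}
	files, err := linkSet("main.o", objs)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(files)
}
```

Given only main.o, the sketch arrives at main.o, fmt.o and sys.o, which
is the argument list we previously had to type by hand.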


Automation of dependencies (I)
----

It should be possible to automate discovery of the dependencies of
main.go and therefore the order necessary to compile. Since the
source files contain explicit import statements, it is possible,
given a source file, to discover the dependency tree automatically.
(This will require rules and/or conventions about where to find
things; for now assume everything is in the same directory.)
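
To show that the imports can be read without compiling the whole file,
here is a small Go sketch built on the go/parser package from today's
standard library, which postdates this note. The importsOf helper is a
name invented for the example; the ParseFile call and its ImportsOnly
mode are the standard library's own.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"strconv"
)

// importsOf returns the import paths named in a Go source file. Passing
// parser.ImportsOnly stops parsing after the import declarations.
func importsOf(filename, src string) ([]string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, filename, src, parser.ImportsOnly)
	if err != nil {
		return nil, err
	}
	var paths []string
	for _, spec := range f.Imports {
		// spec.Path.Value is the quoted literal, e.g. "\"fmt\"".
		p, err := strconv.Unquote(spec.Path.Value)
		if err != nil {
			return nil, err
		}
		paths = append(paths, p)
	}
	return paths, nil
}

func main() {
	src := "package main\n\nimport \"fmt\"\n\nfunc main() { fmt.Println(\"hi\") }\n"
	paths, err := importsOf("main.go", src)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(paths)
}
```

Applying importsOf to each file in turn yields the edges of the
dependency tree; mapping an import path back to a file is the part that
needs the rules and conventions mentioned above.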

The program that does this might be a variant of the
compiler, since it must parse import statements at least, but for
clarity let's call it D for dependency. It can be a little like
make, but let's not call it make because that brings along properties
we don't want. In particular, it reads the sources to discover the
dependencies; it doesn't need a separate description such as a
Makefile.

In a directory with the source files above, including main.go, but
with no .o files, we say:

    D main.go

D reads main.go, finds the import for fmt, and in effect descends,
automatically running

    D fmt.go

which in turn invokes

    D sys.go

The file sys.go has no dependencies, so it can be compiled; D
therefore says in effect

    "compile sys.go"

and returns; then we have what we need for fmt.go since the exports
in sys.go are known (or at least the recipe to discover them is
known). So the next level says

    "compile fmt.go"

and pops up, whereupon the top D says

    "compile main.go"

The output of D could therefore be described as a script to run to
compile the source.

We could imagine that instead, D actually runs the compiler.
(Conversely, we could imagine that C uses D to make sure the
dependencies are built, but that has the danger of causing unnecessary
dependency checking and compilation; more on that later.)

To build, therefore, all we need to say is:

    D -c main.go # -c means 'run the compiler'
    L main.o

Obviously, D at this stage could just run L. Therefore, we can
simplify further by having it do so, whereupon

    D -c main.go

can automate the complete compilation and linking process.


Automation of dependencies (II)
----

Let's say we now edit main.go without changing its imports. To
recompile, we have two options. First, we could be explicit:

    C main.go

Or we could use D to automate running the compiler, as described
in the previous section:

    D -c main.go

The D command will discover the import of fmt, but can see that fmt.o
already exists. Assuming its existence implies its currency, it need
go no further; it can invoke C to compile main.go and link as usual.
Whether it should make this assumption might be controlled by a flag.
For the purpose of discussion, let's say it makes the assumption if
the -c flag is set.

There are two implications to this scheme. First, running D when D
is going to turn around and run C anyway implies we could just run
C directly and save one command invocation. (We could decide
independently whether C should automatically invoke the linker.)

The other implication is more interesting. If we stop traversing
the dependency hierarchy as soon as we discover a .o file, then we
may not realize that fmt.o is out of date and link against a stale
binary. To fix this problem, we need to stat() or checksum the .o
and .go files to see if they need recompilation. Doing this every
time is expensive and gets us back into the make-like approach.
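
The stat()-based check might be sketched in Go as follows. This is
only one possible rule (rebuild when the .o is missing or older than
its .go); the function names outOfDate and demo are made up for the
example, and a checksum-based variant would look different.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// outOfDate reports whether obj must be rebuilt from src: either obj
// does not exist or src was modified after obj was written.
func outOfDate(src, obj string) (bool, error) {
	s, err := os.Stat(src)
	if err != nil {
		return false, err
	}
	o, err := os.Stat(obj)
	if os.IsNotExist(err) {
		return true, nil
	}
	if err != nil {
		return false, err
	}
	return s.ModTime().After(o.ModTime()), nil
}

// demo exercises outOfDate in a temporary directory, setting
// modification times explicitly so the result does not depend on
// file system timestamp resolution.
func demo() ([3]bool, error) {
	var res [3]bool
	dir, err := os.MkdirTemp("", "stale")
	if err != nil {
		return res, err
	}
	defer os.RemoveAll(dir)
	src := filepath.Join(dir, "fmt.go")
	obj := filepath.Join(dir, "fmt.o")
	t0 := time.Now()

	// Step 1: only fmt.go exists, so it must be compiled.
	if err = os.WriteFile(src, []byte("package fmt\n"), 0644); err != nil {
		return res, err
	}
	if res[0], err = outOfDate(src, obj); err != nil {
		return res, err
	}

	// Step 2: fmt.o is one minute newer than fmt.go, so it is current.
	if err = os.WriteFile(obj, nil, 0644); err != nil {
		return res, err
	}
	if err = os.Chtimes(src, t0, t0); err != nil {
		return res, err
	}
	if err = os.Chtimes(obj, t0.Add(time.Minute), t0.Add(time.Minute)); err != nil {
		return res, err
	}
	if res[1], err = outOfDate(src, obj); err != nil {
		return res, err
	}

	// Step 3: fmt.go is edited after fmt.o was built, so fmt.o is stale.
	if err = os.Chtimes(src, t0.Add(2*time.Minute), t0.Add(2*time.Minute)); err != nil {
		return res, err
	}
	if res[2], err = outOfDate(src, obj); err != nil {
		return res, err
	}
	return res, nil
}

func main() {
	res, err := demo()
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(res)
}
```

Each call costs two stat() system calls per source file, and a full
check repeats that for every file in the tree, which is the expense
noted above.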

The great majority of compilations do not require this full check,
however; this is especially true when in the compile-debug-edit
cycle. We therefore propose splitting the model into two scenarios.

Scenario 1: General

In this scenario, we ask D to update the full dependency tree by
stat()-ing or checksumming files to check currency. The generated
go.out will always be up to date but incremental compilation will
be slower. Typically, this will be necessary only after a major
operation like syncing or checking out code, or if there are known
changes being made to the dependencies.

Scenario 2: Fast

In this scenario, we explicitly tell D -c what has changed and have
it compile only what is required. Typically, this will mean compiling
only the single active file or maybe a few files. If an IDE is
present or there is some watcher tool, it's easy to avoid the common
mistake of forgetting to compile a changed file.

If an edit has caused skew between export and import, this will be
caught by the compiler, so the result should be type-safe at least.
If D is running the compilation, it might be possible to arrange that
C tells it there is a dependency problem and have D then try to
resolve it by reevaluation.


The contents of .o files (II)
----

For scenario 2, we can make things even faster if the .o files
identify not just the files that must be imported to satisfy the
imports, but details about the imports themselves. Let's say main.go
uses only one function from fmt.go, called F. If the compiled main.o
says, in effect

    from package fmt get F

then the linker will not need to read all of fmt.o to link main.o;
instead it can extract only the necessary function.

Even better, if fmt is a package made of many files, it may be
possible to store in main.o specific information about the exact
files needed:

    from file fmtF.o get F

The linker then need not even open the other .o files that
form package fmt.

The compiler should therefore be explicit and detailed within the .o
files it generates about what elements of a package are needed by
the program being compiled.
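
Here is a minimal Go sketch of a linker driven by such records. It
assumes a made-up object model in which every defined symbol lists the
"from file X get Y" records it needs; the ref and object types, the
fmtG.o file and the symbol names are all hypothetical.

```go
package main

import "fmt"

// ref is a hypothetical import record stored in a .o file: it names one
// symbol and the object file that defines it ("from file fmtF.o get F").
type ref struct {
	file   string
	symbol string
}

// object models a .o file: each symbol it defines maps to the records
// for the external symbols that definition uses.
type object struct {
	defs map[string][]ref
}

// link starts from one symbol and pulls in only the symbols reachable
// through the recorded references. It also reports which files it had
// to open along the way.
func link(start ref, objs map[string]object) (symbols, opened []string, err error) {
	seenSym := map[ref]bool{}
	seenFile := map[string]bool{}
	work := []ref{start}
	for len(work) > 0 {
		r := work[0]
		work = work[1:]
		if seenSym[r] {
			continue
		}
		seenSym[r] = true
		o, ok := objs[r.file]
		if !ok {
			return nil, nil, fmt.Errorf("missing object file %s", r.file)
		}
		if !seenFile[r.file] {
			seenFile[r.file] = true
			opened = append(opened, r.file)
		}
		needs, ok := o.defs[r.symbol]
		if !ok {
			return nil, nil, fmt.Errorf("%s does not define %s", r.file, r.symbol)
		}
		symbols = append(symbols, r.file+":"+r.symbol)
		work = append(work, needs...)
	}
	return symbols, opened, nil
}

// exampleObjects builds a tiny program in which package fmt is split
// into fmtF.o and fmtG.o, and main uses only F.
func exampleObjects() map[string]object {
	return map[string]object{
		"main.o": {defs: map[string][]ref{"main": {{"fmtF.o", "F"}}}},
		"fmtF.o": {defs: map[string][]ref{"F": {{"sys.o", "write"}}}},
		"fmtG.o": {defs: map[string][]ref{"G": {{"sys.o", "exit"}}}},
		"sys.o":  {defs: map[string][]ref{"write": nil, "exit": nil}},
	}
}

func main() {
	symbols, opened, err := link(ref{"main.o", "main"}, exampleObjects())
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("symbols:", symbols)
	fmt.Println("opened: ", opened)
}
```

In this example fmtG.o is never opened, and sys.o contributes only
write; exit is left out because nothing that main reaches refers to it.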

Earlier, we said that when we run

    C fmt.go

it discovers the import of sys, and must then read sys.o to discover
the details. Note that if we record the information as specified here,
when we then do

    C main.go

and it reads fmt.o, it does not in turn need to read sys.o; the necessary
information has already been pulled up into fmt.o by D.

Thus, once the dependency information is properly constructed, to
compile a program X.go we must read X.go plus N .o files, where N
is the number of packages explicitly imported by X.go. The transitive
closure need not be evaluated to compile a file, only the explicit
imports. By this result, we hope to dramatically reduce the amount
of I/O necessary to compile a Go source file.

To put this another way, if a package P imports packages Xi, the
existence of Xi.o files is all that is needed to compile P because the
Xi.o files contain the export information. This is what breaks the
transitive dependency closure.
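
The "pulling up" of export information can be sketched in Go as
follows. Everything here is an assumption made for illustration: the
export, source and objFile types, the idea that export data is a table
of textual descriptions, and the example declarations File and F.

```go
package main

import "fmt"

// export is a hypothetical exported declaration: a textual description
// plus the qualified names from other packages that it mentions.
type export struct {
	desc string
	uses []string
}

// source models one .go file: its package, its explicit imports and
// its exported declarations.
type source struct {
	pkg     string
	imports []string
	exports map[string]export
}

// objFile models the export data of a .o file: qualified name to
// description, including descriptions pulled up from its own imports.
type objFile map[string]string

// compile reads only the .o files of the explicit imports and returns
// the new export data together with the list of .o files it read.
func compile(src source, objs map[string]objFile) (out objFile, read []string, err error) {
	known := objFile{}
	for _, imp := range src.imports {
		name := imp + ".o"
		o, ok := objs[name]
		if !ok {
			return nil, read, fmt.Errorf("%s not found; compile %s.go first", name, imp)
		}
		read = append(read, name)
		for q, d := range o {
			known[q] = d
		}
	}
	out = objFile{}
	for name, e := range src.exports {
		out[src.pkg+"."+name] = e.desc
		for _, u := range e.uses {
			d, ok := known[u]
			if !ok {
				return nil, read, fmt.Errorf("%s.go: %s not found in any imported .o file", src.pkg, u)
			}
			out[u] = d // pull the dependency's description up into this .o
		}
	}
	return out, read, nil
}

// buildAll compiles sys, fmt and main in order and reports which .o
// files the compilation of main had to read, plus the contents of fmt.o.
func buildAll() ([]string, objFile, error) {
	sysSrc := source{pkg: "sys", exports: map[string]export{
		"File": {desc: "type File"},
	}}
	fmtSrc := source{pkg: "fmt", imports: []string{"sys"}, exports: map[string]export{
		"F": {desc: "func F(f sys.File)", uses: []string{"sys.File"}},
	}}
	mainSrc := source{pkg: "main", imports: []string{"fmt"}}

	objs := map[string]objFile{}
	for _, s := range []source{sysSrc, fmtSrc} {
		o, _, err := compile(s, objs)
		if err != nil {
			return nil, nil, err
		}
		objs[s.pkg+".o"] = o
	}
	_, mainReads, err := compile(mainSrc, objs)
	return mainReads, objs["fmt.o"], err
}

func main() {
	reads, fmtObj, err := buildAll()
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("compiling main.go read:", reads)
	fmt.Println("fmt.o describes:", fmtObj)
}
```

Compiling main.go reads only fmt.o, yet the description of sys.File is
available because it was copied into fmt.o when fmt.go was compiled.
The same lookup is where a mismatch between export and import would
surface as a compile-time error.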