mirror of
https://github.com/golang/go
synced 2024-11-11 21:30:23 -07:00
doc: add section for checksum database to module reference doc
Updates #33637 Change-Id: Ia782b3fdc5a8873606b96120a34c9bf194a1a346 Reviewed-on: https://go-review.googlesource.com/c/go/+/211197 Reviewed-by: Jay Conrod <jayconrod@google.com>
This commit is contained in:
parent
73b657e96e
commit
a65f088301
162
doc/modules.md
162
doc/modules.md
@ -276,6 +276,10 @@ list. The `go` command does not fall back to later proxies in response to other
|
||||
4xx and 5xx errors. This allows a proxy to act as a gatekeeper, for example, by
|
||||
responding with error 403 (Forbidden) for modules not on an approved list.
|
||||
|
||||
<!-- TODO(katiehockman): why only fall back for 410/404? Either add the details
|
||||
here, or write a blog post about how to build multiple types of proxies. e.g.
|
||||
a "privacy preserving one" and an "authorization one" -->
|
||||
|
||||
The table below specifies queries that a module proxy must respond to. For each
|
||||
path, `$base` is the path portion of a proxy URL,`$module` is a module path, and
|
||||
`$version` is a version. For example, if the proxy URL is
|
||||
@ -425,12 +429,170 @@ setting `GOPROXY` to `https://example.com/proxy`.
|
||||
<a id="authenticating"></a>
|
||||
## Authenticating modules
|
||||
|
||||
<!-- TODO: continue this section -->
|
||||
When deciding whether to trust the source code for a module version just
|
||||
fetched from a proxy or origin server, the `go` command first consults the
|
||||
`go.sum` lines in the `go.sum` file of the current module. If the `go.sum` file
|
||||
does not contain an entry for that module version, then it may consult the
|
||||
checksum database.
|
||||
|
||||
<a id="go.sum-file-format"></a>
|
||||
### go.sum file format
|
||||
|
||||
<a id="checksum-database"></a>
|
||||
### Checksum database
|
||||
|
||||
The checksum database is a global source of `go.sum` lines. The `go` command can
|
||||
use this in many situations to detect misbehavior by proxies or origin servers.
|
||||
|
||||
The checksum database allows for global consistency and reliability for all
|
||||
publicly available module versions. It makes untrusted proxies possible since
|
||||
they can't serve the wrong code without it going unnoticed. It also ensures
|
||||
that the bits associated with a specific version do not change from one day to
|
||||
the next, even if the module's author subsequently alters the tags in their
|
||||
repository.
|
||||
|
||||
The checksum database is served by [sum.golang.org](https://sum.golang.org),
|
||||
which is run by Google. It is a [Transparent
|
||||
Log](https://research.swtch.com/tlog) (or “Merkle Tree”) of `go.sum` line
|
||||
hashes, which is backed by [Trillian](https://github.com/google/trillian). The
|
||||
main advantage of a Merkle tree is that independent auditors can verify that it
|
||||
hasn't been tampered with, so it is more trustworthy than a simple database.
|
||||
|
||||
The `go` command interacts with the checksum database using the protocol
|
||||
originally outlined in [Proposal: Secure the Public Go Module
|
||||
Ecosystem](https://go.googlesource.com/proposal/+/master/design/25530-sumdb.md#checksum-database).
|
||||
|
||||
The table below specifies queries that the checksum database must respond to.
|
||||
For each path, `$base` is the path portion of the checksum database URL,
|
||||
`$module` is a module path, and `$version` is a version. For example, if the
|
||||
checksum database URL is `https://sum.golang.org`, and the client is requesting
|
||||
the record for the module `golang.org/x/text` at version `v0.3.2`, the client
|
||||
would send a `GET` request for
|
||||
`https://sum.golang.org/lookup/golang.org/x/text@v0.3.2`.
|
||||
|
||||
To avoid ambiguity when serving from case-insensitive file systems,
|
||||
the `$module` and `$version` elements are
|
||||
[case-encoded](https://pkg.go.dev/golang.org/x/mod/module#EscapePath)
|
||||
by replacing every uppercase letter with an exclamation mark followed by the
|
||||
corresponding lower-case letter. This allows modules `example.com/M` and
|
||||
`example.com/m` to both be stored on disk, since the former is encoded as
|
||||
`example.com/!m`.
|
||||
|
||||
Parts of the path surrounded by square brakets, like `[.p/$W]` denote optional
|
||||
values.
|
||||
|
||||
<table>
|
||||
<thead>
|
||||
<tr>
|
||||
<th>Path</th>
|
||||
<th>Description</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
<tr>
|
||||
<td><code>$base/latest</code></td>
|
||||
<td>
|
||||
Returns a signed, encoded tree description for the latest log. This
|
||||
signed description is in the form of a
|
||||
<a href="https://pkg.go.dev/golang.org/x/mod/sumdb/note">note</a>,
|
||||
which is text that has been signed by one or more server keys and can
|
||||
be verified using the server's public key. The tree description
|
||||
provides the size of the tree and the hash of the tree head at that
|
||||
size. This encoding is described in
|
||||
<code><a href="https://pkg.go.dev/golang.org/x/mod/sumdb/tlog#FormatTree">
|
||||
golang.org/x/mod/sumdb/tlog#FormatTree</a></code>.
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<tr>
|
||||
<td><code>$base/lookup/$module@$version</code></td>
|
||||
<td>
|
||||
Returns the log record number for the entry about <code>$module</code>
|
||||
at <code>$version</code>, followed by the data for the record (that is,
|
||||
the <code>go.sum</code> lines for <code>$module</code> at
|
||||
<code>$version</code>) and a signed, encoded tree description that
|
||||
contains the record.
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<tr>
|
||||
<td><code>$base/tile/$H/$L/$K[.p/$W]</code></td>
|
||||
<td>
|
||||
Returns a [log tile](https://research.swtch.com/tlog#serving_tiles),
|
||||
which is a set of hashes that make up a section of the log. Each tile
|
||||
is defined in a two-dimensional coordinate at tile level
|
||||
<code>$L</code>, <code>$K</code>th from the left, with a tile height of
|
||||
<code>$H</code>. The optional <code>.p/$W</code> suffix indicates a
|
||||
partial log tile with only <code>$W</code> hashes. Clients must fall
|
||||
back to fetching the full tile if a partial tile is not found.
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<tr>
|
||||
<td><code>$base/tile/$H/data/$K[.p/$W]</code></td>
|
||||
<td>
|
||||
Returns the record data for the leaf hashes in
|
||||
<code>/tile/$H/0/$K[.p/$W]</code> (with a literal <code>data</code> path
|
||||
element).
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
</tbody>
|
||||
</table>
|
||||
|
||||
If the `go` command consults the checksum database, then the first
|
||||
step is to retrieve the record data through the `/lookup` endpoint. If the
|
||||
module version is not yet recorded in the log, the checksum database will try
|
||||
to fetch it from the origin server before replying. This `/lookup` data
|
||||
provides the sum for this module version as well as its position in the log,
|
||||
which informs the client of which tiles should be fetched to perform proofs.
|
||||
The `go` command performs “inclusion” proofs (that a specific record exists in
|
||||
the log) and “consistency” proofs (that the tree hasn’t been tampered with)
|
||||
before adding new `go.sum` lines to the main module’s `go.sum` file. It's
|
||||
important that the data from `/lookup` should never be used without first
|
||||
authenticating it against the signed tree hash and authenticating the signed
|
||||
tree hash against the client's timeline of signed tree hashes.
|
||||
|
||||
Signed tree hashes and new tiles served by the checksum database are stored
|
||||
in the module cache, so the `go` command only needs to fetch tiles that are
|
||||
missing.
|
||||
|
||||
The `go` command doesn't need to directly connect to the checksum database. It
|
||||
can request module sums via a module proxy that
|
||||
[mirrors the checksum database](https://go.googlesource.com/proposal/+/master/design/25530-sumdb.md#proxying-a-checksum-database)
|
||||
and supports the protocol above. This can be particularly helpful for private,
|
||||
corporate proxies which block requests outside the organization.
|
||||
|
||||
The `GOSUMDB` environment variable identifies the name of checksum database to use
|
||||
and optionally its public key and URL, as in:
|
||||
|
||||
```
|
||||
GOSUMDB="sum.golang.org"
|
||||
GOSUMDB="sum.golang.org+<publickey>"
|
||||
GOSUMDB="sum.golang.org+<publickey> https://sum.golang.org"
|
||||
```
|
||||
|
||||
The `go` command knows the public key of `sum.golang.org`, and also that the
|
||||
name `sum.golang.google.cn` (available inside mainland China) connects to the
|
||||
`sum.golang.org` checksum database; use of any other database requires giving
|
||||
the public key explicitly. The URL defaults to `https://` followed by the
|
||||
database name.
|
||||
|
||||
`GOSUMDB` defaults to `sum.golang.org`, the Go checksum database run by Google.
|
||||
See https://sum.golang.org/privacy for the service's privacy policy.
|
||||
|
||||
If `GOSUMDB` is set to `off`, or if `go get` is invoked with the `-insecure`
|
||||
flag, the checksum database is not consulted, and all unrecognized modules are
|
||||
accepted, at the cost of giving up the security guarantee of verified
|
||||
repeatable downloads for all modules. A better way to bypass the checksum
|
||||
database for specific modules is to use the `GOPRIVATE` or `GONOSUMDB`
|
||||
environment variables. See [Private Modules](#private-modules) for details.
|
||||
|
||||
The `go env -w` command can be used to
|
||||
[set these variables](/pkg/cmd/go/#hdr-Print_Go_environment_information)
|
||||
for future `go` command invocations.
|
||||
|
||||
<a id="privacy"></a>
|
||||
## Privacy
|
||||
|
||||
|
Loading…
Reference in New Issue
Block a user