mirror of
https://github.com/golang/go
synced 2024-11-24 19:40:09 -07:00
349 lines
7.0 KiB
HTML
349 lines
7.0 KiB
HTML
|
<!--{
|
||
|
"Title": "Go's Declaration Syntax"
|
||
|
}-->
|
||
|
|
||
|
<p>
|
||
|
Newcomers to Go wonder why the declaration syntax is different from the
|
||
|
tradition established in the C family. In this post we'll compare the
|
||
|
two approaches and explain why Go's declarations look as they do.
|
||
|
</p>
|
||
|
|
||
|
<p>
|
||
|
<b>C syntax</b>
|
||
|
</p>
|
||
|
|
||
|
<p>
|
||
|
First, let's talk about C syntax. C took an unusual and clever approach
|
||
|
to declaration syntax. Instead of describing the types with special
|
||
|
syntax, one writes an expression involving the item being declared, and
|
||
|
states what type that expression will have. Thus
|
||
|
</p>
|
||
|
|
||
|
<pre>
|
||
|
int x;
|
||
|
</pre>
|
||
|
|
||
|
<p>
|
||
|
declares x to be an int: the expression 'x' will have type int. In
|
||
|
general, to figure out how to write the type of a new variable, write an
|
||
|
expression involving that variable that evaluates to a basic type, then
|
||
|
put the basic type on the left and the expression on the right.
|
||
|
</p>
|
||
|
|
||
|
<p>
|
||
|
Thus, the declarations
|
||
|
</p>
|
||
|
|
||
|
<pre>
|
||
|
int *p;
|
||
|
int a[3];
|
||
|
</pre>
|
||
|
|
||
|
<p>
|
||
|
state that p is a pointer to int because '*p' has type int, and that a
|
||
|
is an array of ints because a[3] (ignoring the particular index value,
|
||
|
which is punned to be the size of the array) has type int.
|
||
|
</p>
|
||
|
|
||
|
<p>
|
||
|
What about functions? Originally, C's function declarations wrote the
|
||
|
types of the arguments outside the parens, like this:
|
||
|
</p>
|
||
|
|
||
|
<pre>
|
||
|
int main(argc, argv)
|
||
|
int argc;
|
||
|
char *argv[];
|
||
|
{ /* ... */ }
|
||
|
</pre>
|
||
|
|
||
|
<p>
|
||
|
Again, we see that main is a function because the expression main(argc,
|
||
|
argv) returns an int. In modern notation we'd write
|
||
|
</p>
|
||
|
|
||
|
<pre>
|
||
|
int main(int argc, char *argv[]) { /* ... */ }
|
||
|
</pre>
|
||
|
|
||
|
<p>
|
||
|
but the basic structure is the same.
|
||
|
</p>
|
||
|
|
||
|
<p>
|
||
|
This is a clever syntactic idea that works well for simple types but can
|
||
|
get confusing fast. The famous example is declaring a function pointer.
|
||
|
Follow the rules and you get this:
|
||
|
</p>
|
||
|
|
||
|
<pre>
|
||
|
int (*fp)(int a, int b);
|
||
|
</pre>
|
||
|
|
||
|
<p>
|
||
|
Here, fp is a pointer to a function because if you write the expression
|
||
|
(*fp)(a, b) you'll call a function that returns int. What if one of fp's
|
||
|
arguments is itself a function?
|
||
|
</p>
|
||
|
|
||
|
<pre>
|
||
|
int (*fp)(int (*ff)(int x, int y), int b)
|
||
|
</pre>
|
||
|
|
||
|
<p>
|
||
|
That's starting to get hard to read.
|
||
|
</p>
|
||
|
|
||
|
<p>
|
||
|
Of course, we can leave out the name of the parameters when we declare a
|
||
|
function, so main can be declared
|
||
|
</p>
|
||
|
|
||
|
<pre>
|
||
|
int main(int, char *[])
|
||
|
</pre>
|
||
|
|
||
|
<p>
|
||
|
Recall that argv is declared like this,
|
||
|
</p>
|
||
|
|
||
|
<pre>
|
||
|
char *argv[]
|
||
|
</pre>
|
||
|
|
||
|
<p>
|
||
|
so you drop the name from the <em>middle</em> of its declaration to construct
|
||
|
its type. It's not obvious, though, that you declare something of type
|
||
|
char *[] by putting its name in the middle.
|
||
|
</p>
|
||
|
|
||
|
<p>
|
||
|
And look what happens to fp's declaration if you don't name the
|
||
|
parameters:
|
||
|
</p>
|
||
|
|
||
|
<pre>
|
||
|
int (*fp)(int (*)(int, int), int)
|
||
|
</pre>
|
||
|
|
||
|
<p>
|
||
|
Not only is it not obvious where to put the name inside
|
||
|
</p>
|
||
|
|
||
|
<pre>
|
||
|
int (*)(int, int)
|
||
|
</pre>
|
||
|
|
||
|
<p>
|
||
|
it's not exactly clear that it's a function pointer declaration at all.
|
||
|
And what if the return type is a function pointer?
|
||
|
</p>
|
||
|
|
||
|
<pre>
|
||
|
int (*(*fp)(int (*)(int, int), int))(int, int)
|
||
|
</pre>
|
||
|
|
||
|
<p>
|
||
|
It's hard even to see that this declaration is about fp.
|
||
|
</p>
|
||
|
|
||
|
<p>
|
||
|
You can construct more elaborate examples but these should illustrate
|
||
|
some of the difficulties that C's declaration syntax can introduce.
|
||
|
</p>
|
||
|
|
||
|
<p>
|
||
|
There's one more point that needs to be made, though. Because type and
|
||
|
declaration syntax are the same, it can be difficult to parse
|
||
|
expressions with types in the middle. This is why, for instance, C casts
|
||
|
always parenthesize the type, as in
|
||
|
</p>
|
||
|
|
||
|
<pre>
|
||
|
(int)M_PI
|
||
|
</pre>
|
||
|
|
||
|
<p>
|
||
|
<b>Go syntax</b>
|
||
|
</p>
|
||
|
|
||
|
<p>
|
||
|
Languages outside the C family usually use a distinct type syntax in
|
||
|
declarations. Although it's a separate point, the name usually comes
|
||
|
first, often followed by a colon. Thus our examples above become
|
||
|
something like (in a fictional but illustrative language)
|
||
|
</p>
|
||
|
|
||
|
<pre>
|
||
|
x: int
|
||
|
p: pointer to int
|
||
|
a: array[3] of int
|
||
|
</pre>
|
||
|
|
||
|
<p>
|
||
|
These declarations are clear, if verbose - you just read them left to
|
||
|
right. Go takes its cue from here, but in the interests of brevity it
|
||
|
drops the colon and removes some of the keywords:
|
||
|
</p>
|
||
|
|
||
|
<pre>
|
||
|
x int
|
||
|
p *int
|
||
|
a [3]int
|
||
|
</pre>
|
||
|
|
||
|
<p>
|
||
|
There is no direct correspondence between the look of [3]int and how to
|
||
|
use a in an expression. (We'll come back to pointers in the next
|
||
|
section.) You gain clarity at the cost of a separate syntax.
|
||
|
</p>
|
||
|
|
||
|
<p>
|
||
|
Now consider functions. Let's transcribe the declaration for main, even
|
||
|
though the main function in Go takes no arguments:
|
||
|
</p>
|
||
|
|
||
|
<pre>
|
||
|
func main(argc int, argv *[]byte) int
|
||
|
</pre>
|
||
|
|
||
|
<p>
|
||
|
Superficially that's not much different from C, but it reads well from
|
||
|
left to right:
|
||
|
</p>
|
||
|
|
||
|
<p>
|
||
|
<em>function main takes an int and a pointer to a slice of bytes and returns an int.</em>
|
||
|
</p>
|
||
|
|
||
|
<p>
|
||
|
Drop the parameter names and it's just as clear - they're always first
|
||
|
so there's no confusion.
|
||
|
</p>
|
||
|
|
||
|
<pre>
|
||
|
func main(int, *[]byte) int
|
||
|
</pre>
|
||
|
|
||
|
<p>
|
||
|
One value of this left-to-right style is how well it works as the types
|
||
|
become more complex. Here's a declaration of a function variable
|
||
|
(analogous to a function pointer in C):
|
||
|
</p>
|
||
|
|
||
|
<pre>
|
||
|
f func(func(int,int) int, int) int
|
||
|
</pre>
|
||
|
|
||
|
<p>
|
||
|
Or if f returns a function:
|
||
|
</p>
|
||
|
|
||
|
<pre>
|
||
|
f func(func(int,int) int, int) func(int, int) int
|
||
|
</pre>
|
||
|
|
||
|
<p>
|
||
|
It still reads clearly, from left to right, and it's always obvious
|
||
|
which name is being declared - the name comes first.
|
||
|
</p>
|
||
|
|
||
|
<p>
|
||
|
The distinction between type and expression syntax makes it easy to
|
||
|
write and invoke closures in Go:
|
||
|
</p>
|
||
|
|
||
|
<pre>
|
||
|
sum := func(a, b int) int { return a+b } (3, 4)
|
||
|
</pre>
|
||
|
|
||
|
<p>
|
||
|
<b>Pointers</b>
|
||
|
</p>
|
||
|
|
||
|
<p>
|
||
|
Pointers are the exception that proves the rule. Notice that in arrays
|
||
|
and slices, for instance, Go's type syntax puts the brackets on the left
|
||
|
of the type but the expression syntax puts them on the right of the
|
||
|
expression:
|
||
|
</p>
|
||
|
|
||
|
<pre>
|
||
|
var a []int
|
||
|
x = a[1]
|
||
|
</pre>
|
||
|
|
||
|
<p>
|
||
|
For familiarity, Go's pointers use the * notation from C, but we could
|
||
|
not bring ourselves to make a similar reversal for pointer types. Thus
|
||
|
pointers work like this
|
||
|
</p>
|
||
|
|
||
|
<pre>
|
||
|
var p *int
|
||
|
x = *p
|
||
|
</pre>
|
||
|
|
||
|
<p>
|
||
|
We couldn't say
|
||
|
</p>
|
||
|
|
||
|
<pre>
|
||
|
var p *int
|
||
|
x = p*
|
||
|
</pre>
|
||
|
|
||
|
<p>
|
||
|
because that postfix * would conflate with multiplication. We could have
|
||
|
used the Pascal ^, for example:
|
||
|
</p>
|
||
|
|
||
|
<pre>
|
||
|
var p ^int
|
||
|
x = p^
|
||
|
</pre>
|
||
|
|
||
|
<p>
|
||
|
and perhaps we should have (and chosen another operator for xor),
|
||
|
because the prefix asterisk on both types and expressions complicates
|
||
|
things in a number of ways. For instance, although one can write
|
||
|
</p>
|
||
|
|
||
|
<pre>
|
||
|
[]int("hi")
|
||
|
</pre>
|
||
|
|
||
|
<p>
|
||
|
as a conversion, one must parenthesize the type if it starts with a *:
|
||
|
</p>
|
||
|
|
||
|
<pre>
|
||
|
(*int)(nil)
|
||
|
</pre>
|
||
|
|
||
|
<p>
|
||
|
Had we been willing to give up * as pointer syntax, those parentheses
|
||
|
would be unnecessary.
|
||
|
</p>
|
||
|
|
||
|
<p>
|
||
|
So Go's pointer syntax is tied to the familiar C form, but those ties
|
||
|
mean that we cannot break completely from using parentheses to
|
||
|
disambiguate types and expressions in the grammar.
|
||
|
</p>
|
||
|
|
||
|
<p>
|
||
|
Overall, though, we believe Go's type syntax is easier to understand
|
||
|
than C's, especially when things get complicated.
|
||
|
</p>
|
||
|
|
||
|
<p>
|
||
|
<b>Notes</b>
|
||
|
</p>
|
||
|
|
||
|
<p>
|
||
|
Go's declarations read left to right. It's been pointed out that C's
|
||
|
read in a spiral! See <a href="http://c-faq.com/decl/spiral.anderson.html">
|
||
|
The "Clockwise/Spiral Rule"</a> by David Anderson.
|
||
|
</p>
|