The Go Programming Language
----
(March 7, 2008)

This document is an informal specification/proposal for a new systems programming language.

Guiding principles
----

Go is a new systems programming language intended as an alternative to C++ at Google. Its main purpose is to provide a productive and efficient programming environment for compiled programs such as servers and distributed systems.

The design is motivated by the following guidelines:

- very fast compilation (1MLOC/s stretch goal); instantaneous incremental compilation
- procedural
- strongly typed
- concise syntax avoiding repetition
- few, orthogonal, and general concepts
- support for threading and interprocess communication
- garbage collection
- container library written in Go
- reasonably efficient (C ballpark)

The language should be strong enough that the compiler and run time can be written in itself.

Modularity, identifiers and scopes
----

A Go program consists of one or more `packages' compiled separately, though not independently. A single package may make individual identifiers visible to other files by marking them as exported; there is no ``header file''.

A package collects types, constants, functions, and so on into a named entity that may be exported so that its constituents can be used in another compilation unit.

Because there are no header files, all identifiers in a package are either declared explicitly within the package or arise from an import statement.

Scoping is essentially the same as in C.

Program structure
----

A compilation unit (usually a single source file) consists of a package specifier followed by import declarations followed by other declarations. There are no statements at the top level of a file.

A program consists of a number of packages. By convention, one package, by default called Main, is the starting point for execution. It contains a function, also called Main, that is the first function invoked by the run time system.
If any package within the program contains a function Init(), that function will be executed before Main.Main() is called. The details of initialization are still under development.

Typing, polymorphism, and object-orientation
----

Go programs are strongly typed. Certain expressions, in particular map and channel accesses, can also be polymorphic. The language provides mechanisms to make use of such polymorphic values type-safe.

Interface types are the mechanism to support an object-oriented programming style. Different interface types are independent of each other and no explicit hierarchy is required (such as single or multiple inheritance explicitly specified through respective type declarations). Interface types only define a set of methods that a corresponding implementation must provide. Thus interface and implementation are strictly separated.

An interface is implemented by associating methods with structures. If a structure implements all methods of an interface, it implements that interface and thus can be used where that interface is required. Unless used through a variable of interface type, methods can always be statically bound (they are not ``virtual''), and incur no runtime overhead compared to an ordinary function.

Go has no explicit notion of classes, sub-classes, or inheritance. These concepts are trivially modeled in Go through the use of functions, structures, associated methods, and interfaces.

Go has no explicit notion of type parameters or templates. Instead, containers (such as stacks, lists, etc.) are implemented through the use of abstract data types operating on interface types.

Pointers and garbage collection
----

Variables may be allocated automatically (when entering the scope of the variable) or explicitly on the heap. Pointers are used to refer to heap-allocated variables. Pointers may also be used to point to any other variable; such a pointer is obtained by "taking the address" of that variable.
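The two ways of obtaining a pointer described above can be sketched as follows. This is a minimal illustration written in present-day Go, whose syntax differs in places from this draft (for example, float64 instead of float); the Point type and the helpers newPoint and scale are hypothetical names introduced only for this example.

```go
package main

import "fmt"

// Point is a hypothetical struct used only for this illustration.
type Point struct{ x, y float64 }

// newPoint allocates a Point explicitly with new and returns a pointer to it.
func newPoint(x, y float64) *Point {
	p := new(Point)
	p.x, p.y = x, y
	return p
}

// scale modifies the Point that p refers to, however that Point was allocated.
func scale(p *Point, factor float64) {
	p.x *= factor
	p.y *= factor
}

func main() {
	local := Point{1, 2} // allocated automatically on entering main
	q := &local          // pointer obtained by taking the address of local
	scale(q, 2)
	fmt.Println(local.x, local.y)

	h := newPoint(3, 4) // explicitly allocated
	scale(h, 0.5)
	fmt.Println(h.x, h.y)
}
```

Neither pointer is ever freed by hand, and no arithmetic is performed on q or h.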
Variables are automatically reclaimed when they are no longer accessible. There is no pointer arithmetic in Go.

Functions
----

Functions contain declarations and statements. They may be recursive. Functions may be anonymous and appear as literals in expressions.

Multithreading and channels
----

Go supports multithreaded programming directly. A function may be invoked as a parallel thread of execution. Communication and synchronization are provided through channels and their associated language support.

Values and references
----

All objects have value semantics, but an object's contents may be accessed through different pointers referring to the same object. For example, when calling a function with an array, the array is passed by value, possibly by making a copy. To pass a reference, one must explicitly pass a pointer to the array. For arrays in particular, this is different from C.

There is also a built-in string type, which represents immutable byte strings.

Syntax
----

The syntax of statements and expressions in Go borrows from the C tradition; declarations are loosely derived from the Pascal tradition to allow more comprehensible composability of types.

Here is a complete example Go program that implements a concurrent prime sieve:

  package Main

  // Send the sequence 2, 3, 4, ... to channel 'ch'.
  func Generate(ch *chan> int) {
    for i := 2; ; i++ {
      >ch = i;  // Send 'i' to channel 'ch'.
    }
  }

  // Copy the values from channel 'in' to channel 'out',
  // removing those divisible by 'prime'.
  func Filter(in *chan< int, out *chan> int, prime int) {
    for {
      i := <in;
      if i % prime != 0 {
        >out = i;  // Send 'i' to channel 'out'.
      }
    }
  }

  // The prime sieve: Daisy-chain Filter processes together.
  func Sieve() {
    ch := new(chan int);  // Create a new channel.
    go Generate(ch);  // Start Generate() as a subprocess.
    for {
      prime := ' ] ValueType .
  chan any     // a generic channel
  chan int     // a channel that can exchange only ints
  chan> float  // a channel that can only be used to send floats
  chan< any    // a channel that can receive (only) values of any type

Channel variables always have type pointer to channel. It is an error to attempt to dereference a channel pointer.

There are no channel literals.

Function types
----

A function type denotes the set of all functions with the same signature.

A method is a function with a receiver, which is of type pointer to struct.

Functions can return multiple values simultaneously.

  FunctionType = 'func' AnonymousSignature .
  AnonymousSignature = [ Receiver '.' ] Parameters [ Result ] .
  Receiver = '(' identifier Type ')' .
  Parameters = '(' [ ParameterList ] ')' .
  ParameterList = ParameterSection { ',' ParameterSection } .
  ParameterSection = [ IdentifierList ] Type .
  Result = Type | '(' ParameterList ')' .

  // Function types
  func ()
  func (a, b int, z float) bool
  func (a, b int, z float) (success bool)
  func (a, b int, z float) (success bool, result float)

  // Method types
  func (p *T) . ()
  func (p *T) . (a, b int, z float) bool
  func (p *T) . (a, b int, z float) (success bool)
  func (p *T) . (a, b int, z float) (success bool, result float)

A variable can hold only a pointer to a function, not a function value. In particular, v := func() {}; creates a variable of type *func(). To call the function referenced by v, one writes v(). It is illegal to dereference a function pointer.

Function Literals
----

Function literals represent anonymous functions.

  FunctionLit = FunctionType Block .
  Block = '{' [ StatementList ] '}' .

A function literal can be invoked or assigned to a variable of the corresponding function pointer type. For now, a function literal can reference only its parameters, global variables, and variables declared within the function literal.

  // Function literal
  func (a, b int, z float) bool { return a*b < int(z); }

  // Method literal
  func (p *T) .
  (a, b int, z float) bool { return a*b < int(z) + p.x; }

Methods
----

A method is a function bound to a particular struct type T. When defined, a method indicates the type of the struct by declaring a receiver of type *T. For instance, given type Point

  type Point struct { x, y float }

the declaration

  func (p *Point) distance(scale float) float {
    return scale * (p.x*p.x + p.y*p.y);
  }

creates a method of type Point. Note that methods are not declared within their struct type declaration. They may appear anywhere and may be forward-declared for commentary.

When invoked, a method behaves like a function whose first argument is the receiver, but at the call site the receiver is bound to the method using the notation

  receiver.method()

For instance, given a Point variable pt, one may call

  pt.distance(3.5)

Interface of a struct
----

The interface of a struct is defined to be the unordered set of methods associated with that struct.

Interface types
----

An interface type denotes a set of methods.

  InterfaceType = 'interface' '{' { MethodDecl } '}' .
  MethodDecl = identifier Parameters [ Result ] ';' .

  // A basic file interface.
  type File interface {
    Read(b Buffer) bool;
    Write(b Buffer) bool;
    Close();
  }

Any struct whose interface has, possibly as a subset, the complete set of methods of an interface I is said to implement interface I. For instance, if two struct types S1 and S2 have the methods

  func (p *T) Read(b Buffer) bool { return ... }
  func (p *T) Write(b Buffer) bool { return ... }
  func (p *T) Close() { ... }

(where T stands for either S1 or S2) then the File interface is implemented by both S1 and S2, regardless of what other methods S1 and S2 may have or share.

All struct types implement the empty interface:

  interface {}

In general, a struct type implements an arbitrary number of interfaces. For instance, if we have

  type Lock interface {
    lock();
    unlock();
  }

and S1 and S2 also implement

  func (p *T) lock() { ... }
  func (p *T) unlock() { ... }

they implement the Lock interface as well as the File interface.
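The File and Lock discussion above can be made concrete with a small sketch. It is written in present-day Go, whose syntax differs in places from this draft, and it fills in details the text leaves open: the Buffer type, the fields of S1, the method bodies, and the helpers closeFile and withLock are all hypothetical and exist only to make the example compile.

```go
package main

import "fmt"

// Buffer is a hypothetical stand-in for the Buffer type named in the text.
type Buffer []byte

// File mirrors the basic file interface from the text.
type File interface {
	Read(b Buffer) bool
	Write(b Buffer) bool
	Close()
}

// Lock mirrors the Lock interface from the text.
type Lock interface {
	lock()
	unlock()
}

// S1 is a hypothetical struct; its declaration never mentions File or Lock.
type S1 struct {
	open   bool
	locked bool
}

func (p *S1) Read(b Buffer) bool  { return p.open }
func (p *S1) Write(b Buffer) bool { return p.open }
func (p *S1) Close()              { p.open = false }
func (p *S1) lock()               { p.locked = true }
func (p *S1) unlock()             { p.locked = false }

// closeFile accepts any value that has all of File's methods.
func closeFile(f File) { f.Close() }

// withLock accepts any value that has all of Lock's methods.
func withLock(l Lock, body func()) {
	l.lock()
	body()
	l.unlock()
}

func main() {
	s := &S1{open: true}
	// The same *S1 value is used as a Lock and as a File.
	withLock(s, func() { fmt.Println("locked:", s.locked) })
	closeFile(s)
	fmt.Println("open:", s.open, "locked:", s.locked)
}
```

Because S1 has a superset of the methods of both interfaces, a pointer to it is accepted wherever a File or a Lock is required, without any declaration linking the struct to either interface.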
There are no interface literals.

Literals
----

  Literal = BasicLit | CompoundLit .
  BasicLit = char_lit | string_lit | int_lit | float_lit .
  CompoundLit = ArrayLit | MapLit | StructLit | FunctionLit .

Declarations
----

A declaration associates a name with a language entity such as a type, constant, variable, or function.

  Declaration = ConstDecl | TypeDecl | VarDecl | FunctionDecl | ExportDecl .

Const declarations
----

A constant declaration gives a name to the value of a constant expression.

  ConstDecl = 'const' ( ConstSpec | '(' ConstSpecList [ ';' ] ')' ).
  ConstSpec = identifier [ Type ] '=' Expression .
  ConstSpecList = ConstSpec { ';' ConstSpec }.

  const pi float = 3.14159265
  const e = 2.718281828
  const (
    one int = 1;
    two = 3
  )

Type declarations
----

A type declaration introduces a name as a shorthand for a type. In certain situations, such as conversions, it may be necessary to use such a type name.

  TypeDecl = 'type' ( TypeSpec | '(' TypeSpecList [ ';' ] ')' ).
  TypeSpec = identifier Type .
  TypeSpecList = TypeSpec { ';' TypeSpec }.

  type IntArray [16] int
  type (
    Point struct { x, y float };
    Polar Point
  )

Variable declarations
----

A variable declaration creates a variable and gives it a type and a name. It may optionally give the variable an initial value; in some forms of declaration the type of the initial value defines the type of the variable.

  VarDecl = 'var' ( VarSpec | '(' VarSpecList [ ';' ] ')' ) | SimpleVarDecl .
  VarSpec = IdentifierList ( Type [ '=' ExpressionList ] | '=' ExpressionList ) .
  VarSpecList = VarSpec { ';' VarSpec } .

  var i int
  var u, v, w float
  var k = 0
  var x, y float = -1.0, -2.0
  var (
    i int;
    u, v = 2.0, 3.0
  )

If the expression list is present, it must have the same number of elements as there are variables in the variable specification.

The syntax

  SimpleVarDecl = identifier ':=' Expression .

is shorthand for

  var identifier = Expression.
  i := 0
  f := func() int { return 7; }
  ch := new(chan int);

Also, in some contexts such as if, switch, or for statements, this construct can be used to declare local temporary variables.

Function and method declarations
----

Functions and methods have a special declaration syntax, slightly different from the type syntax because an identifier must be present in the signature. For now, functions and methods can only be declared at the global level.

  FunctionDecl = 'func' NamedSignature ( ';' | Block ) .
  NamedSignature = [ Receiver ] identifier Parameters [ Result ] .

  func min(x int, y int) int {
    if x < y {
      return x;
    }
    return y;
  }

  func foo (a, b int, z float) bool {
    return a*b < int(z);
  }

A method is a function that also declares a receiver.

  func (p *T) foo (a, b int, z float) bool {
    return a*b < int(z) + p.x;
  }

  func (p *Point) Length() float {
    return Math.sqrt(p.x * p.x + p.y * p.y);
  }

  func (p *Point) Scale(factor float) {
    p.x = p.x * factor;
    p.y = p.y * factor;
  }

Functions and methods can be forward declared by omitting the body:

  func foo (a, b int, z float) bool;
  func (p *T) foo (a, b int, z float) bool;

Export declarations
----

Global identifiers may be exported, thus making the exported identifier visible outside the package. Another package may then import the identifier to use it.

Export declarations must only appear at the global level of a compilation unit. That is, one can export compilation-unit global identifiers but not, for example, local variables or structure fields.

Exporting an identifier makes the identifier visible externally to the package. If the identifier represents a type, the type structure is exported as well. The exported identifiers may appear later in the source than the export directive itself, but it is an error to specify an identifier not declared anywhere in the source file containing the export directive.

  ExportDecl = 'export' ExportIdentifier { ',' ExportIdentifier } .
  ExportIdentifier = QualifiedIdent .
  export sin, cos
  export Math.abs

[ TODO complete this section ]

Expressions
----

Expression syntax is based on that of C but with fewer precedence levels.

  Expression = BinaryExpr | UnaryExpr | PrimaryExpr .
  BinaryExpr = Expression binary_op Expression .
  UnaryExpr = unary_op Expression .

  PrimaryExpr =
    identifier | Literal | '(' Expression ')' | 'iota' |
    Call | Conversion |
    Expression '[' Expression [ ':' Expression ] ']' |
    Expression '.' identifier .

  Call = Expression '(' [ ExpressionList ] ')' .
  Conversion = TypeName '(' [ ExpressionList ] ')' .

  binary_op = log_op | rel_op | add_op | mul_op .
  log_op = '||' | '&&' .
  rel_op = '==' | '!=' | '<' | '<=' | '>' | '>='.
  add_op = '+' | '-' | '|' | '^'.
  mul_op = '*' | '/' | '%' | '<<' | '>>' | '&'.

  unary_op = '+' | '-' | '!' | '^' | '<' | '>' | '*' | '&' .

Field selection ('.') binds tightest, followed by indexing ('[]') and then calls and conversions. The remaining precedence levels are as follows (in increasing precedence order):

  Precedence    Operator
      1         ||
      2         &&
      3         ==  !=  <  <=  >  >=
      4         +  -  |  ^
      5         *  /  %  <<  >>  &
      6         +  -  !  ^  <  >  *  &  (unary)

For integer values, / and % satisfy the following relationship:

  (a / b) * b + a % b == a

and

  (a / b) is "truncated towards zero".

There are no implicit type conversions except for constants and literals. In particular, unsigned and signed integers cannot be mixed in an expression without explicit conversion.

The shift operators implement arithmetic shifts for signed integers, and logical shifts for unsigned integers. The behavior of negative shift counts is undefined.

Unary '^' corresponds to C '~' (bitwise complement).

There is no '->' operator. Given a pointer p to a struct, one writes p.f to access field f of the struct. Similarly, given an array or map pointer, one writes p[i]; given a function pointer, one writes p() to call the function.

Other operators behave as in C.

The 'iota' keyword is discussed in the next section.
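The division, shift, and complement rules above can be checked with a short program. This sketch is written in present-day Go, which kept these particular rules; the helper names quoRem, identityHolds, and shifts are hypothetical and introduced only for this illustration.

```go
package main

import "fmt"

// quoRem returns the quotient and remainder of a and b (b must be nonzero).
func quoRem(a, b int) (int, int) {
	return a / b, a % b
}

// identityHolds reports whether (a / b) * b + a % b == a.
func identityHolds(a, b int) bool {
	q, r := quoRem(a, b)
	return q*b+r == a
}

// shifts returns v >> 1 for a signed and an unsigned 8-bit value.
func shifts(s int8, u uint8) (int8, uint8) {
	return s >> 1, u >> 1
}

func main() {
	// Try every combination of operand signs.
	for _, p := range [][2]int{{7, 2}, {-7, 2}, {7, -2}, {-7, -2}} {
		q, r := quoRem(p[0], p[1])
		fmt.Println(p[0], p[1], q, r, identityHolds(p[0], p[1]))
	}

	// -8 and 0xF8 share a bit pattern but differ in signedness.
	s, u := shifts(-8, 0xF8)
	fmt.Println(s, u)

	var n int8 = 5
	fmt.Println(^n) // unary ^ is the bitwise complement
}
```

With truncation toward zero, -7 / 2 yields -3 and -7 % 2 yields -1, so the remainder takes the sign of the dividend. The signed shift keeps the sign bit (-8 becomes -4) while the unsigned shift fills with zero (0xF8 becomes 124).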
Primary expressions

  x
  2
  (s + ".txt")
  f(3.1415, true)
  Point(1, 2)
  m["foo"]
  s[i : j + 1]
  obj.color
  Math.sin
  f.p[i].x()

General expressions

  +x
  23 + 3*x[i]
  x <= f()
  ^a >> b
  f() || g()
  x == y + 1 && 0

The constant generator 'iota'
----

Within a declaration, each appearance of the keyword 'iota' represents a successive element of an integer sequence. It is reset to zero whenever the keyword 'const', 'type' or 'var' introduces a new declaration. For instance, 'iota' can be used to construct a set of related constants:

  const (
    enum0 = iota;  // sets enum0 to 0, etc.
    enum1 = iota;
    enum2 = iota
  )

  const (
    a = 1 << iota;  // sets a to 1 (iota has been reset)
    b = 1 << iota;  // sets b to 2
    c = 1 << iota;  // sets c to 4
  )

  const x = iota;  // sets x to 0
  const y = iota;  // sets y to 0

Statements
----

Statements control execution.

  Statement =
    Declaration |
    SimpleStat | CompoundStat |
    GoStat |
    ReturnStat |
    IfStat | SwitchStat |
    ForStat | RangeStat |
    BreakStat | ContinueStat | GotoStat | LabelStat .

  SimpleStat =
    ExpressionStat | IncDecStat | Assignment | SimpleVarDecl .

Expression statements
----

  ExpressionStat = Expression .

  f(x+y)

IncDec statements
----

  IncDecStat = Expression ( '++' | '--' ) .

  a[i]++

Note that ++ and -- are not operators for expressions.

Compound statements
----

  CompoundStat = '{' { Statement } '}' .

  {
    x := 1;
    f(x);
  }

The scope of an Identifier declared within a compound statement extends from the declaration to the end of the compound statement.

Assignments
----

  Assignment = SingleAssignment | TupleAssignment | Send .
  SingleAssignment = PrimaryExpr assign_op Expression .
  TupleAssignment = PrimaryExprList assign_op ExpressionList .
  PrimaryExprList = PrimaryExpr { "," PrimaryExpr } .
  Send = '>' Expression '=' Expression .

  assign_op = [ add_op | mul_op ] '=' .

The left-hand side must be an l-value such as a variable, pointer indirection, or an array indexing.
  x = 1
  *p = f()
  a[i] = 23

As in C, arithmetic binary operators can be combined with assignments:

  j <<= 2

A tuple assignment assigns the individual elements of a multi-valued operation, such as a function evaluation or some channel and map operations, into individual variables. For instance, a tuple assignment such as

  v1, v2, v3 = e1, e2, e3

assigns the expressions e1, e2, e3 to temporaries and then assigns the temporaries to the variables v1, v2, v3. Thus

  a, b = b, a

exchanges the values of a and b. The tuple assignment

  x, y = f()

calls the function f, which must return 2 values, and assigns them to x and y. As a special case, retrieving a value from a map, when written as a two-element tuple assignment, assigns a value and a boolean. If the value is present in the map, the value is assigned and the second, boolean variable is set to true. Otherwise, the variable is unchanged, and the boolean value is set to false.

  value, present = map_var[key]

Analogously, receiving a value from a channel can be written as a tuple assignment.

  value, success = <chan_ptr

A send, following the Send production above, assigns a value to a channel:

  >chan_ptr = value

In assignments, the type of the expression must match the type of the left-hand side.

Go statements
----

A go statement starts the execution of a function as an independent concurrent thread of control within the same address space. Unlike an ordinary function call, a go statement does not make the next line of the program wait for the function to complete.

  GoStat = 'go' Call .

  go Server()
  go func(ch chan> bool) { for ;; { sleep(10); >ch = true; }} (c)

Return statements
----

A return statement terminates execution of the containing function and optionally provides a result value or values to the caller.

  ReturnStat = 'return' [ ExpressionList ] .

There are two ways to return values from a function.
The first is to explicitly list the return value or values in the return statement:

  func simple_f () int {
    return 2;
  }

  func complex_f1() (re float, im float) {
    return -7.0, -4.0;
  }

The second is to provide names for the return values and assign them explicitly in the function; the return statement will then provide no values:

  func complex_f2() (re float, im float) {
    re = 7.0;
    im = 4.0;
    return;
  }

It is legal to name the return values in the declaration even if the first form of return statement is used:

  func complex_f2() (re float, im float) {
    return 7.0, 4.0;
  }

If statements
----

If statements have the traditional form except that the condition need not be parenthesized and the "then" statement must be enclosed in braces.

  IfStat = 'if' [ SimpleVarDecl ';' ] Expression Block [ 'else' Statement ] .

  if x > 0 {
    return true;
  }

An if statement may include the declaration of a single temporary variable. The scope of the declared variable extends to the end of the if statement, and the variable is initialized once before the statement is entered.

  if x := f(); x < y {
    return x;
  } else if x > z {
    return z;
  } else {
    return y;
  }

Switch statements
----

Switches provide multi-way execution.

  SwitchStat = 'switch' [ [ SimpleVarDecl ';' ] [ Expression ] ] '{' { CaseClause } '}' .
  CaseClause = CaseList { Statement } [ 'fallthrough' ] .
  CaseList = Case { Case } .
  Case = ( 'case' ExpressionList | 'default' ) ':' .

There can be at most one default case in a switch statement.

The 'fallthrough' keyword indicates that control should flow from the end of this case clause to the first statement of the next clause.

The expressions do not need to be constants. They will be evaluated top to bottom until the first successful non-default case is reached. If none matches and there is a default case, the statements of the default case are executed.

  switch tag {
  default: s3()
  case 0, 1: s1()
  case 2: s2()
  }

A switch statement may include the declaration of a single temporary variable.
The scope of the declared variable extends to the end of the switch statement, and the variable is initialized once before the switch is entered.

  switch x := f(); true {
  case x < 0: return -x
  default: return x
  }

Cases do not fall through unless explicitly marked with a 'fallthrough' statement.

  switch a {
  case 1:
    b();
    fallthrough
  case 2:
    c();
  }

If the expression is omitted, it is equivalent to 'true'.

  switch {
  case x < y: f1();
  case x < z: f2();
  case x == 4: f3();
  }

For statements
----

For statements are a combination of the 'for' and 'while' loops of C.

  ForStat = 'for' [ Condition | ForClause ] Block .
  ForClause = [ InitStat ] ';' [ Condition ] ';' [ PostStat ] .

  InitStat = SimpleStat .
  Condition = Expression .
  PostStat = SimpleStat .

A SimpleStat is a simple statement such as an assignment, a SimpleVarDecl, or an increment or decrement statement. Therefore one may declare a loop variable in the init statement.

  for i := 0; i < 10; i++ {
    printf("%d\n", i)
  }

A 'for' statement with just a condition executes until the condition becomes false. Thus it is the same as the C 'while' statement.

  for a < b {
    a *= 2
  }

If the condition is absent, it is equivalent to 'true'.

  for {
    f()
  }

Range statements
----

Range statements are a special control structure for iterating over the contents of arrays and maps.

  RangeStat = 'range' IdentifierList ':=' RangeExpression Block .
  RangeExpression = Expression .

A range expression must evaluate to an array, map or string. The identifier list must contain either one or two identifiers. If the range expression is a map, a single identifier is declared to range over the keys of the map; two identifiers range over the keys and corresponding values. For arrays and strings, the behavior is analogous for integer indices (the keys) and array elements (the values).
  a := [ 1, 2, 3 ];
  m := [ "fo" : 2, "foo" : 3, "fooo" : 4 ]

  range i := a {
    f(a[i]);
  }

  range k, v := m {
    assert(len(k) == v);
  }

Break statements
----

Within a 'for' or 'switch' statement, a 'break' statement terminates execution of the innermost 'for' or 'switch' statement.

  BreakStat = 'break' [ identifier ].

If there is an identifier, it must be the label name of an enclosing 'for' or 'switch' statement, and that is the one whose execution terminates.

  L: for i < n {
    switch i {
    case 5: break L
    }
  }

Continue statements
----

Within a 'for' loop a continue statement begins the next iteration of the loop at the post statement.

  ContinueStat = 'continue' [ identifier ].

The optional identifier is analogous to that of a 'break' statement.

Goto statements
----

A goto statement transfers control to the corresponding label statement.

  GotoStat = 'goto' identifier .

  goto Error

Label statement
----

A label statement serves as the target of a 'goto', 'break' or 'continue' statement.

  LabelStat = identifier ':' .

  Error:

There are various restrictions [TBD] as to where a label statement can be used.

Packages
----

Every source file identifies the package to which it belongs. The file must begin with a package clause.

  PackageClause = 'package' PackageName .

  package Math

Import declarations
----

A program can gain access to exported items from another package through an import declaration:

  ImportDecl = 'import' [ '.' | PackageName ] PackageFileName .
  PackageFileName = string_lit .

An import statement makes the exported contents of the named package file accessible in this package.

In the following discussion, assume we have a package in the file "/lib/math", called package Math, which exports functions sin and cos.

In the general form, with an explicit package name, the import statement declares that package name as an identifier whose contents are the exported elements of the imported package.
For instance, after

  import M "/lib/math"

the contents of the package /lib/math can be accessed by M.cos, M.sin, etc.

In its simplest form, with no package name, the import statement implicitly uses the imported package name itself as the local package name. After

  import "/lib/math"

the contents are accessible by Math.sin, Math.cos.

Finally, if instead of a package name the import statement uses an explicit period, the contents of the imported package are added to the current package. After

  import . "/lib/math"

the contents are accessible by sin and cos. In this instance, it is an error if the import introduces name conflicts.

Program
----

A program is a package clause, optionally followed by import declarations, followed by a series of declarations.

  Program = PackageClause { ImportDecl } { Declaration } .

TODO
----

- TODO: type switch?
- TODO: select
- TODO: words about slices