The Go Programming Language
----
(April 17, 2008)

This document is an informal specification/proposal for a new systems programming language.


Guiding principles
----

Go is a new systems programming language intended as an alternative to C++ at Google. Its main purpose is to provide a productive and efficient programming environment for compiled programs such as servers and distributed systems.

The design is motivated by the following guidelines:

- very fast compilation (1MLOC/s stretch goal); instantaneous incremental compilation
- procedural
- strongly typed
- concise syntax avoiding repetition
- few, orthogonal, and general concepts
- support for threading and interprocess communication
- garbage collection
- container library written in Go
- reasonably efficient (C ballpark)

The language should be strong enough that the compiler and run time can be written in itself.


Modularity, identifiers and scopes
----

A Go program consists of one or more `packages' compiled separately, though not independently. A single package may make individual identifiers visible to other files by marking them as exported; there is no ``header file''.

A package collects types, constants, functions, and so on into a named entity that may be exported to enable its constituents to be used in another compilation unit.

Because there are no header files, all identifiers in a package are either declared explicitly within the package or arise from an import statement.

Scoping is essentially the same as in C.


Program structure
----

A compilation unit (usually a single source file) consists of a package specifier followed by import declarations followed by other declarations. There are no statements at the top level of a file.

A program consists of a number of packages. By convention, one package, by default called main, is the starting point for execution. It contains a function, also called main, that is the first function invoked by the run time system.
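
The layout of a compilation unit described above can be sketched as follows. This is only an illustration, written in the syntax of Go as it was eventually released (an assumption, since this draft predates any released toolchain); the names greeting and message are hypothetical.

```go
// Package specifier: every compilation unit starts with one.
package main

// Import declarations follow the package specifier.
import "fmt"

// Other declarations come next; no statements appear at the top level.
const greeting = "hello from main"

// message is a hypothetical helper, present only to show an
// ordinary top-level function declaration.
func message() string {
	return greeting
}

// main.main is the first function invoked by the run time system.
func main() {
	fmt.Println(message())
}
```
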
If any package within the program contains a function init(), that function will be executed before main.main() is called. The details of initialization are still under development.


Typing, polymorphism, and object-orientation
----

Go programs are strongly typed. Certain values can also be polymorphic. The language provides mechanisms to make use of such polymorphic values type-safe.

Interface types, building on structures with methods, provide the mechanisms to support object-oriented programming. Different interface types are independent of each other and no explicit hierarchy is required (such as single or multiple inheritance explicitly specified through respective type declarations). Interface types only define a set of methods that a corresponding implementation must provide. Thus interface and implementation are strictly separated.

An interface is implemented by associating methods with structures. If a structure implements all methods of an interface, it implements that interface and thus can be used where that interface is required. Unless used through a variable of interface type, methods can always be statically bound (they are not ``virtual''), and incur no runtime overhead compared to an ordinary function.

Go has no explicit notion of classes, sub-classes, or inheritance. These concepts are trivially modeled in Go through the use of functions, structures, associated methods, and interfaces.

Go has no explicit notion of type parameters or templates. Instead, containers (such as stacks, lists, etc.) are implemented through the use of abstract operations on interface types or polymorphic values.


Pointers and garbage collection
----

Variables may be allocated automatically (when entering the scope of the variable) or explicitly on the heap. Pointers are used to refer to heap-allocated variables. Pointers may also be used to point to any other variable; such a pointer is obtained by "taking the address" of that variable.
Variables are automatically reclaimed when they are no longer accessible. There is no pointer arithmetic in Go.


Functions
----

Functions contain declarations and statements. They may be recursive. Functions may be anonymous and appear as literals in expressions.


Multithreading and channels
----

Go supports multithreaded programming directly. A function may be invoked as a parallel thread of execution. Communication and synchronization are provided through channels and their associated language support.


Values and references
----

All objects have value semantics, but their contents may be accessed through different pointers referring to the same object. For example, when calling a function with an array, the array is passed by value, possibly by making a copy. To pass a reference, one must explicitly pass a pointer to the array. For arrays in particular, this is different from C.

There is also a built-in string type, which represents immutable strings of bytes.


Syntax
----

The syntax of statements and expressions in Go borrows from the C tradition; declarations are loosely derived from the Pascal tradition to allow more comprehensible composability of types.

Here is a complete example Go program that implements a concurrent prime sieve:

  package main

  // Send the sequence 2, 3, 4, ... to channel 'ch'.
  func Generate(ch *chan> int) {
    for i := 2; ; i++ {
      >ch = i  // Send 'i' to channel 'ch'.
    }
  }

  // Copy the values from channel 'in' to channel 'out',
  // removing those divisible by 'prime'.
  func Filter(in *chan< int, out *chan> int, prime int) {
    for {
      i := <in;  // Receive value of new variable 'i' from 'in'.
      if i % prime != 0 {
        >out = i  // Send 'i' to channel 'out'.
      }
    }
  }

  // The prime sieve: Daisy-chain Filter processes together.
  func Sieve() {
    ch := new(chan int);  // Create a new channel.
    go Generate(ch);  // Start Generate() as a subprocess.
    for {
      prime := [...]

  [...] " ] ValueType .
  chan any  // a generic channel
  chan int  // a channel that can exchange only ints
  chan> float  // a channel that can only be used to send floats
  chan< any  // a channel that can receive (only) values of any type

Channel variables always have type pointer to channel. It is an error to attempt to use a channel value and in particular to dereference a channel pointer.

  var ch *chan int;
  ch = new(chan int);  // new returns type *chan int

There are no channel literals.


Function types
----

A function type denotes the set of all functions with the same signature.

A method is a function with a receiver, which is of type pointer to struct.

Functions can return multiple values simultaneously.

  FunctionType = "func" AnonymousSignature .
  AnonymousSignature = [ Receiver "." ] Parameters [ Result ] .
  Receiver = "(" identifier Type ")" .
  Parameters = "(" [ ParameterList ] ")" .
  ParameterList = ParameterSection { "," ParameterSection } .
  ParameterSection = [ IdentifierList ] Type .
  Result = Type | "(" ParameterList ")" .

  // Function types
  func ()
  func (a, b int, z float) bool
  func (a, b int, z float) (success bool)
  func (a, b int, z float) (success bool, result float)

  // Method types
  func (p *T) . ()
  func (p *T) . (a, b int, z float) bool
  func (p *T) . (a, b int, z float) (success bool)
  func (p *T) . (a, b int, z float) (success bool, result float)

A variable can hold only a pointer to a function, not a function value. In particular, v := func() {} creates a variable of type *func(). To call the function referenced by v, one writes v(). It is illegal to dereference a function pointer.


Function Literals
----

Function literals represent anonymous functions.

  FunctionLit = FunctionType Block .
  Block = "{" [ StatementList [ ";" ] ] "}" .

The scope of an identifier declared within a block extends from the declaration of the identifier (that is, the position immediately after the identifier) to the end of the block.
A function literal can be invoked or assigned to a variable of the corresponding function pointer type. For now, a function literal can reference only its parameters, global variables, and variables declared within the function literal.

  // Function literal
  func (a, b int, z float) bool { return a*b < int(z); }

  // Method literal
  func (p *T) . (a, b int, z float) bool { return a*b < int(z) + p.x; }

Unresolved issues: Are there method literals? How do you use them?


Methods
----

A method is a function bound to a particular struct type T. When defined, a method indicates the type of the struct by declaring a receiver of type *T. For instance, given type Point

  type Point struct { x, y float }

the declaration

  func (p *Point) distance(scale float) float {
    return scale * (p.x*p.x + p.y*p.y);
  }

creates a method of type Point. Note that methods are not declared within their struct type declaration. They may appear anywhere and may be forward-declared for commentary.

When invoked, a method behaves like a function whose first argument is the receiver, but at the call site the receiver is bound to the method using the notation

  receiver.method()

For instance, given a Point variable pt, one may call

  pt.distance(3.5)


Interface of a struct
----

The interface of a struct is defined to be the unordered set of methods associated with that struct.


Interface types
----

An interface type denotes a set of methods.

  InterfaceType = "interface" "{" [ MethodDeclList [ ";" ] ] "}" .
  MethodDeclList = MethodDecl { ";" MethodDecl } .
  MethodDecl = identifier Parameters [ Result ] .

  // A basic file interface.
  type File interface {
    Read(b Buffer) bool;
    Write(b Buffer) bool;
    Close();
  }

Any struct whose interface has, possibly as a subset, the complete set of methods of an interface I is said to implement interface I. For instance, if two struct types S1 and S2 have the methods (where T stands for either S1 or S2)

  func (p *T) Read(b Buffer) bool { return ... }
  func (p *T) Write(b Buffer) bool { return ... }
  func (p *T) Close() { ...
  }

then the File interface is implemented by both S1 and S2, regardless of what other methods S1 and S2 may have or share.

All struct types implement the empty interface:

  interface {}

In general, a struct type implements an arbitrary number of interfaces. For instance, if we have

  type Lock interface {
    lock();
    unlock();
  }

and S1 and S2 also implement

  func (p *T) lock() { ... }
  func (p *T) unlock() { ... }

they implement the Lock interface as well as the File interface.

It is legal to assign a pointer to a struct to a variable of compatible interface type. It is legal to assign an interface variable to any struct pointer variable but if the struct type is incompatible the result will be nil.

There are no interface literals.


The polymorphic "any" type
----

Given a variable of type "any", one can store any value into it by plain assignment or implicitly, such as through a function parameter or channel operation. Given an "any" variable v storing an underlying value of type T, one may:

- copy v's value to another variable of type "any"
- extract the stored value by an explicit conversion operation T(v)
- copy v's value to a variable of type T

Attempts to convert/extract to an incompatible type will yield nil.

No other operations are defined (yet).

Note that type interface {} is a special case that can match any struct type, while type any can match any type at all, including basic types, arrays, etc.

TODO: details about reflection


Literals
----

  Literal = BasicLit | CompoundLit .
  BasicLit = char_lit | string_lit | int_lit | float_lit | "nil" .
  CompoundLit = ArrayLit | MapLit | StructLit | FunctionLit .


Declarations
----

A declaration associates a name with a language entity such as a type, constant, variable, or function.

  Declaration = ConstDecl | TypeDecl | VarDecl | FunctionDecl | ExportDecl .


Const declarations
----

A constant declaration gives a name to the value of a constant expression.

  ConstDecl = "const" ( ConstSpec | "(" ConstSpecList [ ";" ] ")" ).
  ConstSpec = identifier [ Type ] "=" Expression .
  ConstSpecList = ConstSpec { ";" ConstSpec }.

  const pi float = 3.14159265
  const e = 2.718281828
  const (
    one int = 1;
    two = 3
  )


Type declarations
----

A type declaration introduces a name as a shorthand for a type.

  TypeDecl = "type" ( TypeSpec | "(" TypeSpecList [ ";" ] ")" ).
  TypeSpec = identifier Type .
  TypeSpecList = TypeSpec { ";" TypeSpec }.

  type IntArray [16] int
  type (
    Point struct { x, y float };
    Polar Point
  )


Variable declarations
----

A variable declaration creates a variable and gives it a type and a name. It may optionally give the variable an initial value; in some forms of declaration the type of the initial value defines the type of the variable.

  VarDecl = "var" ( VarSpec | "(" VarSpecList [ ";" ] ")" ) .
  VarSpec = IdentifierList ( Type [ "=" ExpressionList ] | "=" ExpressionList ) .
  VarSpecList = VarSpec { ";" VarSpec } .

  var i int
  var u, v, w float
  var k = 0
  var x, y float = -1.0, -2.0
  var (
    i int;
    u, v = 2.0, 3.0
  )

If the expression list is present, it must have the same number of elements as there are variables in the variable specification.

The syntax

  SimpleVarDecl = identifier ":=" Expression .

is shorthand for

  var identifier = Expression.

  i := 0
  f := func() int { return 7; }
  ch := new(chan int);

Also, in some contexts such as if or for statements, this construct can be used to declare local temporary variables.


Function and method declarations
----

Functions and methods have a special declaration syntax, slightly different from the type syntax because an identifier must be present in the signature. Functions and methods can only be declared at the global level.

  FunctionDecl = "func" NamedSignature ( ";" | Block ) .
  NamedSignature = [ Receiver ] identifier Parameters [ Result ] .

  func min(x int, y int) int {
    if x < y {
      return x;
    }
    return y;
  }

  func foo(a, b int, z float) bool {
    return a*b < int(z);
  }

A method is a function that also declares a receiver.
  func (p *T) foo(a, b int, z float) bool {
    return a*b < int(z) + p.x;
  }

  func (p *Point) Length() float {
    return Math.sqrt(p.x * p.x + p.y * p.y);
  }

  func (p *Point) Scale(factor float) {
    p.x = p.x * factor;
    p.y = p.y * factor;
  }

Functions and methods can be forward declared by omitting the body:

  func foo(a, b int, z float) bool;
  func (p *T) foo(a, b int, z float) bool;


Initial values
----

When memory is allocated to store a value, either through a declaration or new(), and no explicit initialization is provided, the memory is given a default initialization. Each element of such a value is set to the ``zero'' for that type: 0 for integers, 0.0 for floats, and nil for pointers. This initialization is done recursively, so for instance each element of an array of integers will be set to 0 if no other value is specified.

These two simple declarations are equivalent:

  var i int;
  var i int = 0;

After

  type T struct { i int; f float; next *T };
  t := new(T);

the following holds:

  t.i == 0
  t.f == 0.0
  t.next == nil


Export declarations
----

Global identifiers may be exported, thus making the exported identifier visible outside the package. Another package may then import the identifier to use it.

Export declarations must only appear at the global level of a compilation unit and can name only globally-visible identifiers. That is, one can export global functions, types, and so on but not local variables or structure fields.

Exporting an identifier makes the identifier visible externally to the package. If the identifier represents a type, the type structure is exported as well. The exported identifiers may appear later in the source than the export directive itself, but it is an error to specify an identifier not declared anywhere in the source file containing the export directive.

  ExportDecl = "export" ExportIdentifier { "," ExportIdentifier } .
  ExportIdentifier = QualifiedIdent .
  export sin, cos
  export math.abs

TODO: complete this section

TODO: export as a mechanism for public and private struct fields?


Expressions
----

Expression syntax is based on that of C but with fewer precedence levels.

  Expression = BinaryExpr | UnaryExpr | PrimaryExpr .
  BinaryExpr = Expression binary_op Expression .
  UnaryExpr = unary_op Expression .

  PrimaryExpr =
    identifier | Literal | "(" Expression ")" | "iota" |
    Call | Conversion | Allocation |
    Expression "[" Expression [ ":" Expression ] "]" |
    Expression "." identifier |
    Expression "." "(" Type ")" .

  Call = Expression "(" [ ExpressionList ] ")" .
  Conversion = TypeName "(" [ ExpressionList ] ")" .
  Allocation = "new" "(" Type [ "," Expression ] ")" .

  binary_op = log_op | rel_op | add_op | mul_op .
  log_op = "||" | "&&" .
  rel_op = "==" | "!=" | "<" | "<=" | ">" | ">=".
  add_op = "+" | "-" | "|" | "^".
  mul_op = "*" | "/" | "%" | "<<" | ">>" | "&".

  unary_op = "+" | "-" | "!" | "^" | "<" | ">" | "*" | "&" .

Field selection and type assertions ('.') bind tightest, followed by indexing ('[]') and then calls and conversions. The remaining precedence levels are as follows (in increasing precedence order):

  Precedence    Operator
      1         ||
      2         &&
      3         ==  !=  <  <=  >  >=
      4         +  -  |  ^
      5         *  /  %  <<  >>  &
      6         +  -  !  ^  <  >  *  &  (unary)

For integer values, / and % satisfy the following relationship:

  (a / b) * b + a % b == a

and

  (a / b) is "truncated towards zero".

There are no implicit type conversions except for constants and literals. In particular, unsigned and signed integer variables cannot be mixed in an expression without explicit conversion.

The shift operators implement arithmetic shifts for signed integers and logical shifts for unsigned integers. The properties of negative shift counts are undefined. Unary '^' corresponds to C '~' (bitwise complement).

There is no '->' operator. Given a pointer p to a struct, one writes p.f to access field f of the struct. Similarly, given an array or map pointer, one writes p[i] to access an element.
Given a function pointer, one writes p() to call the function.

Other operators behave as in C.

The "iota" keyword is discussed in a later section.

Examples of primary expressions

  x
  2
  (s + ".txt")
  f(3.1415, true)
  Point(1, 2)
  new([]int, 100)
  m["foo"]
  s[i : j + 1]
  obj.color
  Math.sin
  f.p[i].x()

Examples of general expressions

  +x
  23 + 3*x[i]
  x <= f()
  ^a >> b
  f() || g()
  x == y + 1 && <chan_ptr > 0


The nil value
----

The keyword nil represents the ``zero'' value for a pointer type or interface type.

The only operations allowed for nil are to assign it to a pointer or interface value and to compare it for equality or inequality with a pointer or interface value.

  var p *int;
  if p != nil {
    print p
  } else {
    print "p points nowhere"
  }

By default, pointers are initialized to nil.

TODO: how does this definition jibe with using nil to specify conversion failure if the result is not of pointer type, such as an any variable holding an int?


Allocation
----

The built-in function new() allocates storage. The function takes a parenthesized operand list comprising the type of the value to allocate, optionally followed by type-specific expressions that influence the allocation. The invocation returns a pointer to the memory. The memory is initialized as described in the section on initial values.

For instance,

  type S struct { a int; b float }
  new(S)

allocates storage for an S, initializes it (a=0, b=0.0), and returns a value of type *S pointing to that storage.

The only defined parameters affect sizes for allocating arrays, buffered channels, and maps.
  ap := new([]int, 10);  // a pointer to an array of 10 ints
  aap := new([][]int, 5, 10);  // a pointer to an array of 5 arrays of 10 ints
  c := new(chan int, 10);  // a pointer to a channel with a buffer size of 10
  m := new(map[string] int, 100);  // a pointer to a map with space for 100 elements preallocated

TODO: argument order for dimensions in multidimensional arrays


The constant generator 'iota'
----

Within a declaration, each appearance of the keyword 'iota' represents a successive element of an integer sequence. It is reset to zero whenever the keyword 'const', 'type' or 'var' introduces a new declaration. For instance, 'iota' can be used to construct a set of related constants:

  const (
    enum0 = iota;  // sets enum0 to 0, etc.
    enum1 = iota;
    enum2 = iota
  )

  const (
    a = 1 << iota;  // sets a to 1 (iota has been reset)
    b = 1 << iota;  // sets b to 2
    c = 1 << iota;  // sets c to 4
  )

  const x = iota;  // sets x to 0
  const y = iota;  // sets y to 0


Statements
----

Statements control execution.

  Statement = [ LabelDecl ] ( StructuredStat | UnstructuredStat ) .

  StructuredStat =
    Block | IfStat | SwitchStat | SelectStat | ForStat | RangeStat .

  UnstructuredStat =
    Declaration | SimpleVarDecl |
    SimpleStat | GoStat | ReturnStat | BreakStat | ContinueStat | GotoStat .

  SimpleStat =
    ExpressionStat | IncDecStat | Assignment | SimpleVarDecl .


Statement lists
----

Semicolons are used to separate individual statements of a statement list. They are optional after a statement that ends with a closing curly brace '}'.

  StatementList =
    StructuredStat |
    UnstructuredStat |
    StructuredStat [ ";" ] StatementList |
    UnstructuredStat ";" StatementList .

TODO: define optional semicolons precisely


Expression statements
----

  ExpressionStat = Expression .

  f(x+y)


IncDec statements
----

  IncDecStat = Expression ( "++" | "--" ) .

  a[i]++

Note that ++ and -- are not operators for expressions.


Assignments
----

  Assignment = SingleAssignment | TupleAssignment | Send .
  SingleAssignment = PrimaryExpr assign_op Expression .
  TupleAssignment = PrimaryExprList assign_op ExpressionList .
  PrimaryExprList = PrimaryExpr { "," PrimaryExpr } .
  Send = ">" Expression "=" Expression .

  assign_op = [ add_op | mul_op ] "=" .

The left-hand side must be an l-value such as a variable, pointer indirection, or an array indexing.

  x = 1
  *p = f()
  a[i] = 23

As in C, arithmetic binary operators can be combined with assignments:

  j <<= 2

A tuple assignment assigns the individual elements of a multi-valued operation, such as function evaluation or some channel and map operations, into individual variables. For instance, a tuple assignment such as

  v1, v2, v3 = e1, e2, e3

assigns the expressions e1, e2, e3 to temporaries and then assigns the temporaries to the variables v1, v2, v3. Thus

  a, b = b, a

exchanges the values of a and b. The tuple assignment

  x, y = f()

calls the function f, which must return two values, and assigns them to x and y. As a special case, retrieving a value from a map, when written as a two-element tuple assignment, assigns a value and a boolean. If the value is present in the map, the value is assigned and the second, boolean variable is set to true. Otherwise, the variable is unchanged, and the boolean value is set to false.

  value, present = map_var[key]

Analogously, receiving a value from a channel can be written as a tuple assignment.

  value, success = <chan_ptr

[...]

  >chan_ptr = value

In assignments, the type of the expression must match the type of the left-hand side.


Go statements
----

A go statement starts the execution of a function as an independent concurrent thread of control within the same address space. Unlike with a function, the next line of the program does not wait for the function to complete.

  GoStat = "go" Call .

  go Server()
  go func(ch chan> bool) { for { sleep(10); >ch = true; }} (c)


Return statements
----

A return statement terminates execution of the containing function and optionally provides a result value or values to the caller.

  ReturnStat = "return" [ ExpressionList ] .
There are two ways to return values from a function. The first is to explicitly list the return value or values in the return statement:

  func simple_f() int {
    return 2;
  }

  func complex_f1() (float, float) {
    return -7.0, -4.0;
  }

The second is to provide names for the return values and assign them explicitly in the function; the return statement will then provide no values:

  func complex_f2() (re float, im float) {
    re = 7.0;
    im = 4.0;
    return;
  }

It is legal to name the return values in the declaration even if the first form of return statement is used:

  func complex_f2() (re float, im float) {
    return 7.0, 4.0;
  }


If statements
----

If statements have the traditional form except that the condition need not be parenthesized and the "then" statement must be in brace brackets.

  IfStat = "if" [ SimpleStat ";" ] Expression Block [ "else" Statement ] .

  if x > 0 {
    return true;
  }

An if statement may include the declaration of a single temporary variable. The scope of the declared variable extends to the end of the if statement, and the variable is initialized once before the statement is entered.

  if x := f(); x < y {
    return x;
  } else if x > z {
    return z;
  } else {
    return y;
  }


Switch statements
----

Switches provide multi-way execution.

  SwitchStat = "switch" [ [ SimpleStat ";" ] Expression ] "{" { CaseClause } "}" .
  CaseClause = CaseList StatementList [ ";" ] [ "fallthrough" [ ";" ] ] .
  CaseList = Case { Case } .
  Case = ( "case" ExpressionList | "default" ) ":" .

There can be at most one default case in a switch statement. The "fallthrough" keyword indicates that the control should flow from the end of this case clause to the first statement of the next clause.

The expressions do not need to be constants. They will be evaluated top to bottom until the first successful non-default case is reached. If none matches and there is a default case, the statements of the default case are executed.
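
The evaluation rules above (non-constant case expressions, top-to-bottom matching, default taken only when nothing else matches, and explicit fallthrough) can be sketched as follows. This is a minimal illustration written in the syntax of Go as it was eventually released, which is an assumption relative to this draft; the functions classify and step are hypothetical.

```go
package main

import "fmt"

// classify is a hypothetical helper. Its case expressions are not
// constants, and default is written first even though it is taken
// only when no other case matches.
func classify(x, lo, hi int) string {
	switch x {
	default:
		return "other"
	case lo, lo + 1:
		return "near lo"
	case hi:
		return "at hi"
	}
}

// step records which clauses run, showing that control passes into
// the next clause only where fallthrough is written.
func step(a int) []string {
	var trace []string
	switch a {
	case 1:
		trace = append(trace, "one")
		fallthrough
	case 2:
		trace = append(trace, "two")
	case 3:
		trace = append(trace, "three")
	}
	return trace
}

func main() {
	fmt.Println(classify(0, 0, 10), classify(10, 0, 10), classify(5, 0, 10))
	fmt.Println(step(1), step(2), step(3))
}
```
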
  switch tag {
  default: s3()
  case 0, 1: s1()
  case 2: s2()
  }

A switch statement may include the declaration of a single temporary variable. The scope of the declared variable extends to the end of the switch statement, and the variable is initialized once before the switch is entered.

  switch x := f(); true {
  case x < 0: return -x
  default: return x
  }

Cases do not fall through unless explicitly marked with a "fallthrough" statement.

  switch a {
  case 1:
    b();
    fallthrough
  case 2:
    c();
  }

If the expression is omitted, it is equivalent to "true".

  switch {
  case x < y: f1();
  case x < z: f2();
  case x == 4: f3();
  }


Select statements
----

A select statement chooses which of a set of possible communications will proceed. It looks similar to a switch statement but with the cases all referring to communication operations.

  SelectStat = "select" "{" { CommClause } "}" .
  CommClause = CommCase { Statement } .
  CommCase = ( "default" | ( "case" ( SendCase | RecvCase) ) ) ":" .
  SendCase = Send .
  RecvCase = [ identifier '=' ] RecvExpression .
  RecvExpression = '<' Expression .

The select statement evaluates all the channel (pointers) involved. If any of the channels can proceed, the corresponding communication and statements are evaluated. Otherwise, if there is a default case, that executes; if not, the statement blocks until one of the communications can complete. A channel pointer may be nil, which is equivalent to that case not being present in the select statement.

If the channel sends or receives "any" or an interface type, its communication can proceed only if the type of the communication clause matches that of the dynamic value to be exchanged.

If multiple cases can proceed, a uniform fair choice is made regarding which single communication will execute.
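
The rule that a default case runs when no communication can proceed, and that a nil channel behaves as if its case were absent, can be sketched as follows. This is a minimal illustration in the syntax of Go as it was eventually released (ch <- v, <-ch, and make), which is an assumption and differs from the notation used in this draft; trySend and tryRecv are hypothetical helpers.

```go
package main

import "fmt"

// trySend sends v on ch if that communication can proceed at once;
// otherwise the default case runs and nothing is sent.
func trySend(ch chan int, v int) bool {
	select {
	case ch <- v:
		return true
	default:
		return false
	}
}

// tryRecv receives from ch if a value is ready. With a nil channel
// the receive case can never proceed, so default always runs.
func tryRecv(ch chan int) (int, bool) {
	select {
	case v := <-ch:
		return v, true
	default:
		return 0, false
	}
}

func main() {
	ch := make(chan int, 1)     // buffer of one value
	fmt.Println(trySend(ch, 7)) // buffer has room, so the send proceeds
	fmt.Println(trySend(ch, 8)) // buffer is full, so default runs
	fmt.Println(tryRecv(ch))    // a value is waiting in the buffer
	fmt.Println(tryRecv(nil))   // nil channel: only default can run
}
```
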
  var c, c1, c2 *chan int;
  select {
  case i1 = <c1:
    [...]
  case >c2 = i2:
    printf("sent %d to c2\n", i2);
  default:
    printf("no communication\n");
  }

  for {  // send random sequence of bits to c
    select {
    case >c = 0:  // note: no statement, no fallthrough, no folding of cases
    case >c = 1:
    }
  }

  var ca *chan any;
  var i int;
  var f float;
  select {
  case i =