1
0
mirror of https://github.com/golang/go synced 2024-11-27 04:11:22 -07:00
go/doc/articles/wiki/index.html

742 lines
22 KiB
HTML
Raw Normal View History

<!--{
"Title": "Writing Web Applications",
"Template": true
}-->
<h2>Introduction</h2>
<p>
Covered in this tutorial:
</p>
<ul>
<li>Creating a data structure with load and save methods</li>
<li>Using the <code>net/http</code> package to build web applications
<li>Using the <code>html/template</code> package to process HTML templates</li>
<li>Using the <code>regexp</code> package to validate user input</li>
<li>Using closures</li>
</ul>
<p>
Assumed knowledge:
</p>
<ul>
<li>Programming experience</li>
<li>Understanding of basic web technologies (HTTP, HTML)</li>
<li>Some UNIX/DOS command-line knowledge</li>
</ul>
<h2>Getting Started</h2>
<p>
At present, you need to have a FreeBSD, Linux, OS X, or Windows machine to run Go.
We will use <code>$</code> to represent the command prompt.
</p>
<p>
Install Go (see the <a href="/doc/install">Installation Instructions</a>).
</p>
<p>
Make a new directory for this tutorial inside your <code>GOPATH</code> and cd to it:
</p>
<pre>
$ mkdir gowiki
$ cd gowiki
</pre>
<p>
Create a file named <code>wiki.go</code>, open it in your favorite editor, and
add the following lines:
</p>
<pre>
package main
import (
"fmt"
"io/ioutil"
)
</pre>
<p>
We import the <code>fmt</code> and <code>ioutil</code> packages from the Go
standard library. Later, as we implement additional functionality, we will
add more packages to this <code>import</code> declaration.
</p>
<h2>Data Structures</h2>
<p>
Let's start by defining the data structures. A wiki consists of a series of
interconnected pages, each of which has a title and a body (the page content).
Here, we define <code>Page</code> as a struct with two fields representing
the title and body.
</p>
{{code "doc/articles/wiki/part1.go" `/^type Page/` `/}/`}}
<p>
The type <code>[]byte</code> means "a <code>byte</code> slice".
(See <a href="/doc/articles/slices_usage_and_internals.html">Slices: usage and
internals</a> for more on slices.)
The <code>Body</code> element is a <code>[]byte</code> rather than
<code>string</code> because that is the type expected by the <code>io</code>
libraries we will use, as you'll see below.
</p>
<p>
The <code>Page</code> struct describes how page data will be stored in memory.
But what about persistent storage? We can address that by creating a
<code>save</code> method on <code>Page</code>:
</p>
{{code "doc/articles/wiki/part1.go" `/^func.*Page.*save/` `/}/`}}
<p>
This method's signature reads: "This is a method named <code>save</code> that
takes as its receiver <code>p</code>, a pointer to <code>Page</code> . It takes
no parameters, and returns a value of type <code>error</code>."
</p>
<p>
This method will save the <code>Page</code>'s <code>Body</code> to a text
file. For simplicity, we will use the <code>Title</code> as the file name.
</p>
<p>
The <code>save</code> method returns an <code>error</code> value because
that is the return type of <code>WriteFile</code> (a standard library function
that writes a byte slice to a file). The <code>save</code> method returns the
error value, to let the application handle it should anything go wrong while
writing the file. If all goes well, <code>Page.save()</code> will return
<code>nil</code> (the zero-value for pointers, interfaces, and some other
types).
</p>
<p>
The octal integer literal <code>0600</code>, passed as the third parameter to
<code>WriteFile</code>, indicates that the file should be created with
read-write permissions for the current user only. (See the Unix man page
<code>open(2)</code> for details.)
</p>
<p>
In addition to saving pages, we will want to load pages, too:
</p>
{{code "doc/articles/wiki/part1-noerror.go" `/^func loadPage/` `/^}/`}}
<p>
The function <code>loadPage</code> constructs the file name from the title
parameter, reads the file's contents into a new variable <code>body</code>, and
returns a pointer to a <code>Page</code> literal constructed with the proper
title and body values.
</p>
<p>
Functions can return multiple values. The standard library function
<code>io.ReadFile</code> returns <code>[]byte</code> and <code>error</code>.
In <code>loadPage</code>, error isn't being handled yet; the "blank identifier"
represented by the underscore (<code>_</code>) symbol is used to throw away the
error return value (in essence, assigning the value to nothing).
</p>
<p>
But what happens if <code>ReadFile</code> encounters an error? For example,
the file might not exist. We should not ignore such errors. Let's modify the
function to return <code>*Page</code> and <code>error</code>.
</p>
{{code "doc/articles/wiki/part1.go" `/^func loadPage/` `/^}/`}}
<p>
Callers of this function can now check the second parameter; if it is
<code>nil</code> then it has successfully loaded a Page. If not, it will be an
<code>error</code> that can be handled by the caller (see the
<a href="/ref/spec#Errors">language specification</a> for details).
</p>
<p>
At this point we have a simple data structure and the ability to save to and
load from a file. Let's write a <code>main</code> function to test what we've
written:
</p>
{{code "doc/articles/wiki/part1.go" `/^func main/` `/^}/`}}
<p>
After compiling and executing this code, a file named <code>TestPage.txt</code>
would be created, containing the contents of <code>p1</code>. The file would
then be read into the struct <code>p2</code>, and its <code>Body</code> element
printed to the screen.
</p>
<p>
You can compile and run the program like this:
</p>
<pre>
$ go build wiki.go
$ ./wiki
This is a sample Page.
</pre>
<p>
(If you're using Windows you must type "<code>wiki</code>" without the
"<code>./</code>" to run the program.)
</p>
<p>
<a href="part1.go">Click here to view the code we've written so far.</a>
</p>
<h2>Introducing the <code>net/http</code> package (an interlude)</h2>
<p>
Here's a full working example of a simple web server:
</p>
{{code "doc/articles/wiki/http-sample.go"}}
<p>
The <code>main</code> function begins with a call to
<code>http.HandleFunc</code>, which tells the <code>http</code> package to
handle all requests to the web root (<code>"/"</code>) with
<code>handler</code>.
</p>
<p>
It then calls <code>http.ListenAndServe</code>, specifying that it should
listen on port 8080 on any interface (<code>":8080"</code>). (Don't
worry about its second parameter, <code>nil</code>, for now.)
This function will block until the program is terminated.
</p>
<p>
<code>ListenAndServe</code> always returns an error, since it only returns when an
unexpected error occurs.
In order to log that error we wrap the function call with <code>log.Fatal</code>.
</p>
<p>
The function <code>handler</code> is of the type <code>http.HandlerFunc</code>.
It takes an <code>http.ResponseWriter</code> and an <code>http.Request</code> as
its arguments.
</p>
<p>
An <code>http.ResponseWriter</code> value assembles the HTTP server's response; by writing
to it, we send data to the HTTP client.
</p>
<p>
An <code>http.Request</code> is a data structure that represents the client
HTTP request. <code>r.URL.Path</code> is the path component
of the request URL. The trailing <code>[1:]</code> means
"create a sub-slice of <code>Path</code> from the 1st character to the end."
This drops the leading "/" from the path name.
</p>
<p>
If you run this program and access the URL:
</p>
<pre>http://localhost:8080/monkeys</pre>
<p>
the program would present a page containing:
</p>
<pre>Hi there, I love monkeys!</pre>
<h2>Using <code>net/http</code> to serve wiki pages</h2>
<p>
To use the <code>net/http</code> package, it must be imported:
</p>
<pre>
import (
"fmt"
"io/ioutil"
"log"
<b>"net/http"</b>
)
</pre>
<p>
Let's create a handler, <code>viewHandler</code> that will allow users to
view a wiki page. It will handle URLs prefixed with "/view/".
</p>
{{code "doc/articles/wiki/part2.go" `/^func viewHandler/` `/^}/`}}
<p>
Again, note the use of <code>_</code> to ignore the <code>error</code>
return value from <code>loadPage</code>. This is done here for simplicity
and generally considered bad practice. We will attend to this later.
</p>
<p>
First, this function extracts the page title from <code>r.URL.Path</code>,
the path component of the request URL.
The <code>Path</code> is re-sliced with <code>[len("/view/"):]</code> to drop
the leading <code>"/view/"</code> component of the request path.
This is because the path will invariably begin with <code>"/view/"</code>,
which is not part of the page's title.
</p>
<p>
The function then loads the page data, formats the page with a string of simple
HTML, and writes it to <code>w</code>, the <code>http.ResponseWriter</code>.
</p>
<p>
To use this handler, we rewrite our <code>main</code> function to
initialize <code>http</code> using the <code>viewHandler</code> to handle
any requests under the path <code>/view/</code>.
</p>
{{code "doc/articles/wiki/part2.go" `/^func main/` `/^}/`}}
<p>
<a href="part2.go">Click here to view the code we've written so far.</a>
</p>
<p>
Let's create some page data (as <code>test.txt</code>), compile our code, and
try serving a wiki page.
</p>
<p>
Open <code>test.txt</code> file in your editor, and save the string "Hello world" (without quotes)
in it.
</p>
<pre>
$ go build wiki.go
$ ./wiki
</pre>
<p>
(If you're using Windows you must type "<code>wiki</code>" without the
"<code>./</code>" to run the program.)
</p>
<p>
With this web server running, a visit to <code><a
href="http://localhost:8080/view/test">http://localhost:8080/view/test</a></code>
should show a page titled "test" containing the words "Hello world".
</p>
<h2>Editing Pages</h2>
<p>
A wiki is not a wiki without the ability to edit pages. Let's create two new
handlers: one named <code>editHandler</code> to display an 'edit page' form,
and the other named <code>saveHandler</code> to save the data entered via the
form.
</p>
<p>
First, we add them to <code>main()</code>:
</p>
{{code "doc/articles/wiki/final-noclosure.go" `/^func main/` `/^}/`}}
<p>
The function <code>editHandler</code> loads the page
(or, if it doesn't exist, create an empty <code>Page</code> struct),
and displays an HTML form.
</p>
{{code "doc/articles/wiki/notemplate.go" `/^func editHandler/` `/^}/`}}
<p>
This function will work fine, but all that hard-coded HTML is ugly.
Of course, there is a better way.
</p>
<h2>The <code>html/template</code> package</h2>
<p>
The <code>html/template</code> package is part of the Go standard library.
We can use <code>html/template</code> to keep the HTML in a separate file,
allowing us to change the layout of our edit page without modifying the
underlying Go code.
</p>
<p>
First, we must add <code>html/template</code> to the list of imports. We
also won't be using <code>fmt</code> anymore, so we have to remove that.
</p>
<pre>
import (
<b>"html/template"</b>
"io/ioutil"
"net/http"
)
</pre>
<p>
Let's create a template file containing the HTML form.
Open a new file named <code>edit.html</code>, and add the following lines:
</p>
{{code "doc/articles/wiki/edit.html"}}
<p>
Modify <code>editHandler</code> to use the template, instead of the hard-coded
HTML:
</p>
{{code "doc/articles/wiki/final-noerror.go" `/^func editHandler/` `/^}/`}}
<p>
The function <code>template.ParseFiles</code> will read the contents of
<code>edit.html</code> and return a <code>*template.Template</code>.
</p>
<p>
The method <code>t.Execute</code> executes the template, writing the
generated HTML to the <code>http.ResponseWriter</code>.
The <code>.Title</code> and <code>.Body</code> dotted identifiers refer to
<code>p.Title</code> and <code>p.Body</code>.
</p>
<p>
Template directives are enclosed in double curly braces.
The <code>printf "%s" .Body</code> instruction is a function call
that outputs <code>.Body</code> as a string instead of a stream of bytes,
the same as a call to <code>fmt.Printf</code>.
The <code>html/template</code> package helps guarantee that only safe and
correct-looking HTML is generated by template actions. For instance, it
automatically escapes any greater than sign (<code>&gt;</code>), replacing it
with <code>&amp;gt;</code>, to make sure user data does not corrupt the form
HTML.
</p>
<p>
Since we're working with templates now, let's create a template for our
<code>viewHandler</code> called <code>view.html</code>:
</p>
{{code "doc/articles/wiki/view.html"}}
<p>
Modify <code>viewHandler</code> accordingly:
</p>
{{code "doc/articles/wiki/final-noerror.go" `/^func viewHandler/` `/^}/`}}
<p>
Notice that we've used almost exactly the same templating code in both
handlers. Let's remove this duplication by moving the templating code
to its own function:
</p>
{{code "doc/articles/wiki/final-template.go" `/^func renderTemplate/` `/^}/`}}
<p>
And modify the handlers to use that function:
</p>
{{code "doc/articles/wiki/final-template.go" `/^func viewHandler/` `/^}/`}}
{{code "doc/articles/wiki/final-template.go" `/^func editHandler/` `/^}/`}}
<p>
If we comment out the registration of our unimplemented save handler in
<code>main</code>, we can once again build and test our program.
<a href="part3.go">Click here to view the code we've written so far.</a>
</p>
<h2>Handling non-existent pages</h2>
<p>
What if you visit <a href="http://localhost:8080/view/APageThatDoesntExist">
<code>/view/APageThatDoesntExist</code></a>? You'll see a page containing
HTML. This is because it ignores the error return value from
<code>loadPage</code> and continues to try and fill out the template
with no data. Instead, if the requested Page doesn't exist, it should
redirect the client to the edit Page so the content may be created:
</p>
{{code "doc/articles/wiki/part3-errorhandling.go" `/^func viewHandler/` `/^}/`}}
<p>
The <code>http.Redirect</code> function adds an HTTP status code of
<code>http.StatusFound</code> (302) and a <code>Location</code>
header to the HTTP response.
</p>
<h2>Saving Pages</h2>
<p>
The function <code>saveHandler</code> will handle the submission of forms
located on the edit pages. After uncommenting the related line in
<code>main</code>, let's implement the handler:
</p>
{{code "doc/articles/wiki/final-template.go" `/^func saveHandler/` `/^}/`}}
<p>
The page title (provided in the URL) and the form's only field,
<code>Body</code>, are stored in a new <code>Page</code>.
The <code>save()</code> method is then called to write the data to a file,
and the client is redirected to the <code>/view/</code> page.
</p>
<p>
The value returned by <code>FormValue</code> is of type <code>string</code>.
We must convert that value to <code>[]byte</code> before it will fit into
the <code>Page</code> struct. We use <code>[]byte(body)</code> to perform
the conversion.
</p>
<h2>Error handling</h2>
<p>
There are several places in our program where errors are being ignored. This
is bad practice, not least because when an error does occur the program will
have unintended behavior. A better solution is to handle the errors and return
an error message to the user. That way if something does go wrong, the server
will function exactly how we want and the user can be notified.
</p>
<p>
First, let's handle the errors in <code>renderTemplate</code>:
</p>
{{code "doc/articles/wiki/final-parsetemplate.go" `/^func renderTemplate/` `/^}/`}}
<p>
The <code>http.Error</code> function sends a specified HTTP response code
(in this case "Internal Server Error") and error message.
Already the decision to put this in a separate function is paying off.
</p>
<p>
Now let's fix up <code>saveHandler</code>:
</p>
{{code "doc/articles/wiki/part3-errorhandling.go" `/^func saveHandler/` `/^}/`}}
<p>
Any errors that occur during <code>p.save()</code> will be reported
to the user.
</p>
<h2>Template caching</h2>
<p>
There is an inefficiency in this code: <code>renderTemplate</code> calls
<code>ParseFiles</code> every time a page is rendered.
A better approach would be to call <code>ParseFiles</code> once at program
initialization, parsing all templates into a single <code>*Template</code>.
Then we can use the
<a href="/pkg/html/template/#Template.ExecuteTemplate"><code>ExecuteTemplate</code></a>
method to render a specific template.
</p>
<p>
First we create a global variable named <code>templates</code>, and initialize
it with <code>ParseFiles</code>.
</p>
{{code "doc/articles/wiki/final.go" `/var templates/`}}
<p>
The function <code>template.Must</code> is a convenience wrapper that panics
when passed a non-nil <code>error</code> value, and otherwise returns the
<code>*Template</code> unaltered. A panic is appropriate here; if the templates
can't be loaded the only sensible thing to do is exit the program.
</p>
<p>
The <code>ParseFiles</code> function takes any number of string arguments that
identify our template files, and parses those files into templates that are
named after the base file name. If we were to add more templates to our
program, we would add their names to the <code>ParseFiles</code> call's
arguments.
</p>
<p>
We then modify the <code>renderTemplate</code> function to call the
<code>templates.ExecuteTemplate</code> method with the name of the appropriate
template:
</p>
{{code "doc/articles/wiki/final.go" `/func renderTemplate/` `/^}/`}}
<p>
Note that the template name is the template file name, so we must
append <code>".html"</code> to the <code>tmpl</code> argument.
</p>
<h2>Validation</h2>
<p>
As you may have observed, this program has a serious security flaw: a user
can supply an arbitrary path to be read/written on the server. To mitigate
this, we can write a function to validate the title with a regular expression.
</p>
<p>
First, add <code>"regexp"</code> to the <code>import</code> list.
Then we can create a global variable to store our validation
expression:
</p>
{{code "doc/articles/wiki/final-noclosure.go" `/^var validPath/`}}
<p>
The function <code>regexp.MustCompile</code> will parse and compile the
regular expression, and return a <code>regexp.Regexp</code>.
<code>MustCompile</code> is distinct from <code>Compile</code> in that it will
panic if the expression compilation fails, while <code>Compile</code> returns
an <code>error</code> as a second parameter.
</p>
<p>
Now, let's write a function that uses the <code>validPath</code>
expression to validate path and extract the page title:
</p>
{{code "doc/articles/wiki/final-noclosure.go" `/func getTitle/` `/^}/`}}
<p>
If the title is valid, it will be returned along with a <code>nil</code>
error value. If the title is invalid, the function will write a
"404 Not Found" error to the HTTP connection, and return an error to the
handler. To create a new error, we have to import the <code>errors</code>
package.
</p>
<p>
Let's put a call to <code>getTitle</code> in each of the handlers:
</p>
{{code "doc/articles/wiki/final-noclosure.go" `/^func viewHandler/` `/^}/`}}
{{code "doc/articles/wiki/final-noclosure.go" `/^func editHandler/` `/^}/`}}
{{code "doc/articles/wiki/final-noclosure.go" `/^func saveHandler/` `/^}/`}}
<h2>Introducing Function Literals and Closures</h2>
<p>
Catching the error condition in each handler introduces a lot of repeated code.
What if we could wrap each of the handlers in a function that does this
validation and error checking? Go's
<a href="/ref/spec#Function_literals">function
literals</a> provide a powerful means of abstracting functionality
that can help us here.
</p>
<p>
First, we re-write the function definition of each of the handlers to accept
a title string:
</p>
<pre>
func viewHandler(w http.ResponseWriter, r *http.Request, title string)
func editHandler(w http.ResponseWriter, r *http.Request, title string)
func saveHandler(w http.ResponseWriter, r *http.Request, title string)
</pre>
<p>
Now let's define a wrapper function that <i>takes a function of the above
type</i>, and returns a function of type <code>http.HandlerFunc</code>
(suitable to be passed to the function <code>http.HandleFunc</code>):
</p>
<pre>
func makeHandler(fn func (http.ResponseWriter, *http.Request, string)) http.HandlerFunc {
return func(w http.ResponseWriter, r *http.Request) {
// Here we will extract the page title from the Request,
// and call the provided handler 'fn'
}
}
</pre>
<p>
The returned function is called a closure because it encloses values defined
outside of it. In this case, the variable <code>fn</code> (the single argument
to <code>makeHandler</code>) is enclosed by the closure. The variable
<code>fn</code> will be one of our save, edit, or view handlers.
</p>
<p>
Now we can take the code from <code>getTitle</code> and use it here
(with some minor modifications):
</p>
{{code "doc/articles/wiki/final.go" `/func makeHandler/` `/^}/`}}
<p>
The closure returned by <code>makeHandler</code> is a function that takes
an <code>http.ResponseWriter</code> and <code>http.Request</code> (in other
words, an <code>http.HandlerFunc</code>).
The closure extracts the <code>title</code> from the request path, and
validates it with the <code>validPath</code> regexp. If the
<code>title</code> is invalid, an error will be written to the
<code>ResponseWriter</code> using the <code>http.NotFound</code> function.
If the <code>title</code> is valid, the enclosed handler function
<code>fn</code> will be called with the <code>ResponseWriter</code>,
<code>Request</code>, and <code>title</code> as arguments.
</p>
<p>
Now we can wrap the handler functions with <code>makeHandler</code> in
<code>main</code>, before they are registered with the <code>http</code>
package:
</p>
{{code "doc/articles/wiki/final.go" `/func main/` `/^}/`}}
<p>
Finally we remove the calls to <code>getTitle</code> from the handler functions,
making them much simpler:
</p>
{{code "doc/articles/wiki/final.go" `/^func viewHandler/` `/^}/`}}
{{code "doc/articles/wiki/final.go" `/^func editHandler/` `/^}/`}}
{{code "doc/articles/wiki/final.go" `/^func saveHandler/` `/^}/`}}
<h2>Try it out!</h2>
<p>
<a href="final.go">Click here to view the final code listing.</a>
</p>
<p>
Recompile the code, and run the app:
</p>
<pre>
$ go build wiki.go
$ ./wiki
</pre>
<p>
Visiting <a href="http://localhost:8080/view/ANewPage">http://localhost:8080/view/ANewPage</a>
should present you with the page edit form. You should then be able to
enter some text, click 'Save', and be redirected to the newly created page.
</p>
<h2>Other tasks</h2>
<p>
Here are some simple tasks you might want to tackle on your own:
</p>
<ul>
<li>Store templates in <code>tmpl/</code> and page data in <code>data/</code>.
<li>Add a handler to make the web root redirect to
<code>/view/FrontPage</code>.</li>
<li>Spruce up the page templates by making them valid HTML and adding some
CSS rules.</li>
<li>Implement inter-page linking by converting instances of
<code>[PageName]</code> to <br>
<code>&lt;a href="/view/PageName"&gt;PageName&lt;/a&gt;</code>.
(hint: you could use <code>regexp.ReplaceAllFunc</code> to do this)
</li>
</ul>