Go Tutorial: Object Orientation and Go's Special Data Types


Using bufio's reader and writer as we have done here means that we can work with convenient high-level string values, completely insulated from the raw bytes that represent the text on disk. And, of course, thanks to our deferred anonymous function, we know that any buffered bytes are written to the writer when the americanise() function returns, providing that no error has occurred.
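The deferred-flush pattern described above can be sketched in isolation as follows. This is only an illustration under stated assumptions: copyUpper() and upperString() are hypothetical names, and upper-casing stands in for the real word replacement. The point is the named err result, which lets the deferred anonymous function flush the writer only when no earlier error has occurred.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// copyUpper is a hypothetical stand-in for americanise(): it copies in to
// out line by line, upper-casing the text on the way through.
func copyUpper(in io.Reader, out io.Writer) (err error) {
	reader := bufio.NewReader(in)
	writer := bufio.NewWriter(out)
	// The deferred function runs when copyUpper returns. It flushes the
	// buffered bytes only if no earlier error has been recorded in err.
	defer func() {
		if err == nil {
			err = writer.Flush()
		}
	}()

	for {
		line, readErr := reader.ReadString('\n') // line may lack '\n' at EOF
		if _, err = writer.WriteString(strings.ToUpper(line)); err != nil {
			return err
		}
		if readErr == io.EOF {
			return nil // end of input: the deferred flush now runs
		}
		if readErr != nil {
			return readErr
		}
	}
}

// upperString is a hypothetical convenience wrapper used by the demo.
func upperString(s string) (string, error) {
	var out strings.Builder
	err := copyUpper(strings.NewReader(s), &out)
	return out.String(), err
}

func main() {
	result, err := upperString("colour\ngrey")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%q\n", result) // "COLOUR\nGREY"
}
```

Without the deferred flush, the output here would be empty, because the short input never fills the writer's buffer.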

func makeReplacerFunction(file string) (func(string) string, error) {
    rawBytes, err := ioutil.ReadFile(file)
    if err != nil {
        return nil, err
    }
    text := string(rawBytes)

    usForBritish := make(map[string]string)
    lines := strings.Split(text, "\n")
    for _, line := range lines {
        fields := strings.Fields(line)
        if len(fields) == 2 {
            usForBritish[fields[0]] = fields[1]
        }
    }

    return func(word string) string {
        if usWord, found := usForBritish[word]; found {
            return usWord
        }
        return word
    }, nil
}

The makeReplacerFunction() function takes the name of a file containing original and replacement strings and returns a function that, given an original string, returns its replacement, along with an error value. It expects the file to be a UTF-8 encoded text file with one whitespace-separated original and replacement word per line. In addition to the bufio package's readers and writers, Go's io/ioutil package provides some high-level convenience functions, including the ioutil.ReadFile() function used here. (Since Go 1.16, os.ReadFile() is the preferred equivalent.) This function reads and returns the entire file's contents as raw bytes (in a []byte) and an error. As usual, if the error is not nil, we immediately return it to the caller, along with a nil replacer function. If we read the bytes okay, we convert them to a string using a Go conversion of form type(variable). Converting UTF-8 bytes to a string needs no re-encoding (the bytes are simply copied), because Go's strings use the UTF-8 encoding internally.
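As a minimal sketch of the read-then-convert step on its own, the program below writes a small temporary file and reads it back. The helper names readText() and roundTrip() and the sample content are hypothetical. The sketch calls os.ReadFile(), which ioutil.ReadFile() simply forwards to from Go 1.16 onward, so it assumes a Go 1.16 or later toolchain.

```go
package main

import (
	"fmt"
	"os"
)

// readText is a hypothetical helper: it reads a whole file and returns its
// contents as a string, or the error if the read fails.
func readText(file string) (string, error) {
	rawBytes, err := os.ReadFile(file) // ioutil.ReadFile calls this since Go 1.16
	if err != nil {
		return "", err
	}
	return string(rawBytes), nil // a conversion of the form type(variable)
}

// roundTrip is a hypothetical helper for the demo: it writes content to a
// temporary file, reads it back with readText, and removes the file.
func roundTrip(content string) (string, error) {
	f, err := os.CreateTemp("", "words-*.txt")
	if err != nil {
		return "", err
	}
	defer os.Remove(f.Name())
	if _, err := f.WriteString(content); err != nil {
		f.Close()
		return "", err
	}
	if err := f.Close(); err != nil {
		return "", err
	}
	return readText(f.Name())
}

func main() {
	text, err := roundTrip("colour color\ncentre center\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%q\n", text) // "colour color\ncentre center\n"

	// A missing file produces a non-nil error and no text.
	if _, err := readText("no-such-file.txt"); err != nil {
		fmt.Println("as expected:", err)
	}
}
```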

The replacer function we want to create must accept a string and return a corresponding string, so what we need is a function that uses some kind of lookup table. Go's built-in map collection data type is ideal for this purpose. A map holds key–value pairs with very fast lookup by key. So here we will store British words as keys and their U.S. counterparts as values.

Go's map, slice, and channel types are created using the built-in make() function. This creates a value of the specified type and returns a reference to it. The reference can be passed around (e.g., to other functions) and any changes made to the referred-to value are visible to all the code that accesses it. Here we have created an empty map called usForBritish, with string keys and string values.
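The reference behavior of a map created with make() can be seen in a short sketch. The addEntry() helper and the sample words are hypothetical, chosen only to show that a change made through one reference is visible through every other.

```go
package main

import "fmt"

// addEntry is a hypothetical helper that inserts into the map it is given.
// No pointer is needed: the caller sees the new entry.
func addEntry(m map[string]string, british, us string) {
	m[british] = us
}

func main() {
	usForBritish := make(map[string]string) // empty map, string keys and values
	addEntry(usForBritish, "colour", "color")
	addEntry(usForBritish, "centre", "center")

	alias := usForBritish // copies the reference, not the entries
	alias["grey"] = "gray"

	fmt.Println(len(usForBritish))    // 3
	fmt.Println(usForBritish["grey"]) // gray
}
```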

With the map in place, we then split the file's text (which is in the form of a single long string) into lines, using the strings.Split() function. This function takes a string to split and a separator string to split on and does as many splits as possible. (If we want to limit the number of splits, we can use the strings.SplitN() function.) The iteration over the lines uses a for loop syntax that we haven't seen before, this time using a range clause. This form can be conveniently used to iterate over a map's keys and values, over a communication channel's elements, or — as here — over a slice's (or array's) elements. When used on a slice (or array), the slice index and the element at that index are returned on each iteration, starting at index 0 (if the slice is nonempty). In this example, we use the loop to iterate over all the lines, but since we don't care about the index of each line, we assign it to the blank identifier (_), which discards it.
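A small sketch of strings.Split(), strings.SplitN(), and the range clause follows. The nonEmptyLines() helper and the sample text are hypothetical. Note that text ending in a newline produces a final empty element, which is harmless in makeReplacerFunction() because an empty line has no fields.

```go
package main

import (
	"fmt"
	"strings"
)

// nonEmptyLines is a hypothetical helper: it splits text on newlines and
// keeps only the lines that are not empty.
func nonEmptyLines(text string) []string {
	var result []string
	for _, line := range strings.Split(text, "\n") { // index discarded with _
		if line != "" {
			result = append(result, line)
		}
	}
	return result
}

func main() {
	text := "colour color\ncentre center\n"

	lines := strings.Split(text, "\n")
	fmt.Println(len(lines)) // 3: the trailing newline yields a final ""

	// range over a slice gives the index and the element at that index.
	for i, line := range lines {
		fmt.Printf("%d: %q\n", i, line)
	}

	// SplitN limits the number of pieces; the last piece is the remainder.
	fmt.Printf("%q\n", strings.SplitN("a,b,c,d", ",", 2)) // ["a" "b,c,d"]

	fmt.Println(len(nonEmptyLines(text))) // 2
}
```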

We need to split each line into two: the original string and the replacement string. We could use the strings.Split() function, but that would require us to specify an exact separator string, say, " " (a single space), which might fail on a hand-edited file where sometimes users accidentally put in more than one space or sometimes use tabs. Fortunately, Go provides the strings.Fields() function, which splits the string it is given on whitespace and is therefore much more forgiving of human-edited text. If the fields variable (of type []string) has exactly two elements, we insert the corresponding key–value pair into the map. Once the map is populated, we are ready to create the replacer function that we will return to the caller.
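To see why strings.Fields() is the more forgiving choice, the sketch below runs both functions on a line that mixes spaces and a tab. The parsePair() helper is hypothetical; it applies the same "exactly two fields" rule that makeReplacerFunction() uses.

```go
package main

import (
	"fmt"
	"strings"
)

// parsePair is a hypothetical helper: it reports ok only when the line
// holds exactly two whitespace-separated fields.
func parsePair(line string) (british, us string, ok bool) {
	fields := strings.Fields(line)
	if len(fields) != 2 {
		return "", "", false
	}
	return fields[0], fields[1], true
}

func main() {
	line := "colour \t  color" // spaces and a tab, as in a hand-edited file

	// Splitting on a single space leaves stray pieces behind.
	fmt.Printf("%q\n", strings.Split(line, " "))

	// Fields treats any run of whitespace as one separator.
	fmt.Printf("%q\n", strings.Fields(line)) // ["colour" "color"]

	if british, us, ok := parsePair(line); ok {
		fmt.Println(british, "->", us) // colour -> color
	}
}
```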

We create the replacer function as an anonymous function given as an argument to the return statement, along with a nil error value. (Of course, we could have been less succinct and assigned the anonymous function to a variable and returned the variable.) The function has the exact signature required by the regexp.Regexp.ReplaceAllStringFunc() method that it will be passed to.

Inside the anonymous replacer function, all we do is look up the given word. If we access a map element with one variable on the left-hand side, that variable is set to the corresponding value, or to the value type's zero value if the given key isn't in the map. But if the zero value is itself a legitimate value, how can we tell whether a given key is in the map? Go provides a syntax for this case, which is also useful when we simply want to know whether a particular key is present: put two variables on the left-hand side, the first to accept the value and the second to accept a bool indicating whether the key was found. In this example, we use this second form inside an if statement that has a simple statement (a short variable declaration) and a condition (the found Boolean). So we retrieve the usWord (which will be an empty string if the given word isn't a key in the map) and a found flag of type bool. If the British word was found, we return the U.S. equivalent; otherwise, we simply return the original word unchanged.
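The two lookup forms, and the way a replacer of this shape plugs into ReplaceAllStringFunc(), can be sketched as follows. The replaceWords() helper, the stock map, and the word pattern "[A-Za-z]+" are assumptions made for illustration, not necessarily what the full example uses.

```go
package main

import (
	"fmt"
	"regexp"
)

// replaceWords is a hypothetical helper: it replaces every word in line
// that is a key in usForBritish with the corresponding value.
func replaceWords(line string, usForBritish map[string]string) string {
	wordRx := regexp.MustCompile("[A-Za-z]+") // assumed definition of a word
	return wordRx.ReplaceAllStringFunc(line, func(word string) string {
		if usWord, found := usForBritish[word]; found {
			return usWord
		}
		return word
	})
}

func main() {
	// With an int value type, 0 is both a legitimate value and the zero
	// value, so the one-variable form cannot distinguish the two cases.
	stock := map[string]int{"apples": 0}
	fmt.Println(stock["apples"], stock["pears"]) // 0 0

	_, found := stock["apples"]
	fmt.Println(found) // true
	_, found = stock["pears"]
	fmt.Println(found) // false

	words := map[string]string{"colour": "color", "grey": "gray"}
	fmt.Println(replaceWords("the colour grey", words)) // the color gray
}
```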

There is a subtlety in the makeReplacerFunction() function that may not be immediately apparent. In the anonymous function created inside it, we access the usForBritish map, yet this map was created outside the anonymous function. This works because Go supports closures. A closure is a function that "captures" some external state — for example, the state of the function it is created inside, or at least any part of that state that the closure accesses. So here, the anonymous function that is created inside the makeReplacerFunction() is a closure that has captured the usForBritish map.
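Closure capture is perhaps easiest to see with a counter. In this sketch, the hypothetical makeCounter() plays the role of makeReplacerFunction(), and the captured count variable plays the role of the usForBritish map: each returned function keeps its own captured state alive between calls.

```go
package main

import "fmt"

// makeCounter is a hypothetical example: the returned anonymous function
// is a closure that captures the local variable count.
func makeCounter() func() int {
	count := 0
	return func() int {
		count++ // modifies the captured variable, not a copy
		return count
	}
}

func main() {
	next := makeCounter()
	fmt.Println(next(), next(), next()) // 1 2 3

	other := makeCounter() // a separate call captures a separate count
	fmt.Println(other())   // 1
}
```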

Another subtlety is that the usForBritish map is a local variable and yet we will be accessing it outside the function in which it is declared. It is perfectly fine to return local variables in Go. Even if they are references or pointers, Go won't delete them while they are in use and will garbage-collect them when they are finished with (when every variable that holds, refers, or points to them has gone out of scope).
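The same guarantee holds for pointers to local variables, as this minimal sketch suggests. The wordPair type and newPair() function are hypothetical; the point is only that the value outlives the function that created it for as long as something still refers to it.

```go
package main

import "fmt"

// wordPair is a hypothetical type used only for this illustration.
type wordPair struct {
	british, us string
}

// newPair returns a pointer to a local variable. This is safe in Go: the
// value stays alive for as long as the returned pointer is in use.
func newPair(british, us string) *wordPair {
	pair := wordPair{british, us}
	return &pair
}

func main() {
	p := newPair("colour", "color")
	q := newPair("grey", "gray") // a distinct value, not a reused slot
	fmt.Println(p.british, p.us) // colour color
	fmt.Println(q.british, q.us) // grey gray
}
```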

This article has shown some basic low-level and high-level file-handling functionality using os.Open(), os.Create(), and ioutil.ReadFile(). Go's built-in collection types (slices and maps) largely obviate the need for custom collection types while providing extremely good performance and great convenience. Go's treatment of functions as first-class values in their own right and its support for closures make it possible to use some advanced and very useful programming idioms. And Go's defer statement makes it straightforward to avoid resource leakage.


The first article in this series is Getting Going with Go

This article is adapted from the author's recent book, Programming in Go. The source code from the book should be consulted for the full source code of these examples. The main example is listed under "americanise."


Mark Summerfield is an independent trainer, consultant, and writer specializing in Go, Python, C++, and Qt.


