Quick pop quiz – what does the following code output?
vals := make([]int, 5)
for i := 0; i < 5; i++ {
	vals = append(vals, i)
}
fmt.Println(vals)
If you guessed [0 0 0 0 0 0 1 2 3 4] you are correct.
Wait, what? Why isn’t it [0 1 2 3 4]?
Don’t worry if you got the pop quiz wrong. This is a fairly common mistake when transitioning to Go, and in this post we are going to cover both why the output isn’t what you expected and how to use the nuances of Go to make your code more efficient.
Slices vs Arrays
In Go there are both arrays and slices. This can be confusing at first, but once you get used to it you will love it. Trust me.
There are many differences between slices and arrays, but the primary one we want to focus on in this article is that the size of an array is part of its type, whereas slices can have a dynamic size because they are wrappers around arrays.
What does this mean in practice? Well, let’s say we have the array var a [10]int. This array has a fixed size that can’t be changed. If we were to call len(a) it would always return 10, because that size is part of the type. As a result, if you suddenly need more than 10 items in your array you have to create a new object with an entirely different type, such as var b [11]int, and then copy all of your values from a over to b.
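To make the cost of a fixed size concrete, here is a minimal sketch of "growing" an array by hand. The grow function is a hypothetical helper written for this illustration, not something from the standard library; notice that both sizes have to appear in its signature because they are part of the types.

```go
package main

import "fmt"

// grow is a hypothetical helper: it copies a [10]int into a new [11]int.
// The sizes are part of the types, so they have to be spelled out here.
func grow(a [10]int) [11]int {
	var b [11]int
	for i, v := range a {
		b[i] = v
	}
	return b
}

func main() {
	var a [10]int
	for i := range a {
		a[i] = i
	}

	b := grow(a)
	b[10] = 123 // the extra slot only exists in the new array

	fmt.Println(len(a), len(b)) // 10 11
	fmt.Println(b)              // [0 1 2 3 4 5 6 7 8 9 123]
}
```

A function like this only works for one pair of sizes, which is a big part of why arrays are awkward for data that grows.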
While having arrays with set sizes is valuable in specific cases, generally speaking this isn’t what developers want. Instead, they want to work with something similar to an array in Go, but with the ability to grow over time. One crude way to do this would be to create an array that is much bigger than it needs to be and then to treat a subset of the array as your array. An example of this is shown in the code below.
var vals [20]int
for i := 0; i < 5; i++ {
	vals[i] = i * i
}
subsetLen := 5
fmt.Println("The subset of our array has a length of:", subsetLen)
// Add a new item to our array
vals[subsetLen] = 123
subsetLen++
fmt.Println("The subset of our array has a length of:", subsetLen)
With this code we have an array with a set size of 20, but because we are only using a subset our code can pretend that the length of the array is 5, and then 6 after we add a new item to our array.
This is (very roughly speaking) how slices work. They wrap an array with a set size, much like our array in the previous example has a set size of 20.
They also keep track of the subset of the array that is available for your program to use – this is the length attribute, and it is similar to the subsetLen variable in the previous example.
Finally, a slice also has a capacity, which is similar to the total length of our array (20) in the previous example. This is useful because it tells you how large your subset can grow before it will no longer fit in the array that is backing the slice. When this does happen, a new array will need to be allocated, but all of this logic is hidden behind the append function.
In short, combining slices with the append function gives us a type that is very similar to arrays, but is capable of growing over time to handle more elements.
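The relationship between a slice, its length, and its capacity can be seen by slicing an array directly. The sketch below assumes a hypothetical window helper that exposes the first n elements of a 20-element array. The length is the size of the visible subset, and the capacity is how far that subset can grow before the backing array runs out.

```go
package main

import "fmt"

// window is a hypothetical helper that returns a slice exposing
// the first n elements of the given 20-element array.
func window(arr *[20]int, n int) []int {
	return arr[:n]
}

func main() {
	var backing [20]int
	vals := window(&backing, 5)
	fmt.Println(len(vals), cap(vals)) // 5 20

	// There is still room in the backing array, so append reuses it
	// and writes the new value into backing[5].
	vals = append(vals, 123)
	fmt.Println(len(vals), cap(vals)) // 6 20
	fmt.Println(backing[5])           // 123
}
```

Because the append fits within the existing capacity, no new array is allocated and the write is visible through the original array.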
Let’s look at the previous example again, but this time we will use a slice instead of an array.
var vals []int
for i := 0; i < 5; i++ {
	vals = append(vals, i)
	fmt.Println("The length of our slice is:", len(vals))
	fmt.Println("The capacity of our slice is:", cap(vals))
}
// Add a new item to our slice
vals = append(vals, 123)
fmt.Println("The length of our slice is:", len(vals))
fmt.Println("The capacity of our slice is:", cap(vals))
// Accessing items is the same as an array
fmt.Println(vals[5])
fmt.Println(vals[2])
We can still access elements in our slice just like we would arrays, but by using a slice and the append function we no longer have to think about the size of the backing array. We are still able to figure these things out by using the len and cap functions, but we don’t have to worry too much about them. Neat, right?
Back to the pop quiz
With that in mind, let’s look back at our pop quiz code to see what went wrong.
vals := make([]int, 5)
for i := 0; i < 5; i++ {
	vals = append(vals, i)
}
fmt.Println(vals)
When calling make we are permitted to pass in up to 3 arguments. The first is the type that we are allocating, the second is the length of the slice, and the third is the capacity of the slice (this parameter is optional).
By passing in the arguments make([]int, 5) we are telling our program that we want to create a slice with a length of 5, and the capacity is defaulted to the length provided – 5 in this instance.
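As a quick illustration of how the arguments to make affect the result, the sketch below prints the length and capacity for a few different calls. The lenCap helper is hypothetical and only exists to keep the example compact.

```go
package main

import "fmt"

// lenCap is a hypothetical helper that returns a slice's
// length and capacity as a pair.
func lenCap(s []int) [2]int {
	return [2]int{len(s), cap(s)}
}

func main() {
	a := make([]int, 5)    // length 5, capacity defaults to 5
	b := make([]int, 0, 5) // length 0, capacity 5
	c := make([]int, 2, 5) // length 2, capacity 5

	fmt.Println(lenCap(a), a) // [5 5] [0 0 0 0 0]
	fmt.Println(lenCap(b), b) // [0 5] []
	fmt.Println(lenCap(c), c) // [2 5] [0 0]
}
```

Note that the elements counted by the length already exist and hold the zero value, while the extra capacity is only reserved space.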
While this might seem like what we wanted at first, the important distinction here is that we told our slice that we wanted to set both the length and capacity to 5, and then we proceeded to call the append function which assumes you want to add a new element after the initial 5, so it will increase the capacity and start adding new elements at the end of the slice.
You can actually see the capacity changing if you add a Println() statement to your code.
vals := make([]int, 5)
fmt.Println("Capacity was:", cap(vals))
for i := 0; i < 5; i++ {
	vals = append(vals, i)
	fmt.Println("Capacity is now:", cap(vals))
}
fmt.Println(vals)
As a result, we end up getting the output [0 0 0 0 0 0 1 2 3 4] instead of the desired [0 1 2 3 4].
How do we fix it? Well, there are several ways to do this, so we are going to cover two of them and you can pick whichever makes the most sense in your situation.
Write directly to indexes instead of using append
The first fix is to leave the make call unchanged and explicitly state the index that you want to set each element to. Doing this, we would get the following code:
vals := make([]int, 5)
for i := 0; i < 5; i++ {
	vals[i] = i
}
fmt.Println(vals)
In this case the value we are setting happens to be the same as the index we want to use, but you can also keep track of the index independently.
For example, if you wanted to get the keys of a map you could use the following code.
package main

import "fmt"

func main() {
	fmt.Println(keys(map[string]struct{}{
		"dog": struct{}{},
		"cat": struct{}{},
	}))
}

func keys(m map[string]struct{}) []string {
	ret := make([]string, len(m))
	i := 0
	for key := range m {
		ret[i] = key
		i++
	}
	return ret
}
This works well because we know that the exact length of the slice we return will be the same as the length of the map, so we can initialize our slice with that length and then assign each element to an appropriate index. The downside to this approach is that we have to keep track of i so that we know what index to place every value in.
This leads us to the second approach we are going to cover…
Use 0 as your length and specify your capacity instead
Rather than keeping track of which index we want to add our values to, we can instead update our make call and provide it with two arguments after the slice type. The first, the length of our new slice, will be set to 0, as we haven’t added any new elements to our slice. The second, the capacity of our new slice, will be set to the length of the map parameter because we know that our slice will eventually have that many strings added to it.
This will still construct the same array behind the scenes as the previous example, but now when we call append it will know to place items at the start of our slice because the length of the slice is 0.
package main

import "fmt"

func main() {
	fmt.Println(keys(map[string]struct{}{
		"dog": struct{}{},
		"cat": struct{}{},
	}))
}

func keys(m map[string]struct{}) []string {
	ret := make([]string, 0, len(m))
	for key := range m {
		ret = append(ret, key)
	}
	return ret
}
Why do we bother with capacity at all if append handles it?
The next thing you might be asking is, “Why are we even telling our program a capacity if the append function can handle increasing the capacity of my slice for me?”
The truth is, in most cases you don’t need to worry about this too much. If it makes your code significantly more complicated, just initialize your slice with var vals []int and let the append function handle the heavy lifting for you.
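For instance, when the final size is not known up front, a slice declared with var works fine with append. The sketch below uses a hypothetical evens function as the example; the slice starts out nil and append takes care of allocating and growing the backing array.

```go
package main

import "fmt"

// evens is a hypothetical helper that collects the even numbers below limit.
func evens(limit int) []int {
	var vals []int // nil slice: length 0, capacity 0, no backing array yet
	for i := 0; i < limit; i++ {
		if i%2 == 0 {
			vals = append(vals, i) // append allocates and grows as needed
		}
	}
	return vals
}

func main() {
	fmt.Println(evens(7))        // [0 2 4 6]
	fmt.Println(evens(0) == nil) // true
}
```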
But this case is different. It isn’t an instance where declaring the capacity is difficult; in fact, it is actually quite easy to determine what the final capacity of our slice needs to be because we know it will map directly to the provided map. As a result, we can declare the capacity of our slice when we initialize it and save our program from needing to perform unnecessary memory allocations.
If you want to see what the extra memory allocations look like, run the following code on the Go Playground. Every time the capacity increases, our program needs to perform another memory allocation.
package main

import "fmt"

func main() {
	fmt.Println(keys(map[string]struct{}{
		"dog":       struct{}{},
		"cat":       struct{}{},
		"mouse":     struct{}{},
		"wolf":      struct{}{},
		"alligator": struct{}{},
	}))
}

func keys(m map[string]struct{}) []string {
	var ret []string
	fmt.Println(cap(ret))
	for key := range m {
		ret = append(ret, key)
		fmt.Println(cap(ret))
	}
	return ret
}
Now compare this to the same code but with a predefined capacity.
package main

import "fmt"

func main() {
	fmt.Println(keys(map[string]struct{}{
		"dog":       struct{}{},
		"cat":       struct{}{},
		"mouse":     struct{}{},
		"wolf":      struct{}{},
		"alligator": struct{}{},
	}))
}

func keys(m map[string]struct{}) []string {
	ret := make([]string, 0, len(m))
	fmt.Println(cap(ret))
	for key := range m {
		ret = append(ret, key)
		fmt.Println(cap(ret))
	}
	return ret
}
In the first code sample our capacity starts at 0, and then increases to 1, 2, 4, and then finally 8, meaning we had to allocate a new array 4 different times (once for each increase), and on top of that the final array used to back our slice has a capacity of 8, which is bigger than we ultimately needed. The exact growth pattern is an implementation detail of the Go runtime, so the numbers you see may differ between versions.
On the other hand, our second sample starts and ends with the same capacity (5) and only needs to allocate it once at the start of the keys() function. We also avoid wasting any extra memory and return a slice with the perfect size array backing it.
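The same comparison can be made at a larger scale by counting how often the capacity changes while appending. This is only a rough sketch: countGrowths is a hypothetical helper, and a change in capacity is used as a stand-in for "append moved to a new backing array". The count without preallocation depends on the Go version, so no specific number is shown for it.

```go
package main

import "fmt"

// countGrowths appends n strings to a slice that starts with the given
// capacity and reports how many times the capacity changed along the way.
func countGrowths(n, initialCap int) int {
	ret := make([]string, 0, initialCap)
	growths := 0
	for i := 0; i < n; i++ {
		before := cap(ret)
		ret = append(ret, "x")
		if cap(ret) != before {
			growths++
		}
	}
	return growths
}

func main() {
	// Without preallocation, append has to grow the slice several times.
	fmt.Println("no preallocation:", countGrowths(1000, 0))

	// With the capacity set up front, the slice never needs to grow.
	fmt.Println("preallocated:", countGrowths(1000, 1000)) // preallocated: 0
}
```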
Don’t over-optimize
As I said before, I typically wouldn’t encourage anyone to worry about minor optimizations like this, but in cases where it is really obvious what the final size should be I strongly encourage you to try to set an appropriate capacity or length for your slices.
Not only does it help improve the performance of your application, but it can also help clarify your code a bit by explicitly stating the relationship between the size of your input and the size of your output.
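As a small sketch of that last point, consider a hypothetical lengths function that produces exactly one output value per input value. The make call states that relationship directly, so a reader can tell how large the result will be without tracing the loop.

```go
package main

import "fmt"

// lengths is a hypothetical helper that returns the length of each word.
// The capacity passed to make documents that the output has exactly
// one entry per input word.
func lengths(words []string) []int {
	out := make([]int, 0, len(words))
	for _, w := range words {
		out = append(out, len(w))
	}
	return out
}

func main() {
	fmt.Println(lengths([]string{"dog", "cat", "mouse"})) // [3 3 5]
}
```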
In summary…
This article is not meant to be an exhaustive discussion of the differences between slices and arrays, but instead is meant to serve as a brief introduction to how capacity and length affect your slices, and what purpose they serve in the grand scheme of things.
For further reading, I highly recommend the following articles from the Go Blog: