A Go project starts life with a package and no versions or dependencies. The dependencies are quickly added. We use the package github.com/gorilla/mux for HTTP routing (go get github.com/gorilla/mux). Because we also process XML, we need etree as well (go get github.com/beevik/etree). The project picks up speed and soon a new employee joins us. Not long after, version 1.0.3 is in production and a whole team is developing version 1.1.0.
But what’s going on now? Suddenly various tests fail – the XML processing doesn’t work anymore! Nobody changed anything, did they? Well, that’s not quite true. Nobody on the team changed anything, but the developers of etree did. The latest build downloaded the etree dependency via go get github.com/beevik/etree and received a changed etree package, because go get always downloads the current state of a package’s master branch. With the latest state of etree, the tests fail.
And now? Two possibilities remain: either wait for a fix from etree, or go back to the previous version of etree by copying it into the vendor directory, as described below, and checking it in. Go 1.11 adds a third option: Go modules.
The first Go modules
So we go back to the beginning and start our Go project over, this time with Go modules. A Go module is one or more Go packages that share one version and have dependencies on other modules. Each module has a unique module path. In Go 1.11, modules are still experimental and have to be activated with the environment variable
We start our project with a package (github.com/red6/gomod) in GOPATH. To turn this project into a Go module, we initialize it with go mod init. If the package is in GOPATH, Go takes the import path of the package, github.com/red6/gomod, as the module path and creates the file go.mod. The module path and all of the module’s dependencies are recorded in go.mod. So far, the file contains only the line module github.com/red6/gomod.
Now we add the first dependency, the package github.com/gorilla/mux, as usual via go get github.com/gorilla/mux. But which version did we get? The go.mod says version 1.6.2. Why? Go picks the latest version of the package, which it determines as the highest tag according to semantic versioning. Alternatively, you can explicitly specify a version (see table).
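To illustrate what "highest tag according to semantic versioning" means, here is a minimal sketch in Go. It is not the go tool's implementation: the helper names (parse, less, highestTag) and the tag list are made up for this example, and pre-release tags are simply ignored. The point is that tags are compared numerically part by part, not as strings.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// version holds the three numeric parts of a tag such as "v1.6.2".
type version struct{ major, minor, patch int }

// parse converts a tag like "v1.6.2" into a version.
// Tags that do not have exactly three numeric parts are reported as invalid.
func parse(tag string) (version, bool) {
	parts := strings.Split(strings.TrimPrefix(tag, "v"), ".")
	if len(parts) != 3 {
		return version{}, false
	}
	var n [3]int
	for i, p := range parts {
		v, err := strconv.Atoi(p)
		if err != nil || v < 0 {
			return version{}, false
		}
		n[i] = v
	}
	return version{n[0], n[1], n[2]}, true
}

// less reports whether a sorts before b (major first, then minor, then patch).
func less(a, b version) bool {
	if a.major != b.major {
		return a.major < b.major
	}
	if a.minor != b.minor {
		return a.minor < b.minor
	}
	return a.patch < b.patch
}

// highestTag returns the tag with the highest version and skips invalid tags.
// It returns "" if no tag is a valid version.
func highestTag(tags []string) string {
	best, bestTag := version{}, ""
	for _, t := range tags {
		v, ok := parse(t)
		if !ok {
			continue
		}
		if bestTag == "" || less(best, v) {
			best, bestTag = v, t
		}
	}
	return bestTag
}

func main() {
	// Hypothetical tag list of a repository; "nightly" is not a version tag.
	tags := []string{"v1.6.0", "v1.6.2", "v1.6.1", "v1.5.0", "nightly"}
	fmt.Println(highestTag(tags)) // prints "v1.6.2"
}
```

With a plain string comparison, v1.9.0 would sort after v1.10.0; the numeric comparison avoids that.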
Go has now created another file, go.sum, with a hash of the package and hashes of all transitive dependencies of the package. With these hashes, Go checks that every future build for the specified version of the package uses the same code. The downloaded dependencies are stored centrally by Go under $GOPATH/pkg/mod and are not checked in. The Gorilla Mux package can be found there at $GOPATH/pkg/mod/github.com/gorilla/mux@v1.6.2.
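The idea behind the hash check can be sketched in a few lines of Go. This is a simplified stand-in, not the actual format or algorithm of go.sum: the functions hashModule and verify, the file names, and the file contents are assumptions for illustration. It shows only the principle that a recorded hash over all files detects any later change.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
	"sort"
)

// hashModule computes one hash over all files of a module version.
// files maps a file name to its content. This is a simplified sketch,
// not a byte-compatible reimplementation of the go.sum hash.
func hashModule(files map[string][]byte) string {
	names := make([]string, 0, len(files))
	for name := range files {
		names = append(names, name)
	}
	sort.Strings(names) // fixed order, so the result does not depend on map iteration

	h := sha256.New()
	for _, name := range names {
		sum := sha256.Sum256(files[name])
		fmt.Fprintf(h, "%x  %s\n", sum, name)
	}
	return base64.StdEncoding.EncodeToString(h.Sum(nil))
}

// verify reports whether the downloaded files match the recorded hash.
func verify(recorded string, files map[string][]byte) bool {
	return hashModule(files) == recorded
}

func main() {
	// Hypothetical module contents at the time of the first download.
	original := map[string][]byte{
		"go.mod": []byte("module example.com/dep\n"),
		"dep.go": []byte("package dep\n"),
	}
	recorded := hashModule(original) // what a go.sum-like file would store

	// The same version is downloaded again later, but one file has changed.
	tampered := map[string][]byte{
		"go.mod": []byte("module example.com/dep\n"),
		"dep.go": []byte("package dep // changed\n"),
	}

	fmt.Println(verify(recorded, original)) // prints "true"
	fmt.Println(verify(recorded, tampered)) // prints "false"
}
```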
Now we get the etree package via go get github.com/beevik/etree. The determined version 1.0.1 is automatically entered in go.mod and go.sum. We check in the configuration files go.mod and go.sum, and our first Go module is ready. For a better understanding of Go modules, we’ll take a closer look at GOPATH, the precursor to modules.
The path to Go Modules
In the beginning of Go in 2012, there was GOPATH. It quickly became clear that GOPATH wasn’t sufficient for projects that needed repeatable builds over a long period of time. A build is repeatable if it always returns the same result, no matter whether it is run today or in a year. A repeatable build is only possible if the build uses exactly the same package versions every single time. Soon, there were approaches to managing package versions, first with godep and later with glide.
Go 1.5 experimentally introduced the vendor directory. During a build, Go checks whether there is a vendor directory in the project. If so, packages from the vendor directory take priority over packages from GOPATH. This way, all packages required by a project can be checked into the vendor directory. This ensures that the same version of the dependencies is always used, and the build works even if GitHub is currently unavailable or an author has deleted their package. The vendor directory was adopted into the standard with Go 1.6.
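The lookup order can be pictured with a small Go sketch. It is deliberately simplified and not the go tool's code: the function resolve and the exists callback are hypothetical, the directory names are invented, and only the vendor directory at the project root is checked, whereas the real go tool also considers vendor directories in parent directories.

```go
package main

import (
	"fmt"
	"path"
)

// resolve returns the directory a package would be loaded from.
// exists stands in for a file system check; it is a hypothetical helper
// so that the sketch runs without touching the disk.
func resolve(projectDir, gopath, importPath string, exists func(dir string) bool) (string, bool) {
	vendored := path.Join(projectDir, "vendor", importPath)
	if exists(vendored) {
		return vendored, true // the vendor directory takes priority
	}
	shared := path.Join(gopath, "src", importPath)
	if exists(shared) {
		return shared, true
	}
	return "", false
}

func main() {
	project := "/go/src/github.com/red6/gomod"

	// Invented directory layout: etree is vendored, mux exists only in GOPATH.
	dirs := map[string]bool{
		project + "/vendor/github.com/beevik/etree": true,
		"/go/src/github.com/beevik/etree":           true,
		"/go/src/github.com/gorilla/mux":            true,
	}
	exists := func(dir string) bool { return dirs[dir] }

	for _, pkg := range []string{"github.com/beevik/etree", "github.com/gorilla/mux"} {
		dir, ok := resolve(project, "/go", pkg, exists)
		fmt.Println(pkg, "->", dir, ok)
	}
}
```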
By now, all the tools for managing package dependencies build on the vendor directory. Package versions are stored there, even for large projects such as Kubernetes, Docker, or CockroachDB. But these large projects also showcase the disadvantages: for CockroachDB, the vendor directory is 58 MB; for Kubernetes, it is 209 MB with 818 package dependencies. The packages must be downloaded and compiled from the directory with each build, and changes to the dependencies strongly inflate the Git repository.
Even though the vendor directory is important and solves some problems, it does not yet meet the requirements of large projects with very long lifetimes. But don’t worry, the directory is supported until Go 2 and probably even beyond that point.
In 2017, the development of dep started as an official experiment of the Go Package Management Committee. If the experiment succeeded, dep was to become the official solution for package management. dep builds on existing tools within the Go community, like glide, and adopts the mechanisms of other package managers, such as cargo or bundler.
dep uses two configuration files: Gopkg.toml with the dependencies, and Gopkg.lock with the package hashes and versions. The dependencies are stored in the vendor directory. dep allows very flexible and powerful version specifications for dependencies based on semantic versioning. Valid version specifications in dep are things like 1.5-1.7.0 (i.e. >=1.5 and <=1.7.0) and ~1.6.0 (i.e. >=1.6.0 and <1.7.0). In dep, versions can thus be specified more flexibly and powerfully than in Go modules, which only allow the specification of a minimum required version.
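The two constraint forms mentioned above can be made concrete with a short Go sketch. It follows the reading given in this article (a range is inclusive on both ends, a tilde constraint stays within the same minor version) and is not dep's actual constraint solver; the names version, compare, inRange, and tilde are invented for the example.

```go
package main

import "fmt"

// version is major, minor, patch.
type version [3]int

// compare returns -1, 0, or 1 if a is lower than, equal to, or higher than b.
func compare(a, b version) int {
	for i := range a {
		if a[i] != b[i] {
			if a[i] < b[i] {
				return -1
			}
			return 1
		}
	}
	return 0
}

// inRange models a range such as 1.5-1.7.0: lo <= v <= hi.
func inRange(lo, hi, v version) bool {
	return compare(v, lo) >= 0 && compare(v, hi) <= 0
}

// tilde models a constraint such as ~1.6.0: base <= v < next minor version.
func tilde(base, v version) bool {
	nextMinor := version{base[0], base[1] + 1, 0}
	return compare(v, base) >= 0 && compare(v, nextMinor) < 0
}

func main() {
	candidates := []version{{1, 5, 9}, {1, 6, 0}, {1, 6, 4}, {1, 7, 0}, {1, 7, 1}}
	for _, c := range candidates {
		fmt.Printf("%d.%d.%d  range 1.5-1.7.0: %-5v  ~1.6.0: %v\n",
			c[0], c[1], c[2],
			inRange(version{1, 5, 0}, version{1, 7, 0}, c),
			tilde(version{1, 6, 0}, c))
	}
}
```

A Go module, by contrast, would state only a single minimum such as 1.6.0 for the same dependency.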
The Go team followed the experiment closely, but dep was ultimately not included in the Go standard because some members were not convinced. After all, the old Go principle states “if in doubt, leave it out.” However, the Go team continued to work intensively on the subject of package management. One major proponent was Russ Cox, who is responsible for the essential concepts of Go modules and the current implementation.
In essence, Russ Cox argued that dep’s flexibility and power are difficult for users to manage and, in practice, do not deliver better results than the minimum version approach of Go modules. This flexibility also requires a very complex implementation of the package management tool itself (keyword: version SAT), at the expense of maintainability and performance. A comparison of Go modules and dep would go beyond the scope of this article, but the most important points can be found in a Twitter thread by Russ Cox.
However, one thing is certain: the cooperation between the Go community and Google’s Go team did not work out well. For a long time, many major points of criticism were not communicated to the dep team and the Go community. On the other hand, Go modules were devised behind closed doors at the Googleplex, and the community was not involved enough. The Go team also recognized this and promised to do better in the future.
Advanced Go modules
We’ve learned the basics of Go modules. Now we go into more detail. So far, the go.mod only contains dependencies on other modules, declared with require. With exclude, certain versions of other modules can be excluded (see Listing 1). In an emergency, replace can be used to substitute one module for another (see Listing 1). With the specification replace “bad/thing” v1.3.0 => “good/thing” v1.5.2, we replace the module bad/thing in version v1.3.0 with the module good/thing in version v1.5.2.
For modules that are direct dependencies, this makes little sense, because there we can change the require dependency directly. With indirect dependencies, we can’t change anything ourselves, so replace is a last resort. With require, exclude, and replace for certain module versions, we have covered all the possibilities of the module configuration. Next, we will see why this is enough and what happens internally. But first, a short excursion into versioning with Go modules.
module example.com/hello

require (
    golang.org/x/text v0.3.0
    gopkg.in/yaml.v2 v2.1.0
)

exclude github.com/go-stack/stack v1.6.0

replace (
    github.com/go-stack/stack v1.4.0 => ../stack/
    golang.org/x/text => github.com/pkg/errors v0.8.0
)
Go modules rely on semantic versioning. Under this scheme, a version number consists of three numbers, Major.Minor.Patch, for example 1.5.2. In a new major version, everything can change; in a new minor version, functionality can be added. The patch version is only intended for bugfixes without changes for users. Changes of the minor or patch version are therefore always backward compatible and have no negative effects on users. Major version 0 has a special status: everything can still change there.
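These rules can be written down as a small Go sketch. The names bump and compatibleUpgrade are invented for this example, and the code encodes only the promises described above, not anything the go tool itself checks: an upgrade counts as compatible when the major version stays the same, is at least 1, and the new version is not older.

```go
package main

import "fmt"

// version is major, minor, patch.
type version [3]int

// bump names the most significant part that differs between two versions.
func bump(from, to version) string {
	names := [3]string{"major", "minor", "patch"}
	for i := range from {
		if from[i] != to[i] {
			return names[i]
		}
	}
	return "none"
}

// compatibleUpgrade reports whether semantic versioning promises that code
// written against from keeps working with to.
func compatibleUpgrade(from, to version) bool {
	if from[0] == 0 || from[0] != to[0] {
		return false // major version 0 promises nothing; a new major may break anything
	}
	if from[1] != to[1] {
		return to[1] > from[1] // newer minor: functionality was only added
	}
	return to[2] >= from[2] // same minor: only bugfixes
}

func main() {
	pairs := [][2]version{
		{{1, 6, 3}, {1, 7, 0}}, // prints "minor true"
		{{1, 7, 0}, {2, 0, 0}}, // prints "major false"
		{{0, 3, 0}, {0, 4, 0}}, // prints "minor false"
	}
	for _, p := range pairs {
		fmt.Println(bump(p[0], p[1]), compatibleUpgrade(p[0], p[1]))
	}
}
```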
Go modules and the packages of a module are versioned according to semantic versioning. Different major versions do not have to be compatible with each other. For Go, they behave like different packages. The major version becomes part of the import path of a package, for example red6/payment/v2. By convention, major versions 0 and 1 are omitted. Therefore, in the example above, the import path github.com/gorilla/mux for version 1.6.2 stays as it is. For major versions, there are subdirectories or branches in Git. A good example is the package rsc.io/quote by Russ Cox, from a demo for Go modules. It contains a branch v2 for major version 2 and, in the master branch, a subdirectory v3 for major version 3. We get the dependency on major version 2 via go get rsc.io/quote/v2, and on major version 3 via go get rsc.io/quote/v3.
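The path convention is easy to express in code. The following Go sketch uses a hypothetical helper, importPath, that appends the major version to a module path as described above. It covers only the /vN suffix form and ignores other schemes such as the gopkg.in style seen in Listing 1.

```go
package main

import (
	"fmt"
	"strconv"
)

// importPath returns the path under which a given major version of a module
// is imported. Major versions 0 and 1 are omitted by convention.
func importPath(modulePath string, major int) string {
	if major < 2 {
		return modulePath
	}
	return modulePath + "/v" + strconv.Itoa(major)
}

func main() {
	fmt.Println(importPath("github.com/gorilla/mux", 1)) // prints "github.com/gorilla/mux"
	fmt.Println(importPath("rsc.io/quote", 2))           // prints "rsc.io/quote/v2"
	fmt.Println(importPath("rsc.io/quote", 3))           // prints "rsc.io/quote/v3"
}
```

Because the results are different strings, Go treats rsc.io/quote/v2 and rsc.io/quote/v3 as different packages, which is what allows both to appear in one build.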
A project can use different major versions of a package side by side without any problems. This is necessary for large projects that need to spread a migration to a new major version over a longer period of time. In the future, there will also be tools that make it easier to migrate to a new major version by changing the package path from rsc.io/quote/v2 to rsc.io/quote/v3 throughout the code.
But what about Go packages that don’t have a tagged version yet? For them, Go creates a pseudo version number from timestamp and commit hash, for example
In one respect, Go goes beyond the promise of semantic versioning. Go requires that packages with the same import path remain compatible with each other. Russ Cox calls this the Import Compatibility Rule, and it is one of the principles of Go modules. If a new module version still contains a package with the same import path as a previous version, the package must behave the same.
Minimal version selection
An important design decision of Go modules is that a dependency specifies the minimum required version of the required module, and only that. If our module needs the red6/payment package in version 1.7.0, we write 1.7.0 into go.mod. Now a bugfix of red6/payment is released as version 1.7.1. Which version does our build use now? Still version 1.7.0, because that’s what the go.mod says. To use the bugfix, we have to explicitly enter version 1.7.1 in the file. It is not possible to write the dependency in such a way that the newest version according to semantic versioning is always used. Russ Cox calls this Minimal Version Selection.
Let’s make it a bit more complicated. Suppose there is a module red6/paypal in version 1.0.1, which needs red6/payment in version 1.7.0, and a module red6/paydirekt in version 1.3.0, which needs red6/payment in version 1.6.3. So far, there’s no problem; each module can use its specified version of red6/payment, one module 1.7.0 and the other 1.6.3. But what if we start a new project red6/checkout that uses both red6/paypal in version 1.0.1 and red6/paydirekt in version 1.3.0?
The import path of red6/payment in versions 1.6.3 and 1.7.0 is the same, so the build cannot use both versions and must decide on one. So one thing is certain: Go modules cannot give both modules exactly the version of red6/payment they specified. To decide, semantic versioning is used again. Each requirement is only a minimum, and 1.7.0 is backward compatible with 1.6.3, so the build settles on version 1.7.0, because both modules can work with it. Version 1.7.0 of the red6/payment package is then also written into the go.sum of the new red6/checkout project. This file is checked in again, so that the build can be repeated in the future.
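The decision can be sketched as a short Go program. It is a simplified illustration of minimal version selection, not the go tool's implementation: it ignores exclude and replace, the names (module, newer, buildList, exampleGraph) are invented, and the version 0.1.0 for red6/checkout is an assumption because the article does not give one. The sketch walks all requirements starting from our project and keeps, for each module path, the highest of the minimum versions that were asked for.

```go
package main

import (
	"fmt"
	"sort"
)

// version is major, minor, patch.
type version [3]int

// module identifies one version of one module.
type module struct {
	path string
	ver  version
}

// newer reports whether a is a higher version than b.
func newer(a, b version) bool {
	for i := range a {
		if a[i] != b[i] {
			return a[i] > b[i]
		}
	}
	return false
}

// buildList walks the requirement graph from root and selects, for every
// module path, the highest version that any visited module requires.
func buildList(root module, requires map[module][]module) map[string]version {
	selected := map[string]version{}
	seen := map[module]bool{}
	var visit func(m module)
	visit = func(m module) {
		if seen[m] {
			return
		}
		seen[m] = true
		if cur, ok := selected[m.path]; !ok || newer(m.ver, cur) {
			selected[m.path] = m.ver
		}
		for _, dep := range requires[m] {
			visit(dep)
		}
	}
	visit(root)
	return selected
}

// exampleGraph returns the requirements from the red6/checkout example.
func exampleGraph() (module, map[module][]module) {
	checkout := module{"red6/checkout", version{0, 1, 0}} // assumed version
	paypal := module{"red6/paypal", version{1, 0, 1}}
	paydirekt := module{"red6/paydirekt", version{1, 3, 0}}
	requires := map[module][]module{
		checkout:  {paypal, paydirekt},
		paypal:    {{"red6/payment", version{1, 7, 0}}},
		paydirekt: {{"red6/payment", version{1, 6, 3}}},
	}
	return checkout, requires
}

func main() {
	root, requires := exampleGraph()
	selected := buildList(root, requires)

	paths := make([]string, 0, len(selected))
	for p := range selected {
		paths = append(paths, p)
	}
	sort.Strings(paths)
	for _, p := range paths {
		v := selected[p]
		fmt.Printf("%s v%d.%d.%d\n", p, v[0], v[1], v[2]) // red6/payment is listed as v1.7.0
	}
}
```

Note that no version newer than 1.7.0 is ever considered, even if one exists: only versions that some go.mod actually names take part in the selection.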
Main commands for Go modules
Go modules bring additional commands that make dependencies easier to manage. With go get -u, all dependencies can be updated as usual. The new variant is go get -u=patch, which upgrades all dependencies to the latest patch version.
go mod init
This command initializes a module in the current directory. If the current directory is inside GOPATH, the module path is derived from the directory’s path within GOPATH. Optionally, a module path can be specified, for example via go mod init red6/newmod.
go mod tidy
This command synchronizes the module configuration with the source code. Dependencies that are no longer needed are removed, and transitive dependencies are updated and cleaned up.
go mod graph
go mod graph displays the dependency graph on the command line. Graphical tools will follow.
Minimal Version Selection is simple and understandable, but new and very different from Rust’s cargo or Ruby’s bundler. It serves the principle of repeatability, which means that the result of building a certain version must never change. A build must therefore always be repeatable and produce the same result, regardless of whether it is done today or in a year’s time. This provides stability and reliability in development. The Go team, and Russ Cox in particular, have put a lot of work into this concept. Cox has analyzed, collected data, compared algorithms, and tried many options. His blog entries on various aspects of Go modules are very extensive and an absolute must for anyone who wants to know more about this topic.
There remains one last principle of Go modules: cooperation. We all have to maintain the ecosystem of Go packages together. Tools cannot and should not work around a lack of cooperation within the Go community. We are the community, so we are tasked with making it better. Besides the Go language itself, the package ecosystem is the most important factor for Go’s success.
In time, we will see whether the experiment of the Go modules is successful. However, what’s important is the feedback from the community. We must tell the Go team what works well and what’s missing. Then, Go modules will become a success and will be adopted into Go 1.12 as a standard tool. Dependency management is one of the most important challenges for Go, according to the results of the Go Survey 2017, so solving this issue has important ramifications. For Go 2, there are still enough exciting topics for debate like generics and error handling.