A History of Releasing Julia Packages

Tagging, releasing, registering... the same, but not

There’s an unfortunate confusion in the Julia community when it comes to making your code available to others in a registry such as General, and it’s mostly due to ambiguity in the relevant terminology. We talk about tagging new versions, making new releases, and registering packages all the time, and no one is ever sure what exactly we mean. Are we pushing a Git tag? Making a GitHub release? Adding a version to the registry? All of the above? Are these even different things? As the developer of TagBot and a registry maintainer (albeit a rarely active one thanks to the excellent automation work that has been put into it), I have seen this confusion many times, and played my own part in causing it, too. Hopefully I can offer some context on how things came to be this way, and maybe it’ll help clear up the confusion.

Definitions

Before we start, it’s helpful to know the true meaning of these ambiguous words:

Tag: A Git tag. Tags don’t depend on GitHub, and they have no significance to the Julia package manager.
Release: A GitHub release. A GitHub release is basically a Git tag with some GitHub-specific metadata, and also means nothing to Julia.
Registration: A version of a Julia package in a Julia registry. These are what the Julia package manager cares about.

Julia and GitHub

Julia and GitHub have always been tightly intertwined. Development of the language and most of its third-party packages happens there, and the mechanism for registering packages mostly assumes that you’re using GitHub, too. But that could be said about lots of language ecosystems out there! To understand why Git tags and GitHub releases are often discussed synonymously with Julia package registrations, we need to look back in time to the pre-1.0 days of METADATA.jl and AttoBot.

Pre-1.0: AttoBot

Note: I started using Julia at version 0.5; I have no information on how things worked prior.

Most languages have a central index of package metadata, and Julia is no different. Before Julia 1.0, that index was METADATA.jl, a Git repository containing metadata for all Julia packages. This repository wasn’t easy to contribute to manually, so AttoBot was built to make automated updates on your behalf. AttoBot worked like this:

You make a GitHub release on your repository
AttoBot makes a matching pull request on METADATA.jl
If the pull request can’t go through, you delete the GitHub release, delete the Git tag, make required changes, and then recreate the GitHub release

Users of this workflow often referred to adding a new version to METADATA.jl as “tagging” or “releasing” the package, because that described the user action perfectly: creating a GitHub release. Even if you called it tagging, anyone who knew a little bit about Git knew that their GitHub release had a Git tag underneath. It’s also important to remember that there was no Project.toml in those days, so the version of the package was dictated entirely by the version you chose for your GitHub release.

Post-1.0: Registrator

Now let’s fast-forward to present-day Julia, with the new package manager, the General registry replacing METADATA.jl, and Registrator replacing AttoBot. The registry’s internal structure is even more complicated than METADATA.jl’s, so Registrator’s job, like AttoBot’s, is to abstract that complexity away from package developers. Instead of listening for GitHub release events, Registrator responds to trigger comments on commits or issues, then creates PRs to the registry much like AttoBot did.

The registration pipeline looks something like this:

You update the version field in your package’s Project.toml
You comment @JuliaRegistrator register on the commit
Registrator creates a registry PR and it eventually gets merged
If the PR can’t go through, you make required changes, then retrigger Registrator on the new commit
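
As a rough illustration of the first step above, here is a minimal Python sketch that bumps the version field in a Project.toml. The file contents, the placeholder UUID, and the bump_version helper are all hypothetical and made up for this example; in practice most people simply edit the field by hand. Triggering Registrator with the @JuliaRegistrator register comment is not automated here.

```python
import re

# Hypothetical Project.toml contents, used only for illustration.
PROJECT_TOML = '''\
name = "Example"
uuid = "00000000-0000-0000-0000-000000000000"
version = "1.2.3"
'''

# Matches a line such as: version = "1.2.3"
VERSION_LINE = re.compile(
    r'^version[ \t]*=[ \t]*"(\d+)\.(\d+)\.(\d+)"[ \t]*$',
    flags=re.MULTILINE,
)


def bump_version(project_toml: str, part: str = "patch") -> str:
    """Return the Project.toml text with its version field bumped."""
    match = VERSION_LINE.search(project_toml)
    if match is None:
        raise ValueError("no version field found")
    major, minor, patch = (int(group) for group in match.groups())
    if part == "major":
        major, minor, patch = major + 1, 0, 0
    elif part == "minor":
        minor, patch = minor + 1, 0
    elif part == "patch":
        patch += 1
    else:
        raise ValueError(f"unknown version part: {part!r}")
    new_line = f'version = "{major}.{minor}.{patch}"'
    # Replace only the matched line; everything else is left untouched.
    return project_toml[:match.start()] + new_line + project_toml[match.end():]


if __name__ == "__main__":
    print(bump_version(PROJECT_TOML, "minor"))
```

The point of the sketch is that, in this workflow, the version lives in the package's own files rather than in the name of a GitHub release, so the commit you comment on already carries the version being registered.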

An important change that came with the new registry format was that registered versions were no longer dependent on underlying Git tags or GitHub releases. But even though packages no longer required Git tags, many people continued to use them. Among the main benefits of maintaining such tags is that you can easily browse the code at a given version. However, without automation, many incorrect tags began appearing. The most common scenarios were:

After a new version was registered, a GitHub release was created on the master branch. However, new commits had been made since the registration, and so the tag pointed at the wrong commit.
A GitHub release was made before registering the new version, but for some reason a change was required before the registration could continue. The user made a commit and updated the registration PR, but the GitHub release remained on the old commit.

In both cases, one of two things happened:

The existing tag remained, pointing at the wrong commit
The existing tag was manually removed and recreated to point at the right commit

Both of these were bad. The reason for the first is obvious, but for the second it’s less so. It’s a subjective but fairly commonly held opinion that Git tags should be immutable. That is to say, your v1.0.0 tag should not point at one commit on one day and another commit on the next. If you’re constantly moving your tags around, it becomes harder to reproduce things in your Git repository. This is one of the main reasons that Registrator was created rather than simply porting AttoBot to the new registry format.

TagBot

Seeing a need for automated Git tags that correctly matched the Julia registry, I wrote TagBot. And naturally, it confused everybody. TagBot works by listening for new version registrations, then creating Git tags and GitHub releases for them. It is completely autonomous, and supplements the registration process rather than replacing any part of it. Unfortunately, because “tagging a package” had long, and quite reasonably, been understood to mean making a GitHub release, people misinterpreted the introduction of TagBot as the reincarnation of AttoBot.

Many people thought that TagBot was a replacement for Registrator, and that by manually creating a GitHub release, TagBot would update their Project.toml and create a registry PR for them just like AttoBot had… despite the documentation stating explicitly that this was not the case. It turns out that people just really liked that old workflow, and for good reason—it was super easy and convenient. But as we mentioned, that workflow had fundamental problems, so TagBot will never work this way. These days, TagBot usage is pretty widespread, and it’s thankfully quite rare for me to see this misconception.
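
To make the direction of the data flow concrete, here is a minimal Python sketch of the idea described above: registrations come first, and tags are derived from them. It is not TagBot’s actual implementation. It assumes the registry records a Git tree hash for each registered version; the plan_tags function, its arguments, and the sample data are all hypothetical.

```python
def plan_tags(registered, commits_by_tree, existing_tags):
    """Return {tag_name: commit_sha} for registered versions that lack a tag.

    registered:      {version: tree_sha} as recorded in the registry (assumed)
    commits_by_tree: {tree_sha: commit_sha} for the package repository
    existing_tags:   {tag_name: commit_sha} already present in the repository
    """
    plan = {}
    for version, tree_sha in registered.items():
        tag = f"v{version}"
        if tag in existing_tags:
            # Tags are treated as immutable: an existing tag is never moved.
            continue
        commit = commits_by_tree.get(tree_sha)
        if commit is None:
            # The registered code was not found in this repository.
            continue
        plan[tag] = commit
    return plan


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    registered = {"1.0.0": "tree-a", "1.1.0": "tree-b"}
    commits_by_tree = {"tree-a": "commit-1", "tree-b": "commit-2"}
    existing_tags = {"v1.0.0": "commit-1"}
    print(plan_tags(registered, commits_by_tree, existing_tags))
```

Nothing in this flow reads a GitHub release as input, which is why manually creating a release does not trigger a registration.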

Moving Forward

At this point, the registry tooling is generally stable, and most people seem to have gotten the hang of things. That being said, I still think that there are some rough edges that could be improved:

Agree on the meaning of “tagging”, “releasing”, and “registering”. Ideally, we’d use “tagging” to refer to Git tags, “releasing” for GitHub releases, and “registering” for Julia-specific package publishing. But realistically, I think it’s inevitable that we continue to use all three to mean registering. As usage of TagBot grows, hopefully we can stop worrying about Git tags and GitHub releases altogether.
Expand support for GitHub alternatives. I added support for GitLab some time ago, but that’s actually been broken for a while now. I think it would be worth reworking the Registrator web interface to rely more on Git and less on platform-specific APIs. Even further down the line, TagBot could be patched to support these other platforms, too.

Hopefully this helps to explain how the Julia package registration process evolved, why it is the way that it is today, and where it can go next.

2020/06/11
