stefan's blag and stuff

Blog – 2017-09-12 – About git tags

What is a tag in git, the stupid content tracker? In short: it's an ordinary file in the git directory .git/refs/tags/ (the refs/tags/ namespace) that contains the SHA1 checksum of a git object. The command git tag lists these tags in alphabetical order. [1]
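To see this on disk, here is a minimal sketch in a throwaway repository. It assumes git's default "files" ref storage; the tag name mytag and the identity are placeholders I made up for the example.

```shell
# Placeholder identity so the commit works in a clean environment.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
git commit -q --allow-empty -m "initial commit"

git tag mytag                          # simplest kind of tag, pointing at HEAD

# The tag is a plain file that holds a checksum ...
loose=$(cat .git/refs/tags/mytag)
echo "$loose"
git rev-parse HEAD                     # ... the checksum of the tagged commit

# Packing moves the entry into .git/packed-refs and removes the loose file.
git pack-refs --all
grep refs/tags/mytag .git/packed-refs
```

After git pack-refs the tag still resolves as before; only its storage location has changed.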

But that's not everything you can say or write about tags. This blog post covers some bits and pieces that I have learned along the way and that were sometimes surprising to me. So if you are curious, read on:

Two types of tags

There are two different types of tags: annotated tags and lightweight tags.
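As a minimal sketch, this is how an annotated and a lightweight tag can be created in a throwaway repository. The tag names v1.0 and tmp-build and the identity are placeholders for the example.

```shell
# Placeholder identity; an annotated tag records the tagger's name and email.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
git commit -q --allow-empty -m "initial commit"

# Annotated tag: -a creates a tag object, -m supplies its message.
git tag -a -m "Release 1.0" v1.0

# Lightweight tag: only a name for the commit, no extra tag object.
git tag tmp-build

git tag                                # lists both tag names
```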

You can distinguish these two types of tags by executing git show <tag name>. For an annotated tag you see:

$ git show v4.9
tag v4.9
Tagger: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Sun Dec 11 11:18:02 2016 -0800

Linux 4.9
[...]

commit 69973b830859bc6529a7a0468ba0d80ee5117826 (tag: v4.9)
Author: Linus Torvalds <torvalds@linux-foundation.org>
[...]

Git prints the information from the tag object before printing the git object that was tagged. Here Linus has tagged the commit object 69973b830.

Annotated tags are the normal form of a git tag. Git creates an extra tag object with additional information such as the tagger's name and email, a date and a message. A tag object can also be GPG signed. By default, git describe only searches annotated tags.
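The git describe behaviour is easy to try out. The following is a sketch in a throwaway repository with made-up tag names; the exact error text of git describe is suppressed because it may differ between git versions.

```shell
# Placeholder identity for the throwaway repository.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
git commit -q --allow-empty -m "initial commit"

git tag tmp-build                      # lightweight tag only

# Plain 'git describe' ignores the lightweight tag and fails.
git describe 2>/dev/null || echo "no annotated tag found"

# With --tags it considers lightweight tags as well.
git describe --tags

# Once an annotated tag exists, plain 'git describe' succeeds.
git tag -a -m "Release 1.0" v1.0
git describe
```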

For lightweight tags git show prints the tagged object directly:

$ git show drm-fixes-v4.7-rc1
commit 7fa1d27b638db86516d7b3d8dc1a3576c72ee423 (tag: drm-fixes-v4.7-rc1)
Merge: 79b3c7164c18 157d2c7fad08
Author: Dave Airlie <airlied@redhat.com>
[...]

Here the merge commit object 7fa1d27b638d was tagged by Dave (79b3c7164c18 in the Merge: line is its first parent). It's only a temporary tag that Dave used in a merge request for Linus.

Tip: To be really sure you can also use plumbing commands to print the git object type directly:

$ git cat-file -t v4.9
tag
$ git cat-file -t drm-fixes-v4.7-rc1
commit

To update the definition from the introduction: What is a real git tag? It's a git tag object that includes a name, an email, a date and a message, and the SHA1 of that tag object is stored in the refs/tags/ namespace.
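The difference shows up when you resolve the tag names to checksums. A small sketch in a throwaway repository (tag names and identity are placeholders):

```shell
# Placeholder identity for the throwaway repository.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
git commit -q --allow-empty -m "initial commit"
git tag -a -m "Release 1.0" v1.0       # annotated
git tag tmp-build                      # lightweight

git rev-parse HEAD                     # the commit
git rev-parse tmp-build                # same checksum: the ref names the commit itself
git rev-parse v1.0                     # different checksum: the ref names the tag object
git rev-parse 'v1.0^{commit}'          # peeling the tag object yields the commit again
```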

Conclusion: Always use annotated tags, even for temporary and simple stuff. This avoids certain pitfalls.

What the tag?

In the previous section I wrote down the commands to create, show and inspect tags. Unlike a branch, which points to the latest commit object and is used for development, a tag points to a commit object that is a release version of the software. It pins the source code, including the commit history, to a specific and immutable state in the lifetime of the project. So tags are all about commit objects and traceability.
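A small sketch of that difference, again in a throwaway repository with placeholder names: the branch moves with every new commit, the tag stays where it was put.

```shell
# Placeholder identity for the throwaway repository.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
git commit -q --allow-empty -m "release commit"
git tag -a -m "Release 1.0" v1.0
released=$(git rev-parse HEAD)         # remember the released commit

git commit -q --allow-empty -m "development continues"

git rev-parse HEAD                     # the branch now points to the new commit
git rev-parse 'v1.0^{commit}'          # the tag still names the released commit
```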

Now look at the earliest tags in the Linux kernel (as you probably know, the Linux kernel's history started before git existed):

$ git remote -v
origin  git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git (fetch)
origin  git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git (push)
$ git tag
v2.6.11
v2.6.11-tree
v2.6.12
v2.6.12-rc2
v2.6.12-rc3
v2.6.12-rc4
v2.6.12-rc5
v2.6.12-rc6
[...]

First question: Where is the tag for v2.6.11-rc1? I don't know, lost in the history? Second question: What is v2.6.11-tree?

$ git show v2.6.11-tree
tag v2.6.11-tree

This is the 2.6.11 tree object.

NOTE! There's no commit for this, since it happened before I started with git.
Eventually we'll import some sort of history, and that should tie this tree
object up to a real commit. In the meantime, this acts as an anchor point for
doing diffs etc under git.
[...]

tree v2.6.11-tree

COPYING
CREDITS
Documentation/
MAINTAINERS
Makefile
[...]

Wait... Normally you would expect a commit XXXXXXXX… line and object after the tag object in the git show output. Here you see a tree object. Git can tag not only commit objects, it can tag any git object. Just for reference, the git object types are: blob, tree, commit and tag.

So let's tag a blob (aka file):

# See 'git help revisions' for the explanation of the 'HEAD:Makefile' syntax.
$ git show HEAD:Makefile
VERSION = 4
PATCHLEVEL = 13
SUBLEVEL = 0
EXTRAVERSION =
NAME = Fearless Coyote

# *DOCUMENTATION*
# To see a list of typical targets execute "make help"
# More info can be located in ./README
# Comments in this file are targeted only to the developer, do not
[...]

$ git cat-file -t HEAD:Makefile
blob

$ git tag -m "tagging a blob" vblob HEAD:Makefile
$ git cat-file -t vblob
tag

$ git show vblob
tag vblob
Tagger: Stefan Lengfeld <xyz@example.com>
Date:   Tue Sep 12 22:35:16 2017 +0200

tagging a blob
VERSION = 4
PATCHLEVEL = 13
SUBLEVEL = 0
EXTRAVERSION =
NAME = Fearless Coyote
[...]

Works! To verify this feature, you can also print the contents of a git tag object:

$ git cat-file -p vblob
object ed65d7278bb315dea0c170419dff7a3996d539a8
type blob
tag vblob
tagger Stefan Lengfeld <xyz@example.com> 1505248516 +0200

tagging a blob

In the above output you see that a git tag object has an extra field that states the type of the tagged object. Since you can reference any object by its SHA1 checksum in git, it's only natural that you can tag every git object/SHA1 checksum.

One last question remains: Can a git tag object be tagged?
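A quick experiment suggests that the answer is yes. This is a sketch in a throwaway repository; the tag names and identity are placeholders, and recent git versions print a hint about nested tags on stderr when you do this.

```shell
# Placeholder identity for the throwaway repository.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
git commit -q --allow-empty -m "initial commit"
git tag -a -m "Release 1.0" v1.0

# Tag the tag object of v1.0, not the commit behind it.
git tag -a -m "tagging a tag" vtag v1.0

git cat-file -t vtag                   # tag
git cat-file -p vtag                   # 'object' is v1.0's tag object, 'type' is tag
```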

Conclusion: What the hack. Or: Git is powerful.

Conclusion 2: Try to break third-party git commands. Most of them are built upon the assumption that a tag object always points to a commit object. Hopefully they can dereference multiple levels of a tag object that points to a tag object that points to a tag object that points to a tag object that ...
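Git itself can peel such chains with the ^{} suffix (see git help revisions), which dereferences a tag repeatedly until a non-tag object is found. A sketch with made-up tag names; a tool that wants to be robust could resolve tags in a similar way:

```shell
# Placeholder identity for the throwaway repository.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
git commit -q --allow-empty -m "initial commit"

# Build a chain: vtagtag -> vtag -> v1.0 -> commit
# (recent git versions print a nested-tag hint on stderr here).
git tag -a -m "Release 1.0" v1.0
git tag -a -m "tag of a tag" vtag v1.0
git tag -a -m "tag of a tag of a tag" vtagtag vtag

git cat-file -t vtagtag                # tag
git cat-file -t 'vtagtag^{}'           # commit
git rev-parse 'vtagtag^{}'             # same checksum as HEAD
```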

Never delete or modify an already pushed tag

Never delete or modify a release tag in a public repository. If you messed up a tag for a release version, it's better (read: 'The only way to fix this situation is ...') to create a new release with a new tag and document the incident in the ReleaseNotes/Changelog.

In Linux kernel development, kernel maintainers sometimes use tags to create pull requests for the next higher maintainer or for Linus. Of course these tags may be deleted after the code has been merged, even though they are in public repositories. But never modify or delete a release tag (e.g. v1.1.12).

In contrast to branches, git tags have different semantics when you fetch/pull from a remote repository. Git clearly separates local and remote branches. It prefixes remote branches with remotes/<remote-name>/ (see git branch -a) and you can clear non-existing remote branches by executing git fetch --prune locally. That's not true for git tags. While pulling from a remote repository, git merges remote tags into the local tag namespace .git/refs/tags/. Even after the remote side has deleted or modified a tag, the local tag remains unchanged and nobody will warn you about that situation. Have fun debugging why your coworker has different source code than you ;-).

Tip: Use git ls-remote <remote-url> to see exactly which tags are available on a remote repository.
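The stale tag problem can be reproduced locally. The following is a sketch with two hypothetical developers, alice and bob, and a bare repository standing in for the shared remote; all names and paths are made up for the example, and it assumes a default git configuration.

```shell
# Placeholder identity for the throwaway repositories.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

work=$(mktemp -d)
git init -q --bare "$work/remote.git"

# Alice publishes a commit and a tag.
git init -q "$work/alice"
cd "$work/alice"
git commit -q --allow-empty -m "initial commit"
git tag -a -m "Release 1.0" v1.0
git push -q "$work/remote.git" HEAD v1.0

# Bob clones and receives the tag.
git clone -q "$work/remote.git" "$work/bob"

# Alice deletes the tag on the remote.
git push -q "$work/remote.git" :refs/tags/v1.0

# Bob fetches with --prune, but his local tag survives.
cd "$work/bob"
git fetch -q --prune origin
git tag                                # still lists v1.0
git ls-remote --tags origin            # prints nothing, the remote no longer has it
```

Comparing the output of git tag with git ls-remote --tags is one way to spot such leftovers.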

Conclusion: Never modify or delete release tags. And never push temporary tags to a git repository that's the main repo of a project. You are causing pain to other developers and to yourself :-)

Tip: Before doing a real push, do a dry run, git push --follow-tags --dry-run, to avoid pushing the wrong stuff.
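A sketch of such a dry run, using a bare repository as a stand-in remote and made-up tag names. It also shows that --follow-tags only sends annotated tags that are reachable from the pushed commits, so a lightweight scratch tag stays local.

```shell
# Placeholder identity for the throwaway repositories.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git init -q "$work/dev"
cd "$work/dev"
git remote add origin "$work/remote.git"
git commit -q --allow-empty -m "initial commit"
git tag -a -m "Release 1.0" v1.0       # annotated
git tag tmp-build                      # lightweight

# Dry run: reports what would be pushed, changes nothing on the remote.
git push --follow-tags --dry-run origin HEAD
git ls-remote origin                   # still prints nothing

# Real push: the branch and the annotated tag are sent, tmp-build is not.
git push -q --follow-tags origin HEAD
git ls-remote --tags origin
```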

Footnotes

[1] As an optimization of the .git/ folder, multiple tags may be packed together into the file .git/packed-refs to avoid iterating over the folder .git/refs/tags/ each time.