Git Clone of Linux Kernel Source Code – Why It’s Larger Than tar.xz

linux-kernel

When I download the kernel directly as a tar.xz archive and extract it, the size is around 1GB. But when I download it via git clone from here, the size is around 7GB. It shows only the master branch. Why is there such a huge difference?

Best Answer

The tarball contains only the source code for one specific kernel release, whereas the git repository (cloned using git clone) contains the kernel's development history going back to 2005 (v2.6.12-rc2). Even if you only see the master branch when you initially clone it, with the default clone parameters you actually have the full repository locally: git log will show you the full history, and git branch --remotes will show all the available branches.
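To check where the space goes in a clone you already have, a small helper along these lines may be useful. This is only a sketch: repo_report is a hypothetical name, and it assumes the path you pass is the top of a non-bare clone with at least one commit.

```shell
# Hypothetical helper: summarise how much of a clone is history.
# Usage: repo_report /path/to/clone   (defaults to the current directory)
repo_report() {
    repo=${1:-.}
    # Object database statistics; "size-pack" is the packed history on disk
    git -C "$repo" count-objects -vH
    # Number of commits reachable from the checked-out branch
    echo "commits: $(git -C "$repo" rev-list --count HEAD)"
    # Total size of the .git directory (history plus metadata)
    du -sh "$repo/.git"
}
```

Comparing the du figure for .git with the size of the checked-out files gives a rough idea of how much of the clone is history rather than source.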

If you only want the latest commit, you can use a shallow clone which will be much smaller:

git clone --depth 1 ...

or, if you only want the history since a specific date,

git clone --shallow-since=...

You can combine that with a specific branch or tag to only download that branch's tip or that tag:

git clone --depth 1 --branch v4.10-rc4 git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git linux-4.10-rc4

This produces a tree using 947MiB (and a 159MiB download).
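The effect of --depth can be tried without downloading the kernel. The following is a minimal sketch that builds a throwaway three-commit repository in a temporary directory (the directory names and commit messages are made up for illustration), clones it both ways, and then converts the shallow clone back into a full one with git fetch --unshallow.

```shell
# Create a scratch repository with three commits
work=$(mktemp -d)
git init -q "$work/origin"
cd "$work/origin"
git config user.email "demo@example.com"
git config user.name "Demo"
for i in 1 2 3; do
    echo "revision $i" > file.txt
    git add file.txt
    git commit -q -m "commit $i"
done
cd "$work"

# Use a file:// URL: with a plain local path git ignores --depth
git clone -q "file://$work/origin" full
git clone -q --depth 1 "file://$work/origin" shallow

full_count=$(git -C full rev-list --count HEAD)
shallow_count=$(git -C shallow rev-list --count HEAD)
echo "full clone:    $full_count commits"     # 3
echo "shallow clone: $shallow_count commits"  # 1

# Fetch the history that the shallow clone skipped
git -C shallow fetch -q --unshallow
deep_count=$(git -C shallow rev-list --count HEAD)
echo "after --unshallow: $deep_count commits" # 3
```

The same git fetch --unshallow command works on a shallow kernel clone if you later decide you need the full history, at the cost of downloading it.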
